[PR #2818] [MERGED] Add JustHTML library to README.md #11159

Closed
opened 2026-04-24 06:00:10 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/2818
Author: @EmilStenstrom
Created: 12/9/2025
Status: Merged
Merged: 1/8/2026
Merged by: @vinta

Base: master ← Head: patch-1


📝 Commits (1)

  • c976bf8 Add JustHTML library to README.md

📊 Changes

1 file changed (+1 additions, -0 deletions)


📝 README.md (+1 -0)

📄 Description

What is this Python project?

JustHTML (https://github.com/EmilStenstrom/justhtml/) is a dependency-free, pure-Python HTML5 parser. It takes a string of HTML and returns a Python tree structure that you can then query and manipulate.

Comparison (A brief comparison explaining how it differs from existing alternatives.)

See the comparison table: https://github.com/EmilStenstrom/justhtml/?tab=readme-ov-file#comparison-to-other-parsers

What's the difference between this Python project and similar ones?

It's the only HTML5 parser available in Python that passes all HTML5 tests. It is very well tested, with 100% test coverage and fuzz testing.

It's fast enough: it parses Wikipedia's homepage in 0.1s. Rust and C parsers are of course faster, but they are not as correct and can be tricky to install.

It has a very nice query API: you pass in a CSS selector and get back all elements that match it.
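To make the "parse a string into a tree, then query it" idea concrete, here is a rough sketch built only on the standard library's `html.parser`. It is not JustHTML's API and it does not implement the HTML5 parsing algorithm. The names `Node`, `TreeBuilder`, `parse`, and `query` are hypothetical and exist only for this illustration. The `query` helper handles just a tiny `tag.class` subset of CSS selectors.

```python
from html.parser import HTMLParser

# Elements that never have children; a small, incomplete list for the sketch.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """Hypothetical minimal tree node: tag name, attributes, children, text."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a naive tree from parser events (no HTML5 error recovery)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name, then close it.
        node = self.current
        while node is not self.root and node.name != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def parse(html):
    """Return the root node of a tree built from an HTML string."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def query(node, tag, class_name=None):
    """Return descendants matching a tag and, optionally, a class name."""
    matches = []
    for child in node.children:
        classes = (child.attrs.get("class") or "").split()
        if child.name == tag and (class_name is None or class_name in classes):
            matches.append(child)
        matches.extend(query(child, tag, class_name))
    return matches


doc = parse('<html><body><p class="intro">Hello</p><p>Bye</p></body></html>')
for p in query(doc, "p", "intro"):
    print(p.name, p.attrs, p.text)
```

A spec-compliant parser differs mainly in how it recovers from malformed markup (unclosed tags, misnested elements, implied elements), which this sketch makes no attempt to handle.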

--

Anyone who agrees with this pull request could submit an Approve review to it.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 06:00:10 -05:00
Reference: github-starred/awesome-python#11159