mirror of
https://github.com/vinta/awesome-python.git
synced 2026-05-07 00:14:48 -05:00
[PR #2818] [MERGED] Add JustHTML library to README.md #11159
📋 Pull Request Information
Original PR: https://github.com/vinta/awesome-python/pull/2818
Author: @EmilStenstrom
Created: 12/9/2025
Status: ✅ Merged
Merged: 1/8/2026
Merged by: @vinta
Base: master ← Head: patch-1
📝 Commits (1)
c976bf8 Add JustHTML library to README.md
📊 Changes
1 file changed (+1 additions, -0 deletions)
📝 README.md (+1 -0)
📄 Description
What is this Python project?
JustHTML is a dependency-free, pure-Python HTML5 parser. It takes a string of HTML and returns a Python tree structure that you can then query and manipulate.
Comparison (A brief comparison explaining how it differs from existing alternatives.)
See comparison table.
What's the difference between this Python project and similar ones?
It's the only HTML5 parser available in Python that passes all HTML5 tests. It is very well tested, with 100% test coverage and fuzz testing.
It's fast enough: it parses Wikipedia's homepage in 0.1 s. Rust and C parsers are of course faster, but they are not as correct and are tricky to install.
It has a very nice query API: you pass in a CSS selector and get back all elements that match it.
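To make the "string in, tree out, query by selector" idea concrete, here is a minimal sketch using only the standard library. It is not JustHTML's API: the names `Node`, `TreeBuilder`, `parse`, and `query` are hypothetical, the selector support is limited to `tag`, `.class`, and `tag.class`, and `html.parser` does not do the HTML5 error recovery that a spec-compliant parser performs.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical minimal element node: name, attributes, children, text."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""

    def walk(self):
        """Yield this node and all of its descendants in document order."""
        yield self
        for child in self.children:
            yield from child.walk()


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-formed markup (no HTML5 error recovery)."""

    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in self.VOID:  # void elements never have children
            self.current = node

    def handle_endtag(self, tag):
        # Only close the element that is currently open; stray end tags are ignored.
        if self.current.name == tag and self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data


def parse(html):
    """Parse an HTML string and return the root of the resulting tree."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def query(root, selector):
    """Return nodes matching a tiny selector subset: 'tag', '.class', 'tag.class'."""
    tag, _, cls = selector.partition(".")
    matches = []
    for node in root.walk():
        classes = (node.attrs.get("class") or "").split()
        if tag and node.name != tag:
            continue
        if cls and cls not in classes:
            continue
        matches.append(node)
    return matches


doc = parse("<html><body><p class='intro'>Hello!</p><p>Bye</p></body></html>")
print([node.text for node in query(doc, "p.intro")])  # ['Hello!']
```

A real HTML5 parser differs mainly in the tree-construction step: it has to produce the same tree browsers do for malformed input, which this sketch makes no attempt at.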
--
Anyone who agrees with this pull request could submit an Approve review to it.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.