Add JustHTML library to README.md #2818
Open
What is this Python project?
JustHTML is a dependency-free, pure-Python HTML5 parser. It takes a string of HTML and returns a Python tree structure that you can then query and manipulate.
Comparison (A brief comparison explaining how it differs from existing alternatives.)
See comparison table.
What's the difference between this Python project and similar ones?
It's the only HTML5 parser available in Python that passes all HTML5 tests. It is well tested, with 100% test coverage and fuzz testing.
It's fast enough: it parses Wikipedia's homepage in 0.1 s. Rust and C parsers are of course faster, but they are less correct and trickier to install.
It has a convenient query API: you pass in a CSS selector and get back all elements that match it.
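For reviewers unfamiliar with the parse-then-query workflow described above, here is a rough sketch of the idea using only the standard library. This is not JustHTML's API: `Node`, `TreeBuilder`, `parse`, and `query` are hypothetical names made up for illustration, `html.parser` does not implement the HTML5 tree-construction algorithm, and the selector support is limited to `tag`, `.class`, and `tag.class`.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


@dataclass
class Node:
    """Hypothetical tree node; not a JustHTML type."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""


class TreeBuilder(HTMLParser):
    """Builds a naive element tree; does not follow the HTML5 spec."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs))
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag, if any.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].text += data


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def query(node, selector):
    """Return descendants matching 'tag', '.class', or 'tag.class'."""
    tag, _, cls = selector.partition(".")
    matches = []
    for child in node.children:
        classes = (child.attrs.get("class") or "").split()
        if (not tag or child.tag == tag) and (not cls or cls in classes):
            matches.append(child)
        matches.extend(query(child, selector))
    return matches


doc = parse('<div><p class="intro">Hello</p><p>World<br></p></div>')
intro = query(doc, "p.intro")
paragraphs = query(doc, "p")
print([p.text for p in paragraphs])
```

A spec-compliant parser differs mainly in how it recovers from malformed markup (misnested tags, implied elements, and so on), which this sketch makes no attempt to handle.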
--
Anyone who agrees with this pull request could submit an Approve review to it.