to resolve the URL. (UTM or Mark)
Developer: Sachin Philip Mathew
More information: https://github.com/sachinvettithanam/beautifier
ftfy
ftfy (fixes text for you) takes in bad Unicode and outputs good Unicode. Basically, it fixes all the junk characters: “quotesâ€x9d becomes "quotes"; Uìˆ becomes ü.
Ftfy (fixes text for you) translates the messy Unicode into recogn
.
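The kind of repair described above can be sketched with the standard library alone. This is only an illustration of one common failure (UTF-8 bytes wrongly decoded as Latin-1), not ftfy's actual algorithm; the helper name `undo_latin1_mojibake` is hypothetical. ftfy itself exposes `ftfy.fix_text`, which handles many more cases.

```python
# Stdlib-only illustration of one mojibake repair; this is NOT how ftfy
# is implemented, and the helper name below is hypothetical.
def undo_latin1_mojibake(text: str) -> str:
    """Reverse UTF-8 bytes that were wrongly decoded as Latin-1."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not this kind of mojibake; leave it unchanged.
        return text


# Simulate the bad decode, then undo it.
broken = "ü".encode("utf-8").decode("latin-1")
fixed = undo_latin1_mojibake(broken)
print(repr(broken), "->", repr(fixed))
```

Text that is already correct fails either the encode or the decode step, so the helper returns it untouched instead of damaging it.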
esmre– regular expression accelerator.
ftfy– automatically fixes broken Unicode text.
Transformation
unidecode– converts Unicode text to ASCII.
Character encoding
uniout– prints readable characters instead of escaped strings.
chardet– character encoding detector, compatible with Python 2 and 3.
xpinyin– a library to convert Chinese characters to pinyin.
pangu.py– the spacing between CJK and
Converter
Untangle-converts XML documents into Python objects to simplify processing
Hodor-configuration-driven wrapper around lxml and cssselect
Clean
Bleach-cleans up HTML (requires html5lib)
Sanitize-brings sanity to the messy world of data
Text Processing
Libraries for parsing and manipulating text
General
difflib-helpers for computing differences (Python standard library)
Levenshtein-fast computation of edit distance and string similarity
fuzzywuzzy-fuzzy string matching
esmre-the regular exp
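As a small illustration of the difflib entry in the list above, the sketch below uses three functions from the standard-library module: `unified_diff`, `SequenceMatcher`, and `get_close_matches`. The sample strings are made up for the example.

```python
import difflib

old = ["alpha", "beta", "gamma"]
new = ["alpha", "beta2", "gamma"]

# Line-oriented diff in unified format; lineterm="" because the
# input lines carry no trailing newlines.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm="")
)
print("\n".join(diff_lines))

# Similarity ratio between two strings, a float in [0, 1].
ratio = difflib.SequenceMatcher(None, "kitten", "sitting").ratio()
print(f"similarity: {ratio:.2f}")

# Closest candidates to a misspelled word, best match first.
matches = difflib.get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
print(matches)
```

For large inputs or when true edit distance is needed, the Levenshtein and fuzzywuzzy entries listed alongside difflib are the usual next step.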
safe string escaping for XML/HTML/XHTML.
xmltodict– a Python module that makes working with XML feel like working with JSON.
xhtml2pdf– converts HTML/CSS to PDF.
untangle– easily transforms an XML file into a Python object.
Clean
bleach– cleans up HTML (requires html5lib).
Sanitize– brings sanity to the chaotic world of data.
Text Processing
Libraries for parsing and manipulating plain text.
General
difflib–
library for processing times and dates, inspired by Moment.js.
pytime– an easy-to-use Python module for manipulating dates and times through strings.
pytz– world time zone definitions, modern and historical; brings the time zone database into Python.
when.py– user-friendly functions for common date and time operations.
Text Processing
Libraries used to parse and manipulate text.
General
chardet– character encoding detector, compatible
-generates a DOM for HTML/XML documents according to the WHATWG specification, which all modern browsers follow.
Feedparser-parses RSS/Atom feeds.
MarkupSafe-provides safe string escaping for XML, HTML, and XHTML.
Xmltodict-a Python module that makes processing XML feel like processing JSON.
Xhtml2pdf-converts HTML/CSS to PDF.
Untangle-easily converts an XML file into a Python object.
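To show what safe escaping means in the MarkupSafe entry above, here is a minimal sketch that uses the standard library's `html.escape` as a stand-in. It is not MarkupSafe's API, and the sample input is invented for the example.

```python
from html import escape

# Untrusted text that would run as a script if inserted verbatim.
user_input = '<script>alert("hi")</script>'

# escape() replaces &, <, > and, with the default quote=True, both quote
# characters, so the browser displays the text instead of executing it.
safe = escape(user_input)
page = f"<p>{safe}</p>"
print(page)
```

Escaping at the point where text enters markup is the same idea MarkupSafe packages up for template engines.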
Clean
Bleach-cleans up HTML (html5lib is required).
Sanitize-brings c
The WHATWG specification is now the common specification followed by browsers
Feedparser-parses RSS/Atom feeds
MarkupSafe-safe string escaping for XML/HTML/XHTML in Python
Xmltodict-lets you work with XML just as you do with JSON
Xhtml2pdf-HTML/CSS to PDF converter
Untangle-converts XML documents into Python objects to simplify processing
Hodor-configuration-driven wrapper around lxml and cssselect
Clean
Bleach-cleans HTML (requires html5li
, pure Python implementation.
html5lib– generates a DOM for HTML/XML documents based on the WHATWG specification, which all modern browsers now follow.
feedparser– parses RSS/Atom feeds.
MarkupSafe– provides safe string escaping for XML/HTML/XHTML.
xmltodict– a Python module that makes working with XML feel like working with JSON.
xhtml2pdf– converts HTML/CSS to PDF.
untangle– easily converts an XML file into a Python object.
Clean
bleach–