Python Crawler Tools on GitHub

Source: Internet
Author: User
Network

General:

  • urllib - network library (standard library)
  • requests - network library
  • grab - network library (based on pycurl)
  • pycurl - network library (binding to libcurl)
  • urllib3 - Python HTTP library with thread-safe connection pooling, file POST support, and high availability
  • httplib2 - network library
  • RoboBrowser - a simple, Pythonic library for browsing web pages without a standalone browser
  • MechanicalSoup - a Python library for automating interaction with websites
  • mechanize - stateful, programmable web browsing library
  • socket - low-level network interface (standard library)
  • Unirest for Python - one of a set of lightweight HTTP libraries available in multiple languages
  • hyper - HTTP/2 client for Python
  • PySocks - an updated and actively maintained version of SocksiPy, with bug fixes and some additional features; can be used as a drop-in replacement for the socket module

Asynchronous:

  • treq - API similar to requests (based on Twisted)
  • aiohttp - HTTP client/server for asyncio (PEP 3156)

Web Crawler Frameworks

General-purpose crawlers:

  • grab - web crawler framework (based on pycurl/multicurl)
  • scrapy - web crawler framework (based on Twisted)
  • pyspider - a powerful crawler system
  • cola - a distributed crawler framework

Other:

  • portia - visual crawler based on Scrapy
  • restkit - HTTP resource kit for Python; lets you easily access HTTP resources and build objects around them
  • demiurge - a micro crawler framework based on PyQuery

HTML/XML Parsing

General:

  • lxml - efficient HTML/XML processing library written in C; supports XPath
  • cssselect - parses the DOM tree with CSS selectors
  • pyquery - parses the DOM tree with jQuery-style selectors
  • BeautifulSoup - slower HTML/XML processing library, implemented in pure Python
  • html5lib - builds the DOM of HTML/XML documents according to the WHATWG specification, which is the specification current browsers follow
  • feedparser - parses RSS/Atom feeds
  • MarkupSafe - safely escaped strings for XML/HTML/XHTML
  • xmltodict - lets you work with XML as if it were JSON
  • xhtml2pdf - HTML/CSS to PDF converter
  • untangle - converts XML documents into Python objects for easier processing
  • hodor - configuration-driven wrapper around lxml and cssselect

Cleaning:

  • bleach - cleans up HTML (requires html5lib)
  • sanitize - brings sanity to a messy data world

Text Processing

Libraries for parsing and manipulating text.

General:

  • difflib - tool for computing differences (standard library)
  • Levenshtein - fast computation of edit distance and string similarity
  • fuzzywuzzy - fuzzy string matching
  • esmre - regular expression accelerator
  • ftfy - automatically tidies up broken Unicode text
  • Unidecode - converts Unicode text to ASCII

Character encoding:

  • uniout - prints escaped strings as readable characters
  • chardet - character encoding detector compatible with Python 2/3
  • xpinyin - a library that converts Chinese characters to pinyin
  • pangu.py - spacing between CJK and alphanumeric text

Slugify:

  • awesome-slugify - Python slugify library that can preserve Unicode
  • python-slugify - Python slugify library that translates Unicode to ASCII
  • unicode-slugify - tool for generating Unicode slugs
  • pytils - small tools for handling Russian strings (includes pytils.translit.slugify)

General parsers:

  • PLY - Python implementation of the lex and yacc parsing tools
  • pyparsing - general framework for generating parsers

Person names:

  • python-nameparser - parses human names into their components

Phone numbers:

  • phonenumbers - parses, formats, stores, and validates international phone numbers

User agent strings:

  • python-user-agents - browser user agent parser
  • HTTP Agent Parser - Python HTTP agent parser
  • fake-useragent - user agent spoofing for Python based on global browser usage statistics
  • user_agent - user agent data generator

Special Format Processing

Libraries for parsing and processing specific file formats.

General:

  • tablib - handles tabular data in XLS, CSV, JSON, YAML, and other formats
  • textract - extracts text from any document; supports Word, PowerPoint, PDF, and more
  • messytables - parses messy tabular data
  • rows - a common, elegant interface to tabular data in multiple formats (currently CSV, HTML, XLS, TXT; more planned)

Office:

  • python-docx - reads, queries, and modifies Microsoft Word 2007/2008 docx files
  • xlwt / xlrd - read and write data and formatting information from Excel files
  • XlsxWriter - a Python module for creating Excel .xlsx files
  • xlwings - a BSD-licensed library that makes it easier for Excel and Python to call each other
  • openpyxl - a library for reading and editing Excel 2010 xlsx/xlsm/xltx/xltm files
  • Marmir - takes Python data structures and converts them into spreadsheets

PDF:

  • PDFMiner - a tool for extracting information from PDF documents
  • PyPDF2 - a library for splitting, merging, and transforming PDF files
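To make the standard-library urllib entry in the network list more concrete, here is a minimal sketch of two steps a crawler typically needs before fetching anything: resolving relative links and preparing a request with a custom User-Agent. The URL, the user agent string, and the helper names `resolve_links` and `build_request` are assumptions made for illustration; they do not come from any of the listed projects.

```python
from urllib.parse import urljoin, urlsplit
from urllib.request import Request

# Hypothetical values chosen for illustration only.
BASE_URL = "https://example.com/docs/index.html"
USER_AGENT = "example-crawler/0.1"


def resolve_links(base_url, hrefs):
    """Resolve hrefs against base_url and keep only same-host http(s) links."""
    base_host = urlsplit(base_url).netloc
    resolved = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        parts = urlsplit(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == base_host:
            resolved.append(absolute)
    return resolved


def build_request(url, user_agent=USER_AGENT):
    """Create (but do not send) a GET request carrying a custom User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    links = resolve_links(BASE_URL, ["page2.html", "/about", "mailto:a@b.c"])
    for link in links:
        req = build_request(link)
        print(req.get_method(), req.full_url)
```

Nothing here touches the network. Passing one of these `Request` objects to `urllib.request.urlopen` would perform the actual fetch, and libraries such as requests or urllib3 from the list wrap the same idea in a higher-level API.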

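As a rough illustration of the difflib entry in the text processing list, the sketch below scores string similarity and picks the closest match from a set of candidates using only the standard library. The helper names `similarity` and `closest`, the sample names, and the 0.6 cutoff are assumptions chosen for the example.

```python
import difflib


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def closest(name, candidates, cutoff=0.6):
    """Return the best match for name among candidates, or None if none is close enough."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Library names taken from the lists above; the misspelling is deliberate.
    libs = ["requests", "urllib3", "httplib2", "scrapy", "pyquery"]
    print(closest("reqeusts", libs))
    print(similarity("scrapy", "scrappy"))
```

Levenshtein and fuzzywuzzy from the same list address the same kind of problem with different scoring methods, so difflib is best seen as a dependency-free starting point.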