Kenneth Reitz, a star engineer of the Python world, has released another new library!

Source: Internet
Author: User

Python programmers, especially those who write web crawlers, know the HTTP library requests, which perfectly embodies the phrase "for humans".

Its author, Kenneth Reitz (kennethreitz on GitHub), is a good-looking photography enthusiast who has written many libraries. Besides requests, there is pipenv, a tool that better integrates package management and environment management, the date-time library Maya, and others.

In the last two days he launched a new project called requests-html, "HTML Parsing for Humans" (link: https://github.com/kennethreitz/requests-html). As the name implies, it is used to parse HTML documents. The project passed 3,000 stars in just two days.

When writing crawlers, we used to parse HTML pages with BeautifulSoup or lxml. BeautifulSoup's API is relatively friendly, but its parsing performance is low; lxml uses XPath syntax and parses quickly, but the resulting code is not very readable. The HTML parsing library Kenneth Reitz has now released inherits the fine tradition of the requests library: for humans.

We know that requests is only responsible for the network request and does not parse the response. You can think of requests-html as a requests library that can also parse HTML documents.

The requests-html codebase is actually very small, currently fewer than 200 lines. It is a secondary wrapper around existing frameworks that makes them easier for developers to call, and it relies on pyquery, requests, lxml, and other libraries.

Installation

pip install requests-html

How to use

>>> from requests_html import session
>>> # Returns a Response object
>>> r = session.get('https://python.org/')

Get all links

>>> r.html.links
{'/users/membership/', '/about/gettingstarted/'}
>>> # Use a CSS selector to get an element
>>> about = r.html.find('#about')[0]
>>> print(about.text)
About
Applications
Quotes
Getting Started
Help
Python Brochure
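To make concrete what a "set of all links" means, here is a minimal sketch that uses only Python's standard library html.parser. It is not how requests-html is implemented (the article says it builds on pyquery and lxml); the names LinkCollector, collect_links, and the SAMPLE markup are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers the href of every <a> tag into a set."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def collect_links(markup):
    """Return the set of href values found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links


# Made-up sample markup, loosely modeled on a navigation list.
SAMPLE = """
<ul id="about">
  <li><a href="/about/">About</a></li>
  <li><a href="/about/gettingstarted/">Getting Started</a></li>
  <li><a href="/about/">About (duplicate)</a></li>
</ul>
"""

# Using a set means the duplicated href is reported only once.
print(sorted(collect_links(SAMPLE)))
```

Because the result is a set, repeated links collapse into one entry and the order is not guaranteed, which matches the shape of the output shown in the session above.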

Another very appealing feature is the ability to convert HTML to Markdown text.

>>> # Convert the HTML to Markdown text
>>> print(about.markdown)
* [About](/about/)
* [Applications](/about/apps/)
* [Quotes](/about/quotes/)
* [Getting Started](/about/gettingstarted/)
* [Help](/about/help/)
* [Python Brochure](http://brochure.getpython.info/)
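The kind of transformation behind that output can be sketched with the standard library alone. This is an illustrative stand-in, not the conversion requests-html actually performs; it handles only `<a>` elements, and the names LinkListToMarkdown and links_to_markdown and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkListToMarkdown(HTMLParser):
    """Hypothetical helper: turns each <a> element into a Markdown bullet."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip()
            self.lines.append(f"* [{label}]({self._href})")
            self._href = None


def links_to_markdown(markup):
    """Return one Markdown bullet per link found in the HTML string."""
    parser = LinkListToMarkdown()
    parser.feed(markup)
    parser.close()
    return "\n".join(parser.lines)


# Made-up sample markup for illustration.
SAMPLE = (
    '<ul><li><a href="/about/">About</a></li>'
    '<li><a href="/about/apps/">Applications</a></li></ul>'
)

print(links_to_markdown(SAMPLE))
```

A real converter also has to deal with headings, nested lists, emphasis, and escaping, which is why a library that exposes this as a single attribute is convenient.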

