Python BeautifulSoup4 User Guide
Preface:
Yesterday I installed the legendary BeautifulSoup4. If you have not installed it yet, please refer to my previous blog post:
Install BeautifulSoup in Python3 Win7
You can install BeautifulSoup by following the simple steps in it. It is very easy, so don't be afraid.
Now that BeautifulSoup4 is installed, let's enjoy this bowl of soup. Haha.
Introduction:
BeautifulSoup is a Python library that can extract data from HTML or XML files. It works with your favorite parser to let you navigate, search, and modify documents. Beautiful Soup can save you hours or even days of work.
Before you knew about BeautifulSoup, you would probably have used the re module to extract what you wanted from a piece of text. The re module can also pull key information out of an HTML file, but you may have to rack your brains and read a lot of material to get the regular expressions right. With BeautifulSoup, all of this becomes super simple. Yes, it really is that simple.
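To make the description above concrete, here is a minimal sketch of navigating, searching, and modifying a document. The HTML snippet and the variable names are made up for illustration, and Python's built-in "html.parser" is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, made up for this illustration.
html_doc = """
<html><head><title>Demo page</title></head>
<body>
<p class="intro">Hello, <b>soup</b>!</p>
<a href="http://example.com/one">One</a>
<a href="http://example.com/two">Two</a>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Navigate: walk down the tree by tag name.
title_text = soup.title.string

# Search: collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Modify: replace the text inside the first <b> tag.
soup.b.string = "BeautifulSoup"
intro_text = soup.find("p", class_="intro").get_text()

print(title_text)
print(links)
print(intro_text)
```

Doing the same with the re module would need a separate, carefully written pattern for each of the three steps.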
Practice:
The best way to learn bs4 is to read the official bs4 documentation. There is also a Chinese version: Official documentation. It took me about an afternoon to read through the official bs4 documentation and write out the sample programs in it. If you don't have that much time, take a look at the code below and you will get started quickly. Believe me (*^_^*)
__author__ = 'mrchen'
from bs4 import BeautifulSoup
# This is an example
html_doc = "
How to use Python + BeautifulSoup to get the Chinese alt text of img tags
Hello:
See the following code:
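from bs4 import BeautifulSoup

# Put your HTML text here (BeautifulSoup parses markup; it does not download a URL)
html = "your address"
soup = BeautifulSoup(html, "html.parser")
# Find every <img> tag and print its alt attribute
for img in soup.find_all("img"):
    print(img.attrs["alt"])

Remember to accept this answer!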
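One thing to watch out for: indexing attrs["alt"] raises a KeyError when an img tag has no alt attribute. A minimal sketch that skips such tags is shown here; the HTML string and file names are invented for illustration, and tag.get() is used because it returns None for a missing attribute.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration; the second <img> has no alt attribute.
html = (
    '<div>'
    '<img src="a.png" alt="美丽的汤">'
    '<img src="b.png">'
    '<img src="c.png" alt="示例图片">'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")

# tag.get("alt") returns None instead of raising KeyError when alt is missing,
# so tags without alt text are simply filtered out.
alts = [img.get("alt") for img in soup.find_all("img") if img.get("alt")]

for alt in alts:
    print(alt)
```

Because the HTML is passed in as a Python 3 str, the Chinese alt text comes back as ordinary str values with no extra decoding step.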
Problems parsing HTML with BeautifulSoup in Python
Use these two parameters: findAll('div', {'class': 'content'})
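As a rough sketch of that call in context (the HTML below is made up for illustration): in bs4 the method is named find_all, with findAll kept as the older spelling, and the same filter can also be written with the class_ keyword argument.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration.
html = """
<div class="content">First block</div>
<div class="sidebar">Not wanted</div>
<div class="content main">Second block</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Two parameters: the tag name and a dict of attribute filters.
by_dict = soup.find_all("div", {"class": "content"})

# Equivalent form using the class_ keyword ("class" is a reserved word in Python).
by_kwarg = soup.find_all("div", class_="content")

texts = [div.get_text() for div in by_dict]
print(texts)
```

Note that a div whose class attribute lists several classes still matches as long as one of them is "content".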