A small example of fetching HTML information from a web page with a Python program

Source: Internet
Author: User

This article introduces a small example of using a Python program to fetch HTML information from a web page. The method it uses is also the basis for writing crawlers in Python. Readers who need it can use it as a reference.

There are several ways to crawl web data. Generally they are: sending HTTP requests directly from code, simulating a browser to request data (usually when login verification is required), and controlling a real browser to capture the data. This article sets aside the complicated cases and gives a small example of reading data from a simple web page:

Target data

Save the hyperlinks to all of the players listed on this page of the ITTF website.

Data request

I really like libraries that are designed around how people think, such as requests. If you just want the text of a page, one line is enough:

    doc = requests.get(url).text
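The one-liner above does no error handling. As a minimal sketch, assuming you want a timeout and a failure on HTTP error codes, it could be wrapped in a small helper. The name `fetch_html` and its `session` and `timeout` parameters are hypothetical and do not come from the original program; only `requests.get`, `raise_for_status()` and `.text` are part of the requests library.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the text of the page at `url`, or raise if the request fails.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Use the caller's session if one is given, otherwise the requests module.
    http = session if session is not None else requests
    response = http.get(url, timeout=timeout)
    # Turn 4xx/5xx answers into an exception instead of returning an error page.
    response.raise_for_status()
    return response.text
```

Calling `doc = fetch_html(url)` would then stand in for the one-liner. Passing a `requests.Session()` as `session` reuses the same connection when many pages are fetched in a loop.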

Parse HTML to get data

Take BeautifulSoup as an example: it supports obtaining tags and links and traversing the HTML hierarchy. See its documentation for reference. The following fragment, written for the ITTF website, collects the links found at a specified location on a specified page.

    import requests
    from bs4 import BeautifulSoup

    # page, linkfile and links are defined elsewhere in the surrounding program;
    # the values below are placeholders so that the fragment runs on its own.
    page = 1
    linkfile = 'links.txt'
    links = []

    url = 'http://www.ittf.com/ittf_ranking/WR_Table_3_A2.asp?Age_category_1=&age_category_2=&age_category_3=&age_category_4=&age_category_5=&category=100w&cont=&country=&gender=w&month1=4&year1=2015&s_player_name=&formv_wr_table_3_Page=' + str(page)
    doc = requests.get(url).text
    soup = BeautifulSoup(doc, 'html.parser')
    atags = soup.find_all('a')
    rank_link_pre = 'http://www.ittf.com/ittf_ranking/'

    # Append each matching link to the output file as it is found.
    with open(linkfile, 'a') as mlfile:
        for atag in atags:
            href = atag.get('href')
            # Keep only links to the player detail pages (compared case-insensitively).
            if href is not None and 'wr_table_3_a2_details.asp' in href.lower():
                link = rank_link_pre + href
                links.append(link)
                mlfile.write(link + '\n')
                print('Fetch link: ' + link)
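The filtering step in the fragment above can also be separated from the network request, which makes it possible to try it on a string of HTML without contacting the site. The following is a minimal sketch under stated assumptions, not the author's code: the function name `extract_detail_links`, its parameters, and the `SAMPLE_HTML` markup (including its `id` query parameter) are made up for illustration. `find_all` with `href=True` and `urljoin` are existing BeautifulSoup and standard-library calls.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

RANK_LINK_PRE = "http://www.ittf.com/ittf_ranking/"
DETAILS_PAGE = "wr_table_3_a2_details.asp"


def extract_detail_links(html, base_url=RANK_LINK_PRE, marker=DETAILS_PAGE):
    """Return absolute URLs of all <a> tags whose href contains `marker`."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute at all.
    for atag in soup.find_all("a", href=True):
        href = atag["href"]
        # Compare in lowercase so the match does not depend on letter case.
        if marker in href.lower():
            links.append(urljoin(base_url, href))
    return links


# Made-up markup for illustration; the real ranking page will differ.
SAMPLE_HTML = """
<table>
  <tr><td><a href="WR_Table_3_A2_Details.asp?id=1">Player One</a></td></tr>
  <tr><td><a href="index.asp">Home</a></td></tr>
  <tr><td><a name="top">anchor without href</a></td></tr>
</table>
"""

detail_links = extract_detail_links(SAMPLE_HTML)
```

Using `urljoin` instead of string concatenation keeps the result correct even if a page returns an href that is already absolute.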
