Beautiful Soup is a Python library whose main purpose is to extract data from web pages. The following article introduces BeautifulSoup, the HTML text parsing library used in Python crawlers. It covers the material in considerable detail and should be a useful reference for anyone learning the topic, the need for frien
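As a rough illustration of the kind of extraction described above, here is a minimal sketch. It assumes the third-party `bs4` package is installed, and the sample HTML string and variable names are made up for this example rather than taken from the article.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document; a real crawler would download this text.
html = """
<html><body>
  <h1>Example page</h1>
  <a href="/first">First link</a>
  <a href="/second">Second link</a>
</body></html>
"""

# "html.parser" selects Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(html, "html.parser")

# Text of the first <h1> element.
title = soup.h1.get_text()

# (link text, href) pairs for every <a> element, in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```

`find_all` returns the matching tags in document order, so the list comprehension keeps the links in the order they appear on the page.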
Python Crawler Basics

1. Get web page text

Fetch the HTML text content of a web page from its URL with the urllib2 package and return it.

    # coding: utf-8
    # Python 2 code: urllib2, reload(sys) and sys.setdefaultencoding do not exist in Python 3.
    import requests, json, time, re, os, sys
    import urllib2

    # set to UTF-8 mode
    reload(sys)
    sys.setdefaultencoding("utf-8")

    def gethtml(url):
        response = urllib2.urlopen(url)
        HT
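The snippet above targets Python 2 and is cut off. For readers on Python 3, a comparable fetch step might look like the sketch below. It is not the article's code: the name `get_html`, the `timeout` parameter, and the UTF-8 fallback are assumptions made for illustration. The demo call uses a `data:` URL so that it runs without network access; an ordinary `http(s)` URL is passed in the same way.

```python
from urllib.request import urlopen


def get_html(url, timeout=10):
    """Return the body at `url` decoded to text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the Content-Type header if there is one;
        # falling back to UTF-8 is an assumption of this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# Offline demo: a percent-encoded data: URL carrying "<p>hello</p>".
sample = get_html("data:text/html;charset=utf-8,%3Cp%3Ehello%3C%2Fp%3E")
print(sample)
```

Decoding explicitly at the point of download replaces the `reload(sys)` / `sys.setdefaultencoding` workaround, which Python 3 does not provide.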
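    # Python 2 code: the HTMLParser module and unicode() are Python 2 names.
    from HTMLParser import HTMLParser

    class MyParser(HTMLParser):
        def __init__(self, key):
            self.data = []
            self.key = key
            self.falg = False
            self.linkname = ""
            HTMLParser.__init__(self)

        def handle_starttag(self, tag, attrs):
            # Start collecting when the tag we are looking for opens.
            if self.key and tag == self.key:
                self.falg = True

        def handle_data(self, data):
            # Keep text only while inside the target tag.
            if self.falg and data:
                self.data.append(unicode(eval(repr(data)), "utf-8"))

        def handle_endtag(self, tag):
            if self.key and tag == self.key:
                sel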
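The same idea can be sketched in Python 3 with the standard-library `html.parser` module. This is an illustrative rewrite rather than the article's code: the class name `TagTextParser` and the attribute `inside` are hypothetical, and because the original `handle_endtag` is cut off, resetting the flag there is an assumption about what it did.

```python
from html.parser import HTMLParser


class TagTextParser(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, key):
        super().__init__()
        self.key = key        # tag name to watch for, e.g. "a"
        self.inside = False   # True while the parser is inside that tag
        self.data = []        # collected text fragments

    def handle_starttag(self, tag, attrs):
        if tag == self.key:
            self.inside = True

    def handle_data(self, data):
        # Skip whitespace-only fragments; str is already Unicode in Python 3.
        if self.inside and data.strip():
            self.data.append(data.strip())

    def handle_endtag(self, tag):
        # Assumed behaviour: stop collecting when the target tag closes.
        if tag == self.key:
            self.inside = False


parser = TagTextParser("a")
parser.feed('<p>Intro</p><a href="/x">First</a> and <a href="/y">Second</a>')
parser.close()
texts = parser.data
print(texts)
```

A single boolean flag is enough for tags that do not nest inside themselves; nested occurrences of the same tag would need a depth counter instead.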