Share the web crawler Code implemented by python.
Write a python 3. The code is very simple and can be directly pasted with the code.
# Test rdpimport urllib. requestimport re <br> # login account information data = {} data ['fromurl'] = 'data ['fromurltemp '] = 'data ['loginid'] = '000000' data ['Password'] = '000000' user _ agent = 'mozilla/12345 (compatible; MSIE 5.5; Windows NT) '# logon address # url = 'HTTP: // 192.168.1.111: 8080/logincheck' postdata = urllib. parse. urlencode (data) postdata = postdata. encode ('utf-8') headers = {'user-agent': user_agent} # login res = urllib. request. urlopen (u Rl, postdata) # obtain the page html <br> strResult = (res. read (). decode ('utf-8') # use A regular expression to retrieve all the tags p = re. compile (R' <a href = "(. *?) ". *?> (.*?) </A> ') for m in p. finditer (strResult): print (m. group (1) # group (1) is the content in href, and group (2) is the text in label.
I did not take the time to process cookies, exceptions, and so on. After all, I just wanted to learn python by writing crawlers.
Articles you may be interested in:
- Python web crawler instance code
- Example of using python web crawler to collect Lenovo words
- Python blog article crawler implementation code
- A small example of Python web crawler
- Python web crawler (Classic and practical)
- Python implementation code of Netease news Crawlers
- Python web crawler code
- Python crawlers that start searching from Baidu
- How to compile distributed crawler in python
- Python-based weather forecasting collector (Web Crawler) Tutorial