If the PyCharm console shows garbled characters after running a script, check the encoding under File >> Settings >> Editor >> File Encodings (the scripts here use utf-8).
Crawling Web pages
# -*- coding: utf-8 -*-
import requests

# Use utf-8 as the default encoding for Chinese text (Python 2 only)
import sys
reload(sys)
sys.setdefaultencoding('utf-8')

# Simulate a browser
hea = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36'}

url = ''  # link to crawl
# Pass the variable url, not the string 'url'
html = requests.get(url, headers=hea)
print html.text
print 'Start crawling content ...'
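The script above is Python 2. As a rough Python 3 sketch of the same idea (send a browser-like User-Agent and decode the page as utf-8), the following uses only the standard library `urllib.request` instead of `requests`. The function names `build_request` and `fetch_text` are made up for this illustration and are not from the original post.

```python
import urllib.request

# Same browser-like User-Agent string as in the script above.
BROWSER_UA = ("Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36")


def build_request(url):
    """Build a GET request that identifies itself as a desktop browser."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})


def fetch_text(url, encoding="utf-8", timeout=10):
    """Download the page and decode the raw bytes explicitly as text.

    Assumes the page is utf-8 unless told otherwise; undecodable bytes
    are replaced rather than raising an error.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode(encoding, errors="replace")
```

Calling `fetch_text` with the link to crawl returns the page source as a string. In Python 3 there is no `reload(sys)` or `sys.setdefaultencoding`, so the encoding is passed explicitly when decoding.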
Crawling with a simulated login (with a cookie)
The key is how to obtain the cookie.
P.S. If the cookie changes on every login, pay attention to which part changes; the part that changes is often a random code.
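One way to find the part that changes is to capture the cookie from two separate logins and compare them field by field. The sketch below is only an illustration: `parse_cookie_header` and `changed_fields` are hypothetical helper names, and it assumes the cookie was copied as a raw `name=value; name2=value2` header string.

```python
def parse_cookie_header(raw):
    """Split a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for part in raw.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


def changed_fields(first_login, second_login):
    """Return the cookie names whose values differ between two logins."""
    a = parse_cookie_header(first_login)
    b = parse_cookie_header(second_login)
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

Given two captured strings that differ only in one field, `changed_fields` returns a list containing just that field's name, which is the value to refresh before each crawl.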
Method 1: Use the packet capture tool Fiddler
Method 2: Inspect the page elements directly in IE
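# -*- coding: utf-8 -*-
import requests
import re

# Paste the raw Cookie header captured with Fiddler or IE here
cook = {'Cookie': ''}
url = ''  # link to crawl
# requests.get() has no `cookie` argument; send the raw header via `headers`
html = requests.get(url, headers=cook).content
print html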
This article is from the "Michelle" blog; reprinting is declined.
A first look at Python crawlers