Cookies are data that certain websites store on the user's local terminal (usually encrypted) in order to identify users and track sessions.
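To make this concrete, here is a minimal sketch of what a cookie looks like when a server sets it. The header value and the names sessionid and abc123 are made up for illustration, and the sketch assumes Python 3's http.cookies module rather than the Python 2 libraries used in the script in this post.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value, as a server might send it
raw_header = "sessionid=abc123; Path=/; HttpOnly"

# Parse the header into named cookie entries ("morsels")
cookie = SimpleCookie()
cookie.load(raw_header)

# Each morsel carries a value plus attributes such as the path
session = cookie["sessionid"]
print(session.value)
print(session["path"])
```

The browser (or a crawler) sends the name and value back on later requests, which is how the site recognizes the same session.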
Some sites require a login before certain pages can be accessed; until you log in, crawling the content of those pages is not allowed. We can use the urllib2 library (together with cookielib) to save the cookie obtained at login, and then crawl the other pages with it.
Contents of cscook.py:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import cookielib
import urllib2
import urllib

# Name of the file in which the cookie is saved
filename = 'cookie.txt'
# Create a MozillaCookieJar instance to hold the cookie
cookie = cookielib.MozillaCookieJar(filename)
# Build an opener that handles cookies when opening URLs
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
# Login username and password
values = {"username": "1767340368", "password": "xxxxxxxx"}
# Encode the form data for the simulated login
postdata = urllib.urlencode(values)
# Login URL
loginUrl = "http://home.51cto.com/index?reback=http://www.51cto.com/"
# Log in
result = opener.open(loginUrl, postdata)
# Save the cookie to cookie.txt
cookie.save(ignore_discard=True, ignore_expires=True)
# Use the cookie to request another URL
gradeUrl = "http://blog.51cto.com/1767340368"
result = opener.open(gradeUrl)
print result.read()
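Once cookie.txt has been written, a later run can reuse it instead of logging in again. The following is a minimal sketch of that idea, not part of the original script. It assumes Python 3, where cookielib and urllib2 became http.cookiejar and urllib.request, and the helper name build_opener_with_saved_cookies is made up for illustration.

```python
import http.cookiejar
import os
import urllib.request


def build_opener_with_saved_cookies(filename="cookie.txt"):
    """Return (opener, jar); cookies are loaded from filename if it exists."""
    jar = http.cookiejar.MozillaCookieJar(filename)
    if os.path.exists(filename):
        # Use the same flags as the save() call so that session cookies
        # and expired cookies written earlier are read back as well
        jar.load(ignore_discard=True, ignore_expires=True)
    # The opener attaches matching cookies from the jar to each request
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

Calling opener.open("http://blog.51cto.com/1767340368").read() on the returned opener would then send the saved cookie with the request. Whether the site still accepts it depends on how long the server keeps the session alive.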
Using cookies in Python to simulate a website login