urllib2 provides a wide range of ways to work with URL-based resources, and you plug in handlers to get the features you need. Automatic redirects and cookie parsing and storage are both implemented as handlers that react to the HTTP status code and headers of each response (status-code-based redirection is also implemented in urllib.FancyURLopener).
The step-by-step code is as follows:
- import urllib2 as ul2, cookielib as cl, urllib as ul
- cj = cl.CookieJar()
- opener = ul2.build_opener(ul2.HTTPCookieProcessor(cj))  # more handlers can be added here
- # ul2.install_opener(opener)  # optionally register it as the default opener, so that later calls to ul2.urlopen use it as well
- # Create and send a request
- req_sohu = ul2.Request('http://www.sohu.com')  # a Request object lets you send a POST and set extra parameters; for a plain GET you can pass the URL string to opener.open directly
- res_sohu = opener.open(req_sohu)
- content_sohu = res_sohu.read()  # read the whole response body
- # The request headers are available on req_sohu and the response headers on res_sohu (res_sohu.info())
- # Getting at the cookies through the opener is clumsy:
- cookies = filter(lambda h: isinstance(h, ul2.HTTPCookieProcessor), opener.handlers)[0].cookiejar
- # The simpler way: this is the same object as cj, which was passed to HTTPCookieProcessor above, so cj can be used directly
For more on working with the stored cookies, see cookielib.CookieJar.
The cookie jar lives as long as the opener does, and it takes care of expiry, domain, and path matching automatically, which makes it well suited to simulating logins and scraping pages.
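To make the cookie handling above a little more concrete, here is a minimal sketch of preparing a login request and inspecting the jar afterwards. The URL `http://example.com/login`, the form field names `user` and `passwd`, and the helper names `build_login_request` and `dump_cookies` are all made up for illustration; the `try`/`except` import is only there so the same sketch also runs on Python 3, where these modules were renamed.

```python
try:  # Python 2, as used in this post
    import urllib2 as ul2, cookielib as cl
    from urllib import urlencode
except ImportError:  # Python 3 names for the same modules
    import urllib.request as ul2, http.cookiejar as cl
    from urllib.parse import urlencode

cj = cl.CookieJar()
opener = ul2.build_opener(ul2.HTTPCookieProcessor(cj))

# The processor keeps a reference to the jar we passed in,
# so looking it up through opener.handlers gives back cj itself.
processor = [h for h in opener.handlers
             if isinstance(h, ul2.HTTPCookieProcessor)][0]
same_jar = processor.cookiejar is cj


def build_login_request(url, username, password):
    """Build a POST request; the field names are hypothetical."""
    data = urlencode({'user': username, 'passwd': password}).encode('ascii')
    return ul2.Request(url, data)  # supplying data makes it a POST


def dump_cookies(jar):
    """Return (domain, path, name, value, expires) for every stored cookie."""
    return [(c.domain, c.path, c.name, c.value, c.expires) for c in jar]


# Hypothetical endpoint and credentials; nothing is sent until opener.open is called.
login_req = build_login_request('http://example.com/login', 'alice', 'secret')
```

Calling `opener.open(login_req)` would send the form, and any cookies set by the response would then show up in `dump_cookies(cj)` and be sent back on later requests made through the same opener.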
------------
The title mentions redirects, but they never show up in the code.
That is because, on the line that creates the opener with build_opener, ul2.HTTPRedirectHandler is one of the handlers added by default.
The handlers that build_opener adds by default are:
ProxyHandler
UnknownHandler
HTTPHandler
HTTPDefaultErrorHandler
HTTPRedirectHandler
FTPHandler
FileHandler
HTTPSHandler (when Python is built with SSL support)
HTTPErrorProcessor
and so on.
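Since the redirect handling is otherwise invisible, here is a small sketch that lists the handlers an opener actually carries and swaps in a redirect handler that records what it follows. `RecordingRedirectHandler` and its `seen` attribute are hypothetical names, not part of urllib2; the sketch relies on `build_opener` skipping a default handler when an instance of a subclass is passed in. As before, the fallback import is only there so it also runs on Python 3.

```python
try:  # Python 2, as used in this post
    import urllib2 as ul2
except ImportError:  # Python 3 name for the same module
    import urllib.request as ul2

# Names of the handlers that a plain build_opener() installs.
default_names = [h.__class__.__name__ for h in ul2.build_opener().handlers]


class RecordingRedirectHandler(ul2.HTTPRedirectHandler):
    """Hypothetical subclass that remembers every redirect it is asked to follow."""

    def __init__(self):
        self.seen = []  # list of (status code, new URL) pairs

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.seen.append((code, newurl))
        # Delegate to the stock implementation, which builds the follow-up request.
        return ul2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


recorder = RecordingRedirectHandler()
# Passing the subclass instance replaces the default HTTPRedirectHandler.
opener = ul2.build_opener(recorder)
redirect_handlers = [h for h in opener.handlers
                     if isinstance(h, ul2.HTTPRedirectHandler)]
```

After a request such as `opener.open('http://www.sohu.com')`, `recorder.seen` would hold one entry per redirect that was followed, which is a convenient way to see the default behaviour at work.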