The urllib2.urlopen() function on its own does not handle authentication, cookies, or other advanced HTTP features. To use them, create a custom opener object with the build_opener() function.
1. build_opener([handler1 [, handler2, ...]])
Each handler argument is a handler instance. Commonly used handlers are HTTPBasicAuthHandler, HTTPCookieProcessor, and ProxyHandler.
The object returned by build_opener() has an open() method, which works the same way as the urlopen() function.
To modify the HTTP headers sent with each request, set the opener's addheaders attribute:
import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.open('http://www.example.com/')
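As a complement to opener.addheaders, which applies to every request made through the opener, headers can also be attached to a single request with a urllib2.Request object. The following is only a sketch: the helper name make_request is hypothetical, and the import falls back to urllib.request because urllib2 was renamed to that in Python 3.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 name for the same API


def make_request(url, user_agent='Mozilla/5.0'):
    """Build a Request that carries its own User-Agent header (hypothetical helper)."""
    return urllib2.Request(url, headers={'User-Agent': user_agent})


req = make_request('http://www.example.com/')
opener = urllib2.build_opener()
# opener.open(req) would send the request with the header set above.
# No network call is made in this sketch.
```

Request stores header names in capitalized form, so the header set above can be read back with req.get_header('User-agent').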
2. install_opener(opener)
Installs the given opener object as the global default opener used by urlopen().
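A minimal sketch of how install_opener() is typically used, reusing the User-agent value from the example above. The import fallback to urllib.request (the Python 3 name of urllib2) is an addition for portability and is not part of the original text.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 name for the same API

# Build an opener with a custom header, then make it the global default.
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib2.install_opener(opener)

# From here on, plain urllib2.urlopen(url) calls go through `opener`,
# so they send the custom User-agent header as well.
```

Installing an opener affects every later urlopen() call in the process; calling opener.open() directly keeps the change local.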
3. Password authentication (HTTPBasicAuthHandler)
An HTTPBasicAuthHandler instance provides the add_password() method for registering credentials.
h.add_password(realm, uri, user, passwd)
realm is the name or description associated with the authentication and is determined by the remote server. uri is the base URL. user and passwd specify the user name and password, respectively.
import urllib2
auth = urllib2.HTTPBasicAuthHandler()
auth.add_password('Administrator', 'http://www.example.com', 'Dave', '123456')
opener = urllib2.build_opener(auth)
u = opener.open('http://www.example.com/evilplan.html')
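Because the realm is chosen by the server, it is not always known in advance. One possible approach, sketched here as an assumption rather than something the text above prescribes, is to pass an HTTPPasswordMgrWithDefaultRealm to the handler and register the credentials with a realm of None, which then applies to any realm under the given base URL. The import fallback to urllib.request is for Python 3.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 name for the same API

# A password manager that falls back to the None realm when no
# realm-specific entry matches.
mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
mgr.add_password(None, 'http://www.example.com', 'Dave', '123456')

auth = urllib2.HTTPBasicAuthHandler(mgr)
opener = urllib2.build_opener(auth)

# opener.open('http://www.example.com/evilplan.html') would now answer a
# basic-auth challenge from that site, whatever realm the server reports.
```

The credentials are only sent after the server responds with a 401 challenge, so the first request still goes out without them.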
4. Cookie processing (HTTPCookieProcessor)
import urllib2, cookielib
cookie = cookielib.CookieJar()
cookiehand = urllib2.HTTPCookieProcessor(cookie)
opener = urllib2.build_opener(cookiehand)
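Once requests have been made through such an opener, the CookieJar holds whatever cookies the server set and sends them back on later requests. The sketch below shows one way to inspect the jar; the helper name cookie_pairs is hypothetical, and the imports fall back to urllib.request and http.cookiejar, the Python 3 names of urllib2 and cookielib.

```python
try:
    import urllib2                      # Python 2
    import cookielib
except ImportError:
    import urllib.request as urllib2    # Python 3 names for the same APIs
    import http.cookiejar as cookielib


def cookie_pairs(jar):
    """Return the (name, value) pairs currently stored in a CookieJar."""
    return [(c.name, c.value) for c in jar]


jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# After a call such as opener.open('http://www.example.com/'),
# cookie_pairs(jar) would list the cookies the server set.
# No request is made here, so the jar is still empty.
```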
5. Proxy (ProxyHandler)
ProxyHandler(proxies) takes a dictionary, proxies, that maps protocol names (http, ftp, and so on) to the URL of the corresponding proxy server.
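from urllib2 import ProxyHandler, HTTPBasicAuthHandler, build_opener
proxy = ProxyHandler({'http': 'http://someproxy.com:8080'})
auth = HTTPBasicAuthHandler()
auth.add_password(realm, uri, user, passwd)  # supply the four values described in section 3
opener = build_opener(auth, proxy)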
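The mapping passed to ProxyHandler can be sketched on its own as follows. Routing ftp through the same proxy as http is an assumption made purely for illustration, and the import falls back to urllib.request, the Python 3 name of urllib2.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 name for the same API

# One entry per protocol; the ftp entry is illustrative only.
proxies = {
    'http': 'http://someproxy.com:8080',
    'ftp': 'http://someproxy.com:8080',
}
proxy = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy)

# Passing an empty dictionary turns proxying off, including any proxy
# settings that would otherwise be detected from the environment.
direct = urllib2.build_opener(urllib2.ProxyHandler({}))
```

When ProxyHandler is given no argument at all, it reads the proxy settings from the environment instead.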
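You can also pass a proxy directly to urlopen(), but only with the older urllib module: urllib.urlopen() accepts a proxies argument, and urllib2.urlopen() does not.
import urllib
proxy = 'http://%s:%s@%s' % ('userName', 'password', 'proxy')
information = urllib.urlopen("http://www.example.com", proxies={'http': proxy})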
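Within urllib2 itself, a proxy that requires a user name and password can be configured through ProxyHandler by embedding the credentials in the proxy URL. This is a sketch under stated assumptions: the helper name proxy_url is hypothetical, 'userName', 'password', and 'proxy' are the placeholder values from the example above, and the imports fall back to the Python 3 module names.

```python
try:
    import urllib2                      # Python 2
    from urllib import quote
except ImportError:
    import urllib.request as urllib2    # Python 3 names for the same APIs
    from urllib.parse import quote


def proxy_url(user, password, host):
    """Build an http proxy URL with percent-encoded credentials (hypothetical helper)."""
    return 'http://%s:%s@%s' % (quote(user, safe=''), quote(password, safe=''), host)


proxy = proxy_url('userName', 'password', 'proxy')
opener = urllib2.build_opener(urllib2.ProxyHandler({'http': proxy}))

# opener.open('http://www.example.com') would be routed through the proxy.
# No request is made in this sketch.
```

Percent-encoding the credentials keeps characters such as '@' or ':' in a password from being mistaken for URL delimiters.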