1. Proxy settings

By default, urllib2 uses the environment variable http_proxy to set the HTTP proxy. If you want to control the proxy explicitly in the program, unaffected by environment variables, you can use the following method:

    import urllib2

    enable_proxy = True
    proxy_handler = urllib2.ProxyHandler({"http": "http://some-proxy.com:8080"})
    null_proxy_handler = urllib2.ProxyHandler({})  # empty dict: use no proxy at all

    if enable_proxy:
        opener = urllib2.build_opener(proxy_handler)
    else:
        opener = urllib2.build_opener(null_proxy_handler)

    urllib2.install_opener(opener)

Note one detail here: urllib2.install_opener() sets the global opener of urllib2. This is convenient for later use, but it rules out finer-grained control, such as using two different proxy settings in one program. A better practice is to leave the global setting alone and call the opener's open method directly instead of the global urlopen function.

2. Timeout settings

In older versions of Python, the urllib2 API does not expose a timeout setting. To set a timeout, you can only change the global default timeout of the socket module.

    import urllib2
    import socket

    socket.setdefaulttimeout(10)  # time out after 10 seconds
    urllib2.socket.setdefaulttimeout(10)  # another way to do the same thing

Since Python 2.6, the timeout can be set directly through the timeout parameter of urllib2.urlopen().

    import urllib2

    response = urllib2.urlopen('http://www.google.com', timeout=10)

3. Adding a specific header to the HTTP request

To add a header, you need to use a Request object:

    import urllib2

    # uri is the URL to request, defined elsewhere
    request = urllib2.Request(uri)
    request.add_header('User-Agent', 'fake-client')
    response = urllib2.urlopen(request)

Pay special attention to the headers that the server side checks:

User-Agent: some servers or proxies check this value to determine whether the request was initiated by a browser.

Content-Type: when you use a REST interface, the server checks this value to determine how the content of the HTTP body should be parsed.
Common Content-Type values are:

application/xml: used in XML RPC calls, such as RESTful/SOAP calls
application/json: used in JSON RPC calls
application/x-www-form-urlencoded: used when a browser submits a web form

When you call a RESTful or SOAP service provided by a server, a wrong Content-Type setting causes the server to reject the request.

4. Redirect

By default, urllib2 automatically follows redirects for 3xx HTTP return codes, with no manual configuration. To detect whether a redirect has occurred, just check whether the URL of the response differs from the URL of the request.

    import urllib2

    response = urllib2.urlopen('http://www.google.cn')
    # A redirect occurred if the final URL is not the one requested
    redirected = response.geturl() != 'http://www.google.cn'

If you do not want redirects to be followed automatically, you can either use the lower-level httplib library or define a custom HTTPRedirectHandler class.

    import urllib2

    class RedirectHandler(urllib2.HTTPRedirectHandler):
        # Returning None leaves the redirect unfollowed; the opener then
        # raises urllib2.HTTPError for the 301/302 response.
        def http_error_301(self, req, fp, code, msg, headers):
            pass

        def http_error_302(self, req, fp, code, msg, headers):
            pass

    opener = urllib2.build_opener(RedirectHandler)
    opener.open('http://www.google.cn')

5. Cookie

urllib2 also handles cookies automatically. If you need to get the value of a particular cookie entry, you can do this:

    import urllib2
    import cookielib

    cookie = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
    response = opener.open('http://www.google.com')
    for item in cookie:
        if item.name == 'some_cookie_item_name':
            print item.value

6. Using the HTTP PUT and DELETE methods

urllib2 only supports the HTTP GET and POST methods. To use HTTP PUT and DELETE, you would normally have to use the lower-level httplib library. Nonetheless, urllib2 can be made to issue HTTP PUT or DELETE requests in the following way:

    import urllib2

    # uri and data are the target URL and request body, defined elsewhere
    request = urllib2.Request(uri, data=data)
    request.get_method = lambda: 'PUT'  # or 'DELETE'
    response = urllib2.urlopen(request)

This approach is a hack, but it causes no real problems in practice.
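The get_method override above can also be packaged as a small Request subclass, so that the HTTP verb is chosen when the request is built rather than patched on afterwards. The following is only a sketch under stated assumptions: MethodRequest and its http_method parameter are hypothetical names that are not part of urllib2, the URL is a placeholder, and the import fallback assumes that on Python 3 the same classes are found in urllib.request.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3: urllib2 was merged into urllib.request


class MethodRequest(urllib2.Request):
    """Request whose HTTP verb is chosen by the caller (hypothetical helper)."""

    def __init__(self, url, data=None, headers=None, http_method=None):
        # Store the verb first, then let the base class do its normal setup.
        self._http_method = http_method
        urllib2.Request.__init__(self, url, data, headers or {})

    def get_method(self):
        # Fall back to the default GET/POST choice when no verb was given.
        if self._http_method is not None:
            return self._http_method
        return urllib2.Request.get_method(self)


# Placeholder URL and body, for illustration only; no request is sent here.
request = MethodRequest('http://www.example.com/resource/1',
                        data=b'payload', http_method='PUT')
print(request.get_method())
```

Such a request object can then be passed to urllib2.urlopen or to an opener's open method in the same way as an ordinary Request.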
7. Getting the HTTP return code

For 200 OK, the HTTP return code can be obtained by calling the getcode() method of the response object returned by urlopen. For error return codes (such as 4xx and 5xx), however, urlopen raises an exception. In that case, check the code attribute of the exception object:

    import urllib2

    try:
        response = urllib2.urlopen('http://restrict.web.com')
    except urllib2.HTTPError, e:
        print e.code

8. Debug log

When using urllib2, you can turn on the debug log as follows, so that the content sent and received is printed to the screen. This is convenient for debugging and can, to some extent, save you the work of capturing packets.

    import urllib2

    http_handler = urllib2.HTTPHandler(debuglevel=1)
    https_handler = urllib2.HTTPSHandler(debuglevel=1)
    opener = urllib2.build_opener(http_handler, https_handler)
    urllib2.install_opener(opener)
    response = urllib2.urlopen('http://www.google.com')
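As noted in the proxy section, install_opener changes the global opener, so the debug log above is switched on for every urlopen call in the program. A possible alternative is to keep the debug opener local and call its open method only where logging is wanted. This is a minimal sketch: build_debug_opener is a hypothetical helper name, and the import fallback assumes that on Python 3 the same handler classes are found in urllib.request.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3: urllib2 was merged into urllib.request


def build_debug_opener(debuglevel=1):
    """Return an opener with debug output enabled (hypothetical helper)."""
    http_handler = urllib2.HTTPHandler(debuglevel=debuglevel)
    https_handler = urllib2.HTTPSHandler(debuglevel=debuglevel)
    # Handler instances passed here replace the default handlers of the same type.
    return urllib2.build_opener(http_handler, https_handler)


debug_opener = build_debug_opener()
# Only requests made through debug_opener.open(...) use these handlers;
# urllib2.urlopen(...) elsewhere in the program keeps the global opener.
```

Calling debug_opener.open with a URL then prints the debug output for that request only.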
Use of the Python standard library urllib2