The Python standard library contains many useful utility classes, but the standard library documentation does not always make the details of their use clear. urllib2, the HTTP client library, is one example. Here is a summary of some urllib2 usage details.
1 Proxy Settings
2 Timeout Settings
3 Adding a specific Header to the HTTP Request
4 Redirect
5 Cookies
6 PUT and DELETE methods using HTTP
7 Getting the return code for HTTP
8 Debug Log
1 Proxy Settings
By default, urllib2 uses the environment variable http_proxy to set the HTTP proxy. If you want to control the proxy explicitly in your program, without being affected by environment variables, you can use the following method.
    import urllib2

    enable_proxy = True
    proxy_handler = urllib2.ProxyHandler({"http": "http://some-proxy.com:8080"})
    null_proxy_handler = urllib2.ProxyHandler({})  # empty dict: use no proxy at all

    if enable_proxy:
        opener = urllib2.build_opener(proxy_handler)
    else:
        opener = urllib2.build_opener(null_proxy_handler)

    urllib2.install_opener(opener)
One detail to note here is that urllib2.install_opener() sets the global opener for urllib2. This is convenient for later use, but it rules out finer-grained control, such as using two different proxy settings in one program. A better practice is not to change the global setting with install_opener, and to call the opener's open method directly instead of the global urlopen function.
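As a rough illustration of the per-opener approach described above, the following sketch builds two independent openers, one routed through a proxy and one that bypasses proxies, without touching the global opener. The proxy address is the same placeholder as before, configured_proxies is a hypothetical helper added only for inspection, and the fallback import is an assumption for readers on Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

# One opener that sends HTTP traffic through a (placeholder) proxy ...
proxied_opener = urllib2.build_opener(
    urllib2.ProxyHandler({"http": "http://some-proxy.com:8080"}))

# ... and one that ignores any proxy configured in the environment.
direct_opener = urllib2.build_opener(urllib2.ProxyHandler({}))


def configured_proxies(opener):
    """Hypothetical helper: proxy mapping of the opener's active ProxyHandler, or {}."""
    for handler in opener.handlers:
        if isinstance(handler, urllib2.ProxyHandler):
            return handler.proxies
    return {}


proxied_settings = configured_proxies(proxied_opener)
direct_settings = configured_proxies(direct_opener)

# proxied_opener.open(url) would go through the proxy, while
# direct_opener.open(url) would connect directly; urllib2.urlopen is unchanged.
```

Because neither opener is installed globally, both can be used side by side in the same program.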
2 Timeout Settings
In older versions of Python, the urllib2 API did not expose a timeout setting, so the only way to set a timeout was to change the global timeout value of the socket module.
    import urllib2
    import socket

    socket.setdefaulttimeout(10)          # time out after 10 seconds
    urllib2.socket.setdefaulttimeout(10)  # another way to do the same thing
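From Python 2.6 onward, urlopen and an opener's open method also accept a timeout argument, which avoids changing the global socket default. The sketch below shows one way this might look. The "silent server" is a hypothetical local socket that accepts connections but never replies, standing in for a slow host, and the fallback import is an assumption for Python 3.

```python
import socket

try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

# Hypothetical stand-in for a slow server: it listens but never sends a reply.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)
port = silent_server.getsockname()[1]

# An empty ProxyHandler keeps environment proxies out of this local test.
opener = urllib2.build_opener(urllib2.ProxyHandler({}))

timed_out = False
try:
    # The timeout applies to this call only, not to every socket in the process.
    opener.open("http://127.0.0.1:%d/" % port, timeout=0.5)
except (socket.timeout, urllib2.URLError):
    # Depending on where the timeout strikes, either exception may be raised.
    timed_out = True
finally:
    silent_server.close()
```

Catching both socket.timeout and URLError is a cautious choice, since the exception type depends on whether the timeout occurs while connecting or while waiting for the response.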
3 Adding a specific Header to the HTTP Request
To add a header, you need to use a Request object:
    import urllib2

    request = urllib2.Request(uri)  # uri is the URL you want to request
    request.add_header('User-Agent', 'fake-client')
    response = urllib2.urlopen(request)
Pay special attention to the following headers, which the server side checks:
User-Agent: some servers or proxies check this value to determine whether the request was initiated by a browser.
Content-Type: when using a REST interface, the server checks this value to determine how the content in the HTTP body should be parsed.
Common values are:
application/xml: used in XML RPC calls, such as RESTful/SOAP calls
application/json: used in JSON RPC calls
application/x-www-form-urlencoded: used when a web form is submitted by the browser
...
When you use RPC to call a RESTful or SOAP service provided by a server, a wrong Content-Type setting causes the server to reject the request.
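To make the header notes above concrete, here is a small sketch that builds a JSON request with both headers set. The URL and payload are placeholders invented for illustration, the request is built but not sent, and the fallback import is an assumption for Python 3.

```python
import json

try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent

# Hypothetical endpoint and payload, for illustration only.
url = "http://example.com/api/items"
payload = json.dumps({"name": "demo"}).encode("utf-8")

request = urllib2.Request(url, data=payload)
request.add_header("User-Agent", "fake-client")
request.add_header("Content-Type", "application/json")

# add_header() stores names with only the first letter capitalized,
# so they have to be looked up in that form.
user_agent = request.get_header("User-agent")
content_type = request.get_header("Content-type")

# Supplying data turns the request into a POST.
method = request.get_method()

# urllib2.urlopen(request) would send it; omitted because the URL is a placeholder.
```

The capitalization detail matters mostly when reading headers back from the Request object; on the wire, HTTP header names are case-insensitive.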
4 Redirect
By default, urllib2 automatically follows redirects for 3xx HTTP return codes, with no manual configuration required. To detect whether a redirect has occurred, just check whether the URL of the response differs from the URL of the request.
    import urllib2

    response = urllib2.urlopen('http://www.google.cn')
    # A redirect happened if the final URL is not the one we asked for.
    redirected = response.geturl() != 'http://www.google.cn'
If you do not want redirects to be followed automatically, you can either use the lower-level httplib library or define a custom HTTPRedirectHandler class.
    import urllib2

    class RedirectHandler(urllib2.HTTPRedirectHandler):
        # Doing nothing here leaves the 301/302 response unhandled.
        def http_error_301(self, req, fp, code, msg, headers):
            pass

        def http_error_302(self, req, fp, code, msg, headers):
            pass

    opener = urllib2.build_opener(RedirectHandler)
    opener.open('http://www.google.cn')
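The two behaviours can be compared side by side with a throwaway local server. In the sketch below, DemoHandler and its /old and /new paths are hypothetical, and the fallback imports are an assumption for Python 3. One thing worth noticing is that when the custom handler returns None, the unhandled 3xx response surfaces as an HTTPError, from which the status code and the Location header can be read.

```python
import threading

try:  # Python 2, as used in this article
    import urllib2
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # assumption: Python 3 equivalents
    import urllib.request as urllib2
    from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    """Leave 3xx responses unhandled instead of following them."""

    def http_error_301(self, req, fp, code, msg, headers):
        return None

    http_error_302 = http_error_303 = http_error_307 = http_error_301


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
thread = threading.Thread(target=server.serve_forever)
thread.daemon = True
thread.start()
start_url = "http://127.0.0.1:%d/old" % server.server_address[1]

no_proxy = urllib2.ProxyHandler({})  # ignore environment proxies for the local test

# Default behaviour: the redirect is followed silently.
response = urllib2.build_opener(no_proxy).open(start_url)
final_url = response.geturl()
redirected = final_url != start_url
response.close()

# With the custom handler, the 302 is reported as an HTTPError instead.
status, location = None, None
try:
    urllib2.build_opener(no_proxy, NoRedirectHandler).open(start_url)
except urllib2.HTTPError as e:
    status, location = e.code, e.headers.get("Location")

server.shutdown()
server.server_close()
```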
5 Cookies
urllib2 can also handle cookies automatically, once an HTTPCookieProcessor is attached to the opener. If you need to get the value of a particular cookie, you can do this:
    import urllib2
    import cookielib

    cookie = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
    response = opener.open('http://www.google.com')
    # The jar now holds every cookie the server set on this response.
    for item in cookie:
        if item.name == 'some_cookie_item_name':
            print(item.value)
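For a version that can be run without depending on an external site, the sketch below points the same CookieJar setup at a hypothetical local server that sets a single cookie. The handler class, the cookie name and value, and the Python 3 fallback imports are all assumptions made for illustration.

```python
import threading

try:  # Python 2, as used in this article
    import urllib2
    import cookielib
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # assumption: Python 3 equivalents
    import urllib.request as urllib2
    import http.cookiejar as cookielib
    from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieDemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server that sets one cookie on every response."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), CookieDemoHandler)
thread = threading.Thread(target=server.serve_forever)
thread.daemon = True
thread.start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

cookie = cookielib.CookieJar()
opener = urllib2.build_opener(
    urllib2.ProxyHandler({}),              # ignore environment proxies
    urllib2.HTTPCookieProcessor(cookie))   # store cookies in the jar
opener.open(url).close()

# Collect the cookies the jar picked up from the response.
cookie_values = dict((item.name, item.value) for item in cookie)

server.shutdown()
server.server_close()
```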
6 PUT and DELETE methods using HTTP
urllib2 only supports the HTTP GET and POST methods directly; to use HTTP PUT and DELETE you would normally have to fall back on the lower-level httplib library. Nonetheless, urllib2 can be made to issue HTTP PUT or DELETE requests in the following way:
    import urllib2

    request = urllib2.Request(uri, data=data)
    request.get_method = lambda: 'PUT'  # or 'DELETE'
    response = urllib2.urlopen(request)
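If the get_method override is needed in several places, it could be wrapped in a small Request subclass. The following is only a sketch: MethodRequest is a hypothetical helper, not part of urllib2, the URL is a placeholder, and the fallback import is an assumption for Python 3.

```python
try:
    import urllib2  # Python 2, as used in this article
except ImportError:
    import urllib.request as urllib2  # assumption: Python 3 equivalent


class MethodRequest(urllib2.Request):
    """Hypothetical helper: a Request whose HTTP method is chosen explicitly."""

    def __init__(self, url, method, data=None, headers=None):
        # Call the base initializer directly; Request is an old-style class
        # in Python 2, so super() is avoided here.
        urllib2.Request.__init__(self, url, data=data, headers=headers or {})
        self._forced_method = method

    def get_method(self):
        return self._forced_method


put_request = MethodRequest("http://example.com/items/1", "PUT", data=b"{}")
delete_request = MethodRequest("http://example.com/items/1", "DELETE")

put_method = put_request.get_method()
delete_method = delete_request.get_method()

# urllib2.urlopen(put_request) would send the PUT; omitted because the URL is a placeholder.
```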
7 Getting the return code for HTTP
For 200 OK, the HTTP return code can be obtained by calling the getcode() method of the response object returned by urlopen. For error codes such as 404 or 500, however, urlopen raises an exception. In that case, check the code attribute of the exception object:
    import urllib2

    try:
        response = urllib2.urlopen('http://restrict.web.com')
    except urllib2.HTTPError as e:
        print(e.code)
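Both cases can be folded into one function that returns the status code whether or not an exception is raised. The sketch below tries this against a hypothetical local server. fetch_status, StatusDemoHandler and its paths are names invented for illustration, and the fallback imports are an assumption for Python 3.

```python
import threading

try:  # Python 2, as used in this article
    import urllib2
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # assumption: Python 3 equivalents
    import urllib.request as urllib2
    from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusDemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /ok answers 200, everything else 404."""

    def do_GET(self):
        code = 200 if self.path == "/ok" else 404
        self.send_response(code)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_status(opener, url):
    """Return the HTTP status code, whether or not the opener raises."""
    try:
        response = opener.open(url)
    except urllib2.HTTPError as e:
        return e.code  # error responses arrive as exceptions
    code = response.getcode()
    response.close()
    return code


server = HTTPServer(("127.0.0.1", 0), StatusDemoHandler)
thread = threading.Thread(target=server.serve_forever)
thread.daemon = True
thread.start()
base = "http://127.0.0.1:%d" % server.server_address[1]

opener = urllib2.build_opener(urllib2.ProxyHandler({}))  # ignore environment proxies
ok_code = fetch_status(opener, base + "/ok")
missing_code = fetch_status(opener, base + "/missing")

server.shutdown()
server.server_close()
```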
8 Debug Log
When using urllib2, you can turn on the debug log as follows. The content sent and received is then printed to the screen, which is convenient for debugging and can, to some extent, save you the work of capturing packets.
    import urllib2

    httpHandler = urllib2.HTTPHandler(debuglevel=1)
    httpsHandler = urllib2.HTTPSHandler(debuglevel=1)
    opener = urllib2.build_opener(httpHandler, httpsHandler)

    urllib2.install_opener(opener)
    response = urllib2.urlopen('http://www.google.com')
This article is from the "Fire" blog; please keep this source attribution: http://fire7758.blog.51cto.com/993821/1610851
Usage details of the Python standard library urllib2