1. urllib2 can accept a Request object to set the headers of a URL request, while urllib accepts only a URL. This means that with urllib you cannot disguise your User-Agent string and so on.
2. urllib provides the urlencode method for building a query string, and urllib2 does not. This is why urllib is often used together with urllib2.
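To make the difference concrete, here is a minimal sketch. The URLs are placeholders, and the User-Agent string is the same Internet Explorer example used later in this article:

import urllib
import urllib2

# urllib2: a Request object can carry request headers, e.g. a disguised
# User-Agent string (http://www.example.com/ is just a placeholder).
headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
req = urllib2.Request('http://www.example.com/', headers=headers)
response = urllib2.urlopen(req)

# urllib: urlopen only accepts a URL, but urlencode builds the query string,
# which is why urllib is often used alongside urllib2.
query = urllib.urlencode({'_c': 'User', '_m': 'Info'})
data = urllib.urlopen('http://www.example.com/index.php?' + query).read()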
How urllib sends GET data (the params are appended to the URL as a query string)
#!/usr/bin/python
# coding: utf-8
import urllib, urllib2

uri = 'http://www.qieke.com/index.php'
params = {
    '_c': 'User',
    '_m': 'Info',
}
params['user_id'] = 123456
params['user_name'] = 'all Aberdeen'
params = urllib.urlencode(params)

# GET: the encoded params are appended to the URL as a query string
ret = urllib.urlopen("%s?%s" % (uri, params))
code = ret.getcode()
ret_data = ret.read()
How urllib sends POST data (the params are passed as the second argument to urlopen)
#!/usr/bin/python
# coding: utf-8
import urllib, urllib2

uri = 'http://www.qieke.com/index.php'
params = {
    '_c': 'User',
    '_m': 'Info',
}
params['user_id'] = 123456
params['user_name'] = 'all Aberdeen'
params = urllib.urlencode(params)

# POST: passing the encoded params as the second argument makes urlopen POST them
ret = urllib.urlopen(uri, params)
code = ret.getcode()
ret_data = ret.read()
urllib2
The simplest use of urllib2 is shown below:
import urllib2
response = urllib2.urlopen('http://python.org/')
Many uses of urllib2 are this simple (remember that, in addition to "http:", URLs can also start with "ftp:", "file:" and so on). However, this article focuses on the more complex case of HTTP.
HTTP is based on a request and response mechanism: the client makes a request and the server provides a response. urllib2 maps your HTTP request onto a Request object. In its simplest form you create a Request object from the URL you want to fetch; calling urlopen and passing in the Request object returns a response object for the requested URL. The response behaves like a file object, so you can call read() on it.
import urllib2
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
Remember that urllib2 uses the same interface to handle all URL schemes. For example, you can create an FTP request like this:
req = urllib2.Request('ftp://example.com/')
In the case of HTTP requests, you are allowed to do two extra things. First, you can send data, such as a form. Second, you can send extra information ("metadata") about the data or about the request itself; this is sent to the server as HTTP "headers".
Data
Sometimes you want to send data to a URL (often a URL pointing at a CGI [Common Gateway Interface] script or some other web application hook). In HTTP this is usually done with the familiar POST request, which is what your browser does when you submit an HTML form.
Not all POSTs come from forms; you can use POST to submit arbitrary data to your own application. For ordinary HTML forms, the data needs to be encoded in a standard way and then passed to the Request object as the data argument. The encoding is done with a function from urllib, not urllib2.
import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}

data = urllib.urlencode(values)
req = urllib2.Request(url, data)    # supplying data makes this a POST request
response = urllib2.urlopen(req)
the_page = response.read()
If you do not pass the data argument, urllib2 uses a GET request. One way in which GET and POST requests differ is that POST requests often have "side effects": they change the state of the system in some way (for example, placing an order for goods to be delivered to your door). Although the HTTP standard makes it clear that POSTs are intended to cause side effects and GET requests are not, nothing prevents a GET request from having side effects, nor a POST request from having none. Data can also be passed in a GET request by encoding it into the URL itself.
For example:
>>> import urllib2
>>> import urllib
>>> data = {}
>>> data['name'] = 'Somebody Here'
>>> data['location'] = 'Northampton'
>>> data['language'] = 'Python'
>>> url_values = urllib.urlencode(data)
>>> print url_values
name=Somebody+Here&language=Python&location=Northampton
>>> url = 'http://www.example.com/example.cgi'
>>> full_url = url + '?' + url_values
Headers
We will discuss one specific HTTP header here, to illustrate how to add headers to your HTTP request.
Some websites dislike being browsed by programs (as opposed to humans), or send different content to different browsers. By default urllib2 identifies itself as "Python-urllib/x.y" (where x and y are the major and minor Python version numbers, e.g. Python-urllib/2.5), which may confuse the site or simply not work. A browser identifies itself through the User-Agent header; when you create a Request object you can pass it a dictionary of headers. The following example sends the same data as above, but identifies itself as a version of Internet Explorer.
import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'Michael Foord',
          'location': 'Northampton',
          'language': 'Python'}
headers = {'User-Agent': user_agent}

data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)   # headers are passed as the third argument
response = urllib2.urlopen(req)
The response object also has two very useful methods. See the section on info and geturl below, which comes after we look at what happens when things go wrong.
Handling Exceptions
When urlopen cannot handle a response, it raises URLError (though, as usual with Python APIs, built-in exceptions such as ValueError and TypeError may also be raised).
HTTPError is the subclass of URLError raised in the specific case of HTTP URLs.
URLError
Typically, URLError is raised because there is no network connection (no route to the specified server) or the specified server does not exist. In this case, the exception has a "reason" attribute, which is a tuple containing an error code and a text error message.
For example:
>>> req = urllib2.Request('http://www.pretend_server.org')
>>> try:
...     urllib2.urlopen(req)
... except urllib2.URLError, e:
...     print e.reason
Error codes
Because the default handlers take care of redirects (codes in the 300 range), and codes in the 100-299 range indicate success, you will usually only see error codes in the 400-599 range.
BaseHTTPServer.BaseHTTPRequestHandler.responses is a useful dictionary of response codes that lists all the response codes used by RFC 2616, together with their meanings. (The dictionary itself is omitted here by the translator.)
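Since the table is not reproduced here, a quick sketch (Python 2) of looking a code up directly in that standard-library dictionary:

import BaseHTTPServer

# Maps each HTTP status code to a (short message, long explanation) tuple.
responses = BaseHTTPServer.BaseHTTPRequestHandler.responses
short_msg, long_msg = responses[404]
print short_msg   # e.g. 'Not Found'
print long_msg    # a one-line explanation of the code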
When an error is raised, the server responds with an HTTP error code and an error page. You can use the HTTPError instance as a response object for the page returned: this means that, as well as the code attribute, it also has read, geturl, and info methods.
>>> req = urllib2.Request('http://www.python.org/fish.html')
>>> try:
...     urllib2.urlopen(req)
... except urllib2.HTTPError, e:
...     print e.code
...     print e.read()
Wrapping it up
So if you want to be prepared for HTTPError or URLError, there are two basic approaches. I prefer the second one.
The first one:
from urllib2 import Request, urlopen, URLError, HTTPError

req = Request(someurl)
try:
    response = urlopen(req)
except HTTPError, e:
    print 'The server couldn\'t fulfill the request.'
    print 'Error code: ', e.code
except URLError, e:
    print 'We failed to reach a server.'
    print 'Reason: ', e.reason
else:
    # everything is fine
    pass
Note: except HTTPError must come first, otherwise except URLError will also catch an HTTPError.
The second one:
from urllib2 import Request, urlopen, URLError

req = Request(someurl)
try:
    response = urlopen(req)
except URLError, e:
    if hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    # everything is fine
    pass
info and geturl
The response object returned by urlopen (or an HTTPError instance) has two very useful methods, info() and geturl().
geturl - this returns the real URL of the page fetched. It is useful because urlopen (or the opener object used) may have followed a redirect, so the URL of the page you actually got may not be the same as the URL you requested.
info - this returns a dictionary-like object describing the page fetched, in particular the headers sent by the server. It is currently an httplib.HTTPMessage instance.
Typical headers include "Content-length", "Content-type", and so on. See the Quick Reference to HTTP Headers (http://www.cs.tut.fi/~jkorpela/http.html) for a useful list of HTTP headers together with brief explanations of their meaning and use.
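As a small illustration (using http://www.python.org purely as a stand-in URL), both methods can be called on the object returned by urlopen:

import urllib2

response = urllib2.urlopen('http://www.python.org')
print response.geturl()    # the final URL, after any redirects
print response.info()      # the httplib.HTTPMessage holding the headers
print response.info().getheader('Content-Type')   # a single header value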