Python sends HTTP requests and receives HTTP responses via GET and POST with urllib and urllib2

Source: Internet
Author: User
Tags: urlencode

http://www.cnblogs.com/poerli/p/6429673.html

Test with CGI. The script below, named test.py, is placed in Apache's cgi-bin directory:

#!/usr/bin/python
import cgi

def main():
    print "Content-type: text/html\n"
    form = cgi.FieldStorage()
    if form.has_key("ServiceCode") and form["ServiceCode"].value != "":
        # the bodies of these branches were truncated in the original; reconstructed to echo the input
        print "<h1>ServiceCode:", form["ServiceCode"].value, "</h1>"
    else:
        print "<h1>Error: ServiceCode is missing</h1>"

main()

Python sends POST and GET requests

GET request:

When you use the GET method, the request data is placed directly in the URL.
Method one:

import urllib
import urllib2

url = "http://192.168.81.16/cgi-bin/python_test/test.py?ServiceCode=aaaa"

req = urllib2.Request(url)
print req

res_data = urllib2.urlopen(req)
res = res_data.read()
print res

Method two:

import httplib

url = "http://192.168.81.16/cgi-bin/python_test/test.py?ServiceCode=aaaa"

conn = httplib.HTTPConnection("192.168.81.16")
conn.request(method="GET", url=url)

response = conn.getresponse()
res = response.read()
print res

POST request:

When you use the POST method, the data is placed in the request body; it cannot be passed in the URL, and data appended to the URL is ignored.
Method one:

import urllib
import urllib2

test_data = {'ServiceCode': 'aaaa', 'b': 'bbbbb'}
test_data_urlencode = urllib.urlencode(test_data)

requrl = "http://192.168.81.16/cgi-bin/python_test/test.py"

req = urllib2.Request(url=requrl, data=test_data_urlencode)
print req

res_data = urllib2.urlopen(req)
res = res_data.read()
print res


Method two:

import urllib
import httplib

test_data = {'ServiceCode': 'aaaa', 'b': 'bbbbb'}
test_data_urlencode = urllib.urlencode(test_data)

requrl = "http://192.168.81.16/cgi-bin/python_test/test.py"
headerdata = {"Host": "192.168.81.16"}

conn = httplib.HTTPConnection("192.168.81.16")

conn.request(method="POST", url=requrl, body=test_data_urlencode, headers=headerdata)

response = conn.getresponse()
res = response.read()
print res
How to send JSON from Python was not yet clear to the author, so the urllib.urlencode(test_data) approach is used for now.
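The urlencode step itself can be checked in isolation, without any server. A minimal sketch, using the Python 3 name urllib.parse.urlencode (the successor of Python 2's urllib.urlencode) and the same sample payload as above:

```python
from urllib.parse import urlencode

# the same sample payload used in the POST examples above
test_data = {'ServiceCode': 'aaaa', 'b': 'bbbbb'}

# dict insertion order is preserved in modern Python, so the pairs
# come out in the order they were defined
print(urlencode(test_data))  # ServiceCode=aaaa&b=bbbbb
```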

The differences among the urllib, urllib2, and httplib modules
httplib implements the client side of the HTTP and HTTPS protocols; in Python, the urllib and urllib2 modules provide a higher layer of encapsulation on top of httplib.

The functions used in the examples above are described below:
1. The HTTPConnection constructor

httplib.HTTPConnection(host[, port[, strict[, timeout]]])
This constructor represents one interaction with the server, that is, one request/response.
host identifies the server host (server IP or domain name).
port defaults to 80.
strict defaults to False; when True, a BadStatusLine exception is raised if the status line returned by the server cannot be parsed.
For example:
conn = httplib.HTTPConnection("192.168.81.16", 80) establishes a connection with the server.
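Constructing the connection object does not open the socket yet (it is opened lazily on the first request), so the constructor can be exercised safely offline. A minimal sketch using http.client, the Python 3 successor of httplib:

```python
import http.client

# constructing an HTTPConnection does not connect to the server yet;
# the socket is only opened when a request is actually sent
conn = http.client.HTTPConnection('192.168.81.16', 80, timeout=5)
print(conn.host, conn.port)  # 192.168.81.16 80
conn.close()
```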


2. The HTTPConnection.request(method, url[, body[, headers]]) function
This sends a request to the server.
method is the request method, usually POST or GET.

For example:

method="POST" or method="GET"

url is the requested resource (a page or a CGI script; here we have a CGI script).

For example:

url="http://192.168.81.16/cgi-bin/python_test/test.py" requests a CGI script

or

url="http://192.168.81.16/python_test/test.html" requests a page
body is the data to submit to the server, either JSON (which requires the json module) or url-encoded as in the format above.
headers is the dict of HTTP request headers, e.g. headerdata = {"Host": "192.168.81.16"}
For example:
test_data = {'ServiceCode': 'aaaa', 'b': 'bbbbb'}
test_data_urlencode = urllib.urlencode(test_data)
requrl = "http://192.168.81.16/cgi-bin/python_test/test.py"
headerdata = {"Host": "192.168.81.16"}
conn = httplib.HTTPConnection("192.168.81.16", 80)
conn.request(method="POST", url=requrl, body=test_data_urlencode, headers=headerdata)
After use, the connection should be closed with conn.close().
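To illustrate the two body formats just mentioned, here is a minimal sketch comparing the url-encoded and JSON encodings of the same payload (Python 3 module names, as an assumption about a modern environment):

```python
import json
from urllib.parse import urlencode

payload = {'ServiceCode': 'aaaa', 'b': 'bbbbb'}

print(urlencode(payload))   # form body:  ServiceCode=aaaa&b=bbbbb
print(json.dumps(payload))  # JSON body:  {"ServiceCode": "aaaa", "b": "bbbbb"}
```

When sending a JSON body, the request should also carry a Content-Type: application/json header so the server knows how to parse it.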


3. The HTTPConnection.getresponse() function
This gets the HTTP response; the returned object is an instance of HTTPResponse.


4. HTTPResponse introduction:
The HTTPResponse methods are as follows:
read([amt]) gets the response body; it reads the specified number of bytes from the response stream, or all the data when unspecified.
getheader(name[, default]) gets a response header; name is the header field name, and default specifies the return value when that header is absent.
getheaders() gets the headers in the form of a list.
For example:

date = response.getheader('date')
print date
resheader = ''
resheader = response.getheaders()
print resheader

Response header information in list form:

[('content-length', '295'), ('accept-ranges', 'bytes'), ('server', 'Apache'), ('last-modified', 'Sat, Mar 2012 10:07:02 GMT'), ('connection', 'close'), ('etag', '"e8744-127-4bc871e4fdd80"'), ('date', 'Mon, Sep 10:01:47 GMT'), ('content-type', 'text/html')]

date = response.getheader('date')
print date

This takes the value of date out of the response headers.
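The list of (name, value) tuples returned by getheaders() can be turned into a dict for convenient lookup. A minimal sketch, with a hypothetical header list shaped like the output above:

```python
# hypothetical header list, shaped like HTTPResponse.getheaders() output
headers = [('content-length', '295'),
           ('date', 'Mon, 03 Sep 2012 10:01:47 GMT'),
           ('content-type', 'text/html')]

header_map = dict(headers)
print(header_map.get('date'))         # Mon, 03 Sep 2012 10:01:47 GMT
print(header_map.get('etag', 'n/a'))  # n/a (absent, like getheader's default)
```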

So-called web crawling means reading the network resource specified by a URL out of the network stream and saving it locally.
It is similar to using a program to simulate the function of a browser: the URL is sent to the server as the content of an HTTP request, and the server's response resource is then read back.

In Python, we use the urllib2 component to crawl a web page.
urllib2 is a Python component for fetching URLs (Uniform Resource Locators).

It provides a very simple interface in the form of the urlopen function.

The simplest urllib2 application requires only four lines of code.

We create a new file, urllib2_test01.py, to get a feel for urllib2:

import urllib2
response = urllib2.urlopen('http://www.baidu.com/')
html = response.read()
print html


Press F5 to see the results of the run:

Open the Baidu homepage, right-click, and choose View Source (possible in Firefox or Chrome); you will find exactly the same content.

In other words, these four lines of code print out everything the browser receives when we visit Baidu.

This is one of the simplest examples of urllib2.

In addition to "http:", the URL can also use "ftp:", "file:", and so on.

HTTP is based on a request/response mechanism:

the client makes a request, and the server provides a response.

urllib2 uses a Request object to represent the HTTP request you make.

In its simplest form of use, you create a Request object with the address you are requesting.

Calling urlopen and passing in the Request object returns a response object for that request.

This response object behaves like a file object, so you can call response.read() on it.

We create a new file, urllib2_test02.py, to try it:

import urllib2
req = urllib2.Request('http://www.baidu.com')
response = urllib2.urlopen(req)
the_page = response.read()
print the_page

You can see that the output is the same as in test01.

urllib2 uses the same interface to handle all URL schemes. For example, you can create an FTP request as follows.

req = urllib2.Request('ftp://example.com/')

In the case of HTTP requests, you are allowed to do two additional things.

1. Sending data forms

Anyone who has done web-side development will be familiar with this:

sometimes you want to send some data to a URL (usually a URL pointing to a CGI [Common Gateway Interface] script, or some other web application hook).

In HTTP, this is often sent using the familiar POST request.

This is usually done by your browser when you submit an HTML form.

Not all POSTs come from forms; you can use POST to submit arbitrary data to your own program.

For ordinary HTML forms, the data needs to be encoded into a standard form and then passed to the Request object as the data parameter.

The encoding work uses urllib functions rather than urllib2.

We create a new file, urllib2_test03.py, to try it:

import urllib
import urllib2

url = 'http://www.someserver.com/register.cgi'
values = {'name': 'WHY',
          'location': 'SDU',
          'language': 'Python'}
data = urllib.urlencode(values)    # encoding work
req = urllib2.Request(url, data)   # send the request, passing the data form
response = urllib2.urlopen(req)    # receive the response
the_page = response.read()         # read the response content

If the data parameter is not passed, urllib2 uses a GET request.

The difference between GET and POST requests is that a POST request usually has "side effects":

it changes the state of the system in some way (for example, by leaving a pile of rubbish at your door).

Data can also be transmitted by encoding it into the URL of the GET request itself.

import urllib
import urllib2

data = {}
data['name'] = 'WHY'
data['location'] = 'SDU'
data['language'] = 'Python'

url_values = urllib.urlencode(data)
print url_values
# e.g. name=WHY&location=SDU&language=Python (dict order is not guaranteed in Python 2)

url = 'http://www.example.com/example.cgi'
full_url = url + '?' + url_values

This achieves GET transmission of the data.
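The same URL-building step can be verified offline by parsing the result back. A minimal sketch (Python 3 names, urllib.parse instead of urllib; example.com is a placeholder host):

```python
from urllib.parse import urlencode, urlparse, parse_qs

params = {'name': 'WHY', 'location': 'SDU', 'language': 'Python'}

# build the GET URL by appending the encoded query string
full_url = 'http://www.example.com/example.cgi' + '?' + urlencode(params)
print(full_url)

# parse it back to confirm the round trip
print(parse_qs(urlparse(full_url).query)['language'])  # ['Python']
```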

2. Setting headers on the HTTP request

Some sites do not like being accessed by programs (non-human access), or send different versions of content to different browsers.

By default, urllib2 identifies itself as "Python-urllib/x.y" (x and y are the Python major and minor version numbers, such as Python-urllib/2.7).

This identity may confuse the site, or simply not work.

The browser asserts its identity through the User-Agent header; when you create a Request object, you can give it a dictionary containing the headers.

The following example sends the same content as above, but simulates itself as Internet Explorer.

(Thanks to everyone for the reminders: this demo is no longer available, but the principle still applies.)

import urllib
import urllib2

url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {'name': 'WHY',
          'location': 'SDU',
          'language': 'Python'}

headers = {'User-Agent': user_agent}
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
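The header mechanism can be checked without any network access, since the Request object records the headers it is given. A minimal sketch (Python 3 names, urllib.request instead of urllib2, and a placeholder URL):

```python
import urllib.request

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'

# hypothetical URL; the request is built but never sent
req = urllib.request.Request('http://www.example.com/',
                             headers={'User-Agent': user_agent})

# urllib stores header names capitalized, so look up 'User-agent'
print(req.get_header('User-agent'))  # Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)
```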

  

The above shows how Python uses urllib2 to crawl web page content from a specified URL. It is very simple, and I hope it is helpful to everyone.
