Installation and simple application of Python requests

Requests is a Python HTTP client library, similar to urllib and urllib2. Why use requests instead of urllib2? The official documentation puts it this way:

Python's standard library urllib2 provides most of the HTTP functionality you need, but its API is awkward and counter-intuitive, and even a simple task requires a lot of code.

I have read the requests documentation too, and it really is very simple, well suited to lazy people like me. Below are some simple guides.

Good news! I just noticed that requests has a Chinese translation. If your English is poor I suggest reading it, and its content is also better than my blog. The link is http://cn.python-requests.org/en/latest/ (it only covers v1.1.0, unfortunately; sorry about the wrong link posted earlier).

1. Installation

Installation is very simple. I am on a Windows system, so I downloaded the installation package from the project page (the zipball link) and installed it with $ python setup.py install .
Of course, friends who have easy_install or pip can use them directly: easy_install requests or pip install requests .
As for Linux users, that page also lists other installation methods.
To test it, type import requests in IDLE; if no error is reported, the installation succeeded!
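As a minimal sanity check (the version string below is only illustrative; yours will depend on what you installed):

>>> import requests
>>> requests.__version__
'1.2.3'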

2. A quick first try
>>> import requests
>>> r = requests.get('http://www.zhidaow.com')  # send the request
>>> r.status_code  # return code
200
>>> r.headers['content-type']  # response header information
'text/html; charset=utf-8'
>>> r.encoding  # encoding information
'utf-8'
>>> r.text  # body content (PS: due to encoding problems, r.content is recommended here)
u'<!DOCTYPE html>\n...'

Isn't that simple? Much simpler and more intuitive than urllib2 and urllib! Then read on through the quick guide.

3. Quick Guide 3.1 Sending requests

Sending a request is simple. First, import the requests module:

>>> import requests

Next, let's get a webpage, such as the homepage of my personal blog:

>>> r = requests.get('http://www.zhidaow.com')

Now we can use the various methods and properties of this r object.
In addition, there are many other types of HTTP requests, such as POST, PUT, DELETE, HEAD and OPTIONS, which can all be sent in the same way:

>>> r = requests.post("http://httpbin.org/post")
>>> r = requests.put("http://httpbin.org/put")
>>> r = requests.delete("http://httpbin.org/delete")
>>> r = requests.head("http://httpbin.org/get")
>>> r = requests.options("http://httpbin.org/get")

Since I have no use for these at the moment, I won't go into them further.

3.2 Passing parameters in URLs

Sometimes we need to pass parameters in a URL. For example, when collecting Baidu search results we need the wd parameter (the search term) and the rn parameter (the number of results). You could compose the URL by hand, but requests provides a very slick way to do it:

>>> payload = {'wd': '张亚楠', 'rn': '100'}
>>> r = requests.get("http://www.baidu.com/s", params=payload)
>>> print r.url
u'http://www.baidu.com/s?rn=100&wd=%E5%BC%A0%E4%BA%9A%E6%A5%A0'

The garbled wd= value above is the URL-encoded form of "Zhang Yanan" (张亚楠). (It also appears that the parameters are sorted by their initial letter.)
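If you want to verify the transcoding yourself, here is a small sketch of my own (not from the original post) that decodes the percent-encoded value back using Python 2's urllib:

>>> import urllib
>>> print urllib.unquote('%E5%BC%A0%E4%BA%9A%E6%A5%A0').decode('utf-8')
张亚楠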

3.3 Getting response content

You can get the contents of a web page with r.text.

>>> r = requests.get('https://www.zhidaow.com')
>>> r.text
u'<!DOCTYPE html>\n...'

The documentation says that requests automatically decodes the content, and most Unicode text is decoded seamlessly. But under Cygwin I kept getting a UnicodeEncodeError, which was frustrating; in Python's IDLE it works perfectly normally.
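My guess (an assumption of mine, not something the requests documentation states) is that the error comes from printing Unicode to a console whose encoding cannot represent the characters, so explicitly encoding before printing may avoid it:

>>> r = requests.get('https://www.zhidaow.com')
>>> print r.text.encode('utf-8')  # encode to bytes before printing to a narrow console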
In addition, you can also get the page content with r.content.

>>> r = requests.get('https://www.zhidaow.com')
>>> r.content
b'<!DOCTYPE html>\n...'

The documentation says r.content is shown as bytes, which is why in IDLE it begins with b. Under Cygwin, though, it worked fine for me and downloaded pages correctly, so it replaces urllib2's urllib2.urlopen(url).read() function. (This is basically the feature I use the most.)

3.4 Getting the page encoding

You can use r.encoding to get the page encoding.

>>> r = requests.get('http://www.zhidaow.com')
>>> r.encoding
'utf-8'

When you send a request, requests guesses the page encoding from the HTTP headers, and that encoding is used when you access r.text. Of course, you can also modify the encoding yourself.

>>> r = requests.get('http://www.zhidaow.com')
>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

As in the example above, once you modify r.encoding, the new encoding is used directly when you get the page content.

3.5 JSON

With urllib and urllib2, using JSON means importing an extra module such as json or simplejson, but requests has the built-in r.json() function. Take an IP-lookup API as an example:

>>> r = requests.get('http://ip.taobao.com/service/getIpInfo.php?ip=122.88.60.28')
>>> r.json()['data']['country']
'China'
3.6 Page Status Code

We can use r.status_code to check the status code of a web page.

>>> r = requests.get('http://www.mengtiankong.com')
>>> r.status_code
200
>>> r = requests.get('http://www.mengtiankong.com/123123/')
>>> r.status_code
404
>>> r = requests.get('http://www.baidu.com/link?url=QeTRFOS7TuUQRppa0wlTJJr6FfIYI1DJprJukx4Qy0XnsDO_s9baoO8u1wvjxgqN')
>>> r.url
u'http://www.zhidaow.com/'
>>> r.status_code
200

The first two examples behave normally: a page that opens returns 200, and one that doesn't returns 404. The third one is a bit strange, though: it is a 302 redirect address from Baidu's search results, yet the status code shows 200. So I used a trick to make it reveal its true colours:

>>> r.history
(<Response [302]>,)

Here you can see that it went through a 302 redirect. You might think you could combine this with a regular expression to get the redirect's status code, but there is actually a simpler way:

>>> r = requests.get('http://www.baidu.com/link?url=QeTRFOS7TuUQRppa0wlTJJr6FfIYI1DJprJukx4Qy0XnsDO_s9baoO8u1wvjxgqN', allow_redirects=False)
>>> r.status_code
302

Just add the allow_redirects parameter to forbid redirects, and the redirect status code appears directly. Easy to use, isn't it? I also used this in the last section to build a small application for getting page status codes; this is the principle behind it.

3.7 Response Header Content

You can get the response headers with r.headers.

>>> r = requests.get('http://www.zhidaow.com')
>>> r.headers
{'content-encoding': 'gzip',
 'transfer-encoding': 'chunked',
 'content-type': 'text/html; charset=utf-8',
 ...}

You can see that everything is returned as a dictionary, so we can also access individual entries.

>>> r.headers['content-type']
'text/html; charset=utf-8'
>>> r.headers.get('content-type')
'text/html; charset=utf-8'
3.8 Setting the time-out period

We can set a timeout through the timeout parameter; if no response has arrived within that time, an error is raised.

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
3.9 Proxy Access

Proxies are often used during collection to avoid having your IP blocked. requests has the corresponding proxies parameter for this.

import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://www.zhidaow.com", proxies=proxies)

If the proxy requires an account and password, this form is needed:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}
3.10 Request Header Content

The request headers can be obtained with r.request.headers.

>>> r.request.headers
{'Accept-Encoding': 'identity, deflate, compress, gzip',
 'Accept': '*/*',
 'User-Agent': 'python-requests/1.2.3 CPython/2.7.3 Windows/XP'}
3.11 Customizing the request header

A disguised request header is often needed when collecting data; we can hide the default one like this:

r = requests.get('http://www.zhidaow.com')
print r.request.headers['User-Agent']
# python-requests/1.2.3 CPython/2.7.3 Windows/XP

headers = {'User-Agent': 'alexkh'}
r = requests.get('http://www.zhidaow.com', headers=headers)
print r.request.headers['User-Agent']
# alexkh
3.12 Persistent Connection Keep-alive

The keep-alive in requests is based on urllib3, and persistent connections within the same session are fully automatic: all requests in the same session automatically reuse the appropriate connection.

That is, you do not need any settings; requests implements keep-alive automatically.
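For illustration, a minimal sketch of my own (the /about path is just a made-up example): requests issued through one Session share a connection pool, so consecutive requests to the same host reuse the connection.

import requests

s = requests.Session()  # one session keeps a connection pool alive
r1 = s.get('http://www.zhidaow.com')         # opens a TCP connection
r2 = s.get('http://www.zhidaow.com/about')   # reuses the same connection (keep-alive)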

4. Simple Application 4.1 Getting the page return code
import requests

def get_status(url):
    r = requests.get(url, allow_redirects=False)
    return r.status_code

print get_status('http://www.zhidaow.com')
# 200
print get_status('http://www.mengtiankong.com/123123/')
# 404
print get_status('http://mengtiankong.com')
# 301
print get_status('http://www.baidu.com/link?url=QeTRFOS7TuUQRppa0wlTJJr6FfIYI1DJprJukx4Qy0XnsDO_s9baoO8u1wvjxgqN')
# 302
print get_status('http://www.huiya56.com/com8.intre.asp?46981.html')
# 500
