Requests: HTTP for Humans - Crawler Tutorial

Tags: auth, connection pooling, response code, sessions, SSL certificate

Requests supports HTTP keep-alive and connection pooling, supports using cookies to maintain sessions, supports file uploads, supports automatic decoding of response content, and supports automatic encoding of internationalized URLs and POST data.
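
File upload, for example, comes down to a single files parameter. A minimal sketch, assuming a local file report.txt exists and using httpbin.org/post (a public echo service) as a stand-in target:

import requests

# httpbin.org/post simply echoes back what it receives, handy for testing
# (assumption: a local file report.txt exists)
url = "http://httpbin.org/post"
files = {"file": open("report.txt", "rb")}

# Requests builds the multipart/form-data body automatically
response = requests.post(url, files=files)
print(response.text)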

Requests' documentation is very complete, and the Chinese documentation is quite good as well. Requests can fully meet the needs of today's web, supports Python 2.6-3.5, and runs perfectly under PyPy.

Open Source Address: https://github.com/kennethreitz/requests

Chinese API documentation: http://docs.python-requests.org/zh_cn/latest/index.html

Installation

Install with pip, or use easy_install:

$ pip install requests

$ easy_install requests
Basic GET requests (headers and params parameters)

1. The most basic GET request can be sent directly:

import requests

response = requests.get("http://www.baidu.com/")

# can also be written like this
# response = requests.request("get", "http://www.baidu.com/")
2. Add headers and query parameters

If you want to add headers, pass the headers parameter to set the information in the request headers. If you want to pass parameters in the URL, use the params parameter.

import requests

kw = {'wd': '长城'}  # "Great Wall"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}

# params accepts a dictionary or a string of query parameters;
# a dictionary is converted to URL encoding automatically, so urlencode() is not needed
response = requests.get("http://www.baidu.com/s?", params=kw, headers=headers)

# View the response body; response.text returns data in Unicode format
print(response.text)

# View the response body; response.content returns byte-stream data
print(response.content)

# View the full URL
print(response.url)

# View the response character encoding
print(response.encoding)

# View the response status code
print(response.status_code)

Run results:

......

......

'http://www.baidu.com/s?wd=%E9%95%BF%E5%9F%8E'

'utf-8'

200

When using response.text, Requests automatically decodes the response content based on the HTTP response's text encoding, and most Unicode character sets are decoded seamlessly.
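
If that guess is ever wrong and response.text comes out garbled, the encoding can be overridden by hand before reading the text. A small sketch, assuming the page is actually UTF-8:

import requests

response = requests.get("http://www.baidu.com/")

# Requests guesses the encoding from the HTTP headers; if the guess is
# wrong, set it explicitly before touching response.text
# (assumption: this page really is UTF-8)
response.encoding = 'utf-8'
print(response.text)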

When using response.content, what is returned is the raw binary byte stream of the server's response, which can be used to save binary files such as images, as in the sketch below.
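
A minimal sketch of saving an image that way (the image URL is only a placeholder):

import requests

# The image URL is only a placeholder for illustration
response = requests.get("http://www.baidu.com/img/bd_logo1.png")

# response.content is the raw byte stream, so write it in binary mode
with open("logo.png", "wb") as f:
    f.write(response.content)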

Basic POST request (data parameter)

1. The most basic POST request can use the post method directly:

response = requests.post("http://www.baidu.com/", data=data)
2. Passing in data

For POST requests, we usually need to add some parameters. The most basic way to pass them is through the data parameter.

import requests

formdata = {
    "type": "AUTO",
    "i": "i love python",
    "doctype": "json",
    "xmlVersion": "1.8",
    "keyfrom": "fanyi.web",
    "ue": "UTF-8",
    "action": "FY_BY_ENTER",
    "typoResult": "true"
}

url = "http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule&smartresult=ugc&sessionFrom=null"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}

response = requests.post(url, data=formdata, headers=headers)

print(response.text)

# If the response is JSON, it can be displayed directly
print(response.json())

Run results:

{"type": "EN2ZH_CN", "errorCode": 0, "elapsedTime": 2, "translateResult": [[{"src": "i love python", "tgt": "我喜欢python"}]], "smartResult": {"type": 1, "entries": ["", "肆文", "高德纳"]}}

{u'errorCode': 0, u'elapsedTime': 0, u'translateResult': [[{u'src': u'i love python', u'tgt': u'\u6211\u559c\u6b22python'}]], u'smartResult': {u'type': 1, u'entries': [u'', u'\u8086\u6587', u'\u9ad8\u5fb7\u7eb3']}, u'type': u'EN2ZH_CN'}
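
Since response.json() parses the body into an ordinary Python dict, individual fields can be read directly. A small sketch, continuing from the response above and following the result structure shown in the run results:

# Continuing from the response above; the key paths follow the
# result structure shown in the run results
result = response.json()
print(result["translateResult"][0][0]["tgt"])  # the translated text
print(result["errorCode"])                     # 0 means success
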
Proxies (proxies parameter)

If you need to use a proxy, you can configure individual requests by passing the proxies parameter to any request method:

import requests

# Choose a different proxy depending on the protocol type
proxies = {
    "http": "http://12.34.56.79:9527",
    "https": "http://12.34.56.79:9527",
}

response = requests.get("http://www.baidu.com", proxies=proxies)
print(response.text)

You can also configure proxies through the local environment variables HTTP_PROXY and HTTPS_PROXY:

export HTTP_PROXY="http://12.34.56.79:9527"
export HTTPS_PROXY="https://12.34.56.79:9527"
Private proxy authentication (specific format) and web client authentication (auth parameter)

Private proxy
import requests

# If the proxy needs HTTP Basic Auth, use the following format:
proxy = {"http": "mr_mao_hacker:sffqry9r@61.158.163.130:16816"}

response = requests.get("http://www.baidu.com", proxies=proxy)

print(response.text)
Web client authentication

If web client authentication is required, add auth = (username, password):

import requests

auth = ('test', '123456')

response = requests.get('http://192.168.199.107', auth=auth)

print(response.text)

urllib2 runs off in tears... Cookies and Sessions

Cookies

If a response contains cookies, we can get them from the response's cookies attribute:

import requests

response = requests.get("http://www.baidu.com/")

# Returns a CookieJar object:
cookiejar = response.cookies

# Convert the CookieJar into a dictionary:
cookiedict = requests.utils.dict_from_cookiejar(cookiejar)

print(cookiejar)

print(cookiedict)

Run results:

<RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>

{'BDORZ': '27315'}
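
Cookies can also be sent with a request: every request method accepts a cookies parameter that takes a dictionary (or a CookieJar). A minimal sketch, reusing the BDORZ cookie from the result above purely as an illustration:

import requests

# The cookie value here just mirrors the run result above, purely for illustration
cookies = {"BDORZ": "27315"}

response = requests.get("http://www.baidu.com/", cookies=cookies)
print(response.status_code)
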
Sessions

In Requests, the Session object is a very commonly used object that represents a user session: from the moment the client browser connects to the server until the client browser disconnects.

Sessions let us persist certain parameters across requests, for example keeping cookies between all requests issued by the same Session instance.

Implementing a Renren login

import requests

# 1. Create a Session object, which can hold cookie values
ssion = requests.session()

# 2. Prepare the headers
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}

# 3. The username and password needed to log in
data = {"email": "mr_mao_hacker@163.com", "password": "alarmchime"}

# 4. Send a request with the username and password; the post-login cookie values are saved in ssion
ssion.post("http://www.renren.com/PLogin.do", data=data)

# 5. ssion now holds the user's post-login cookies, so it can directly access pages that require login
response = ssion.get("http://www.renren.com/410043129/profile")

# 6. Print the response content
print(response.text)

Handling HTTPS requests: SSL certificate verification

Requests can also verify SSL certificates for HTTPS requests. To check a host's SSL certificate, use the verify parameter (it can also be omitted):

import requests

response = requests.get("https://www.baidu.com/", verify=True)

# verify can also be omitted; it defaults to True
# response = requests.get("https://www.baidu.com/")
print(response.text)

Run Result:

<!DOCTYPE html>
<!--STATUS OK-->
If SSL certificate verification fails, or the server's security certificate is not trusted, an SSLError is raised; 12306's certificate is said to be self-made:

To test:

import requests

response = requests.get("https://www.12306.cn/mormhweb/")
print(response.text)

Sure enough:

SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)

If we want to skip 12306's certificate verification and make the request anyway, set verify to False:

r = requests.get("https://www.12306.cn/mormhweb/", verify=False)
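
One caveat: with verify=False, recent versions of Requests emit an InsecureRequestWarning on every request. It can be silenced like this (fine for a quick test, not recommended in production):

import requests

# Silence the InsecureRequestWarning triggered by verify=False
# (acceptable for a quick test, not recommended in production)
requests.packages.urllib3.disable_warnings()

r = requests.get("https://www.12306.cn/mormhweb/", verify=False)
print(r.status_code)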
