Python Requests Quick Start

Quick Start

Eager to get started? This section gives a good introduction to Requests. It assumes that Requests is already installed; if it is not, go to the installation section first.

Before you begin, make sure that:

Requests is installed

Requests is up to date
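
One way to check the first point from Python itself is to print the installed version, which can then be compared by hand with the latest published release. This is only a minimal sketch: it reports what is installed and does not look up newer releases.

```python
import requests

# __version__ holds the version string of the installed Requests package
installed_version = requests.__version__
print(installed_version)
```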

Let's start with some simple examples.

Make a request

Making a request with Requests is very simple.

Begin by importing the Requests module:

>>> import requests

Now, let's try to get a webpage. For this example, let's get GitHub's public timeline:

>>> r = requests.get('https://github.com/timeline.json')

Now we have a Response object named r. We can get all the information we want from this object.

Requests' simple API means that every kind of HTTP request is just as obvious. For example, this is how you make an HTTP POST request:

>>> r = requests.post("http://httpbin.org/post")

Nice, right? What about the other HTTP request types: PUT, DELETE, HEAD, and OPTIONS? These are all just as simple:

>>> r = requests.put("http://httpbin.org/put")
>>> r = requests.delete("http://httpbin.org/delete")
>>> r = requests.head("http://httpbin.org/get")
>>> r = requests.options("http://httpbin.org/get")

That's all well and good, but it is only the tip of the iceberg of what Requests can do.
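
A minimal sketch of how uniform the API is: every helper corresponds to an HTTP method name that can also be given as a string. To stay offline, the example builds the requests with requests.Request(...).prepare() instead of sending them; the httpbin.org/anything URL is only an illustrative target and not part of the text above.

```python
import requests

methods = ["GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS"]

# Prepare (but do not send) one request per HTTP method
prepared = [
    requests.Request(method, "http://httpbin.org/anything").prepare()
    for method in methods
]

for request in prepared:
    print(request.method, request.url)
```

Actually sending one of these would be a call such as requests.request("PUT", url), which is what requests.put and the other helpers wrap.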

Pass URL parameters

You often want to send some data in the URL's query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, for example httpbin.org/get?key=val. Requests lets you provide these arguments as a dictionary of strings, using the params keyword argument. For example, to pass key1=value1 and key2=value2 to httpbin.org/get, you would use the following code:

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get("http://httpbin.org/get", params=payload)

Printing the URL shows that it has been correctly encoded:

>>> print(r.url)
http://httpbin.org/get?key2=value2&key1=value1

Note that any dictionary key whose value is None is not added to the URL's query string.

You can also pass a list of items as a value:

>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2&key2=value3
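
The encoding rules above can be checked without any network access. The sketch below prepares a request instead of sending it and inspects the resulting URL; the "skipped" key is a made-up name used only to show that None values are dropped.

```python
import requests

payload = {
    "key1": "value1",
    "key2": ["value2", "value3"],  # a list yields one key2=... pair per item
    "skipped": None,               # None values are left out of the query string
}

# Build the request without sending it, then look at the encoded URL
prepared = requests.Request(
    "GET", "http://httpbin.org/get", params=payload
).prepare()

print(prepared.url)
```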

Response content

We can read the server response content. Take the GitHub timeline as an example:

>>> import requests
>>> r = requests.get('https://github.com/timeline.json')
>>> r.text
u'[{"repository":{"open_issues":0,"url":"https://github.com/...

Requests automatically decodes content from the server. Most Unicode charsets are decoded seamlessly.

When you make a request, Requests makes an educated guess about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, with the r.encoding property:

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

If you change the encoding, Requests will use the new value of r.encoding whenever you access r.text. You might want to do this whenever special logic is needed to work out the encoding of the content. For example, HTML and XML can specify their encoding in the body. In such cases, use r.content to find the encoding, and then set r.encoding. This lets you read r.text with the correct encoding.

Requests will also use custom encodings if you need them. If you have created your own encoding and registered it with the codecs module, you can simply use the codec name as the value of r.encoding and Requests will handle the decoding for you.
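
To see the effect of r.encoding without a server, the sketch below builds a Response by hand and fills in its body through the private _content attribute. That is an offline shortcut for illustration only; in normal code the Response comes from requests.get and the body is never set manually.

```python
import requests

# Hand-built response whose body is the UTF-8 encoding of "café"
# (setting the private _content attribute is a demo-only shortcut)
response = requests.models.Response()
response.status_code = 200
response._content = "café".encode("utf-8")

# Decoding with the wrong encoding garbles the accented character
response.encoding = "ISO-8859-1"
wrong = response.text

# r.text is re-decoded on every access, so fixing the encoding fixes the text
response.encoding = "utf-8"
right = response.text

print(repr(wrong), repr(right))
```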

Binary response content

You can also access the response body as bytes, for non-text requests:

>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...

The gzip and deflate transfer-encodings are automatically decoded for you.

For example, to create an image from binary data returned by a request, you can use the following code:

>>> from PIL import Image
>>> from io import BytesIO
>>> i = Image.open(BytesIO(r.content))

JSON response content

Requests also has a built-in JSON decoder to help you process JSON data:

>>> import requests
>>> r = requests.get('https://github.com/timeline.json')
>>> r.json()
[{u'repository': {u'open_issues': 0, u'url': 'https://github.com/...

If JSON decoding fails, r.json() raises an exception. For example, if the response is a 401 (Unauthorized) whose body is not JSON, calling r.json() raises ValueError: No JSON object could be decoded.

Note that a successful call to r.json() does not mean the response itself was successful. Some servers return a JSON object in a failed response (for example, error details with an HTTP 500). Such JSON will be decoded and returned. To check that a request succeeded, use r.raise_for_status() or check that r.status_code is what you expect.
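
Both points can be sketched offline. The helper build_response below is hypothetical and not part of Requests: it fabricates a Response by setting status_code and the private _content attribute, purely so the example runs without a server.

```python
import requests


def build_response(status_code, body):
    """Fabricate a Response for demonstration (hypothetical helper)."""
    response = requests.models.Response()
    response.status_code = status_code
    response._content = body  # private attribute, set here only for the demo
    return response


# A failed response can still carry valid JSON ...
error = build_response(500, b'{"error": "database unavailable"}')
details = error.json()

# ... so success has to be checked separately
try:
    error.raise_for_status()
    request_failed = False
except requests.exceptions.HTTPError:
    request_failed = True

# A body that is not JSON makes json() raise a ValueError (or a subclass of it)
not_json = build_response(401, b"Unauthorized")
try:
    not_json.json()
    decode_failed = False
except ValueError:
    decode_failed = True

print(details, request_failed, decode_failed)
```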

Raw response content

In the rare case that you want the raw socket response from the server, you can access r.raw. If you want to do this, make sure you set stream=True in your initial request. Once you do, you can do this:

>>> r = requests.get('https://github.com/timeline.json', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'

In general, however, you should use a pattern like this to save what is being streamed to a file:

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)

Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly. When streaming a download, the above is the preferred and recommended way to retrieve the content. Note that chunk_size can be freely adjusted to a number that better fits your use case.
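
A runnable variant of the pattern above, under stated assumptions: instead of a real streamed download, an in-memory io.BytesIO object stands in for r.raw, and the output goes to a temporary file. With a live request you would call requests.get(url, stream=True) and iterate in exactly the same way.

```python
import io
import os
import tempfile

import requests

# Stand-in for a streamed download: 2500 bytes served from memory
response = requests.models.Response()
response.status_code = 200
response.raw = io.BytesIO(b"x" * 2500)

chunk_sizes = []
with tempfile.NamedTemporaryFile("wb", delete=False) as fd:
    # Read the body in pieces of at most 1024 bytes and write each piece out
    for chunk in response.iter_content(chunk_size=1024):
        chunk_sizes.append(len(chunk))
        fd.write(chunk)
    path = fd.name

saved_bytes = os.path.getsize(path)
os.remove(path)

print(chunk_sizes, saved_bytes)
```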

Custom request headers

If you want to add HTTP headers to a request, simply pass a dict to the headers parameter.

For example, we did not specify a user-agent in the previous examples:

>>> url = 'https://api.github.com/some/endpoint'
>>> headers = {'user-agent': 'my-app/0.0.1'}
>>> r = requests.get(url, headers=headers)

Note: custom headers are given less precedence than more specific sources of information. For instance:

Authorization headers set with headers= are overridden if credentials are specified in .netrc, which in turn are overridden by the auth= parameter.

Authorization headers are removed if you are redirected to another host.

Proxy-Authorization headers are overridden by proxy credentials provided in the URL.

Content-Length headers are overridden when the length of the content can be determined.

Furthermore, Requests does not change its behavior at all based on which custom headers are specified. The headers are simply passed on into the final request.

Note: all header values must be a string, bytestring, or unicode. While permitted, passing unicode header values is not recommended.
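
The following sketch inspects prepared (unsent) requests to illustrate two of the points above: a custom header is carried into the final request, and a Content-Length supplied by hand is replaced once the body length is known. The value '999' is a deliberately wrong number chosen for the demonstration.

```python
import requests

# A custom User-Agent is passed through to the final request as-is
get_request = requests.Request(
    "GET",
    "https://api.github.com/some/endpoint",
    headers={"user-agent": "my-app/0.0.1"},
).prepare()
user_agent = get_request.headers["User-Agent"]

# A hand-written Content-Length is overridden by the real body length
post_request = requests.Request(
    "POST",
    "http://httpbin.org/post",
    headers={"Content-Length": "999"},
    data="abc",
).prepare()
content_length = post_request.headers["Content-Length"]

print(user_agent, content_length)
```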

More complex POST requests

Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made:

>>> payload = {'key1': 'value1', 'key2': 'value2'}

>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}

You can also pass a list of tuples to the data argument. This is particularly useful when the form has multiple elements that use the same key:

>>> payload = (('key1', 'value1'), ('key1', 'value2'))
>>> r = requests.post('http://httpbin.org/post', data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key1": [
      "value1",
      "value2"
    ]
  },
  ...
}
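
What form encoding produces can be seen by preparing the requests without sending them. This is a minimal offline sketch; the request body shown is what would be sent to the server.

```python
import requests

url = "http://httpbin.org/post"

# A dict becomes a form-encoded body
form = requests.Request(
    "POST", url, data={"key1": "value1", "key2": "value2"}
).prepare()

# A list of tuples allows the same key to appear more than once
repeated = requests.Request(
    "POST", url, data=[("key1", "value1"), ("key1", "value2")]
).prepare()

print(form.headers["Content-Type"])
print(form.body)
print(repeated.body)
```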

There are times when you want to send data that is not form-encoded. If you pass in a string instead of a dict, that data is posted directly.

For example, the GitHub API v3 accepts JSON-encoded POST/PATCH data:

>>> import json
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, data=json.dumps(payload))

Instead of encoding the dict yourself, you can also pass it directly using the json parameter (added in version 2.4.2), and it will be encoded automatically:

>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)
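
The two approaches can be compared offline by preparing the requests. The sketch below suggests one practical difference, as observed in current releases rather than stated in the text above: json= also sets a Content-Type header, while a pre-encoded string passed through data= does not.

```python
import json

import requests

url = "https://api.github.com/some/endpoint"
payload = {"some": "data"}

# Encoding by hand: the string is sent as-is, with no Content-Type added
manual = requests.Request("POST", url, data=json.dumps(payload)).prepare()

# Letting Requests encode: the body is JSON and the header is set for you
automatic = requests.Request("POST", url, json=payload).prepare()

print(manual.headers.get("Content-Type"), manual.body)
print(automatic.headers.get("Content-Type"), automatic.body)
```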

POST a Multipart-Encoded file

Requests makes it simple to upload Multipart-encoded files:

>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

You can set the filename, content type, and headers explicitly:

>>> url = 'http://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

If you want, you can also send strings to be received as files:

>>> url = 'http://httpbin.org/post'
>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}

>>> r = requests.post(url, files=files)
>>> r.text
{
  ...
  "files": {
    "file": "some,data,to,send\\nanother,row,to,send\\n"
  },
  ...
}

If you are posting a very large file as a multipart/form-data request, you may want to stream the request. By default, Requests does not support this, but the separate third-party package requests-toolbelt does. See the toolbelt documentation for details on how to use it.

For sending multiple files in one request, see the advanced usage section.

Warning

It is strongly recommended that you open files in binary mode. This is because Requests may attempt to provide the Content-Length header for you, and if it does, this value will be set to the number of bytes in the file. Errors may occur if you open the file in text mode.
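
A small offline sketch of what a multipart upload looks like on the wire, using the string-as-file form so that no file needs to exist on disk. The request is prepared but never sent; it also shows the Content-Length header being filled in from the byte length of the body.

```python
import requests

files = {"file": ("report.csv", "some,data,to,send\nanother,row,to,send\n")}

prepared = requests.Request(
    "POST", "http://httpbin.org/post", files=files
).prepare()

# The content type carries a generated multipart boundary
content_type = prepared.headers["Content-Type"]

# Content-Length is computed from the encoded body, counted in bytes
content_length = prepared.headers["Content-Length"]

print(content_type)
print(content_length, len(prepared.body))
```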

Response status code

We can check the response status code:

>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200

Requests also comes with a built-in status code lookup object for easy reference:

>>> r.status_code == requests.codes.ok
True

If we made a bad request (a 4XX client error or a 5XX server error response), we can raise it with Response.raise_for_status():

>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404
>>> bad_r.raise_for_status()
Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error

But since the status_code of r in our example was 200, when we call raise_for_status() we get:

>>> r.raise_for_status()
None

All is well.
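
The lookup object can be explored without making a request. In this sketch the 200 response is fabricated by hand purely for illustration; the attribute names on requests.codes (ok, not_found) are the library's own.

```python
import requests

# Named status codes compare equal to their numeric values
ok_code = requests.codes.ok
not_found_code = requests.codes.not_found

# A hand-built 200 response, standing in for a real successful request
ok_response = requests.models.Response()
ok_response.status_code = 200

is_ok = ok_response.status_code == requests.codes.ok
result = ok_response.raise_for_status()  # nothing is raised for a 200

print(ok_code, not_found_code, is_ok, result)
```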

Response headers

We can view the server's response headers using a Python dictionary:

>>> r.headers
{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '148ms',
    'etag': '"e1ca502697e5c9317743dc078f67693f"',
    'content-type': 'application/json'
}

This dictionary is special, though: it is made just for HTTP headers. According to RFC 2616, HTTP header names are case-insensitive.

So, we can access the headers using any capitalization we want:

>>> r.headers['Content-Type']
'application/json'
>>> r.headers.get('content-type')
'application/json'

It is also special in that the server could have sent the same header multiple times with different values. Requests combines them so that they can be represented in the dictionary within a single mapping, as per RFC 7230:

A recipient MAY combine multiple header fields with the same field name into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field value to the combined field value in order, separated by a comma.

In other words, the recipient may merge multiple header fields with the same name into a single "field-name: field-value" pair by appending each subsequent value in order, separated by commas, without changing the meaning of the message.
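
The case-insensitive behavior comes from the dictionary type Requests uses for headers, requests.structures.CaseInsensitiveDict. A minimal sketch with a hand-written header, independent of any real response (it does not demonstrate the merging of repeated headers):

```python
from requests.structures import CaseInsensitiveDict

headers = CaseInsensitiveDict({"Content-Type": "application/json"})

# Lookups succeed regardless of how the name is capitalized
by_lower = headers["content-type"]
by_upper = headers.get("CONTENT-TYPE")

# The original spelling of the key is kept when iterating
stored_names = list(headers)

print(by_lower, by_upper, stored_names)
```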

Cookies

If a response contains cookies, you can quickly access them:

>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'

To send your own cookies to the server, you can use the cookies parameter:

>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')

>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'

Cookies are returned in a RequestsCookieJar, which acts like a dict but also offers a more complete interface, suitable for use over multiple domains or paths. Cookie jars can also be passed in to requests:

>>> jar = requests.cookies.RequestsCookieJar()
>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=jar)
>>> r.text
'{"cookies": {"tasty_cookie": "yum"}}'
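
Why only tasty_cookie reaches the server can be seen offline: when the request is prepared, the jar contributes only the cookies whose domain and path match the request URL. The sketch below prepares the request without sending it and reads the resulting Cookie header.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set("tasty_cookie", "yum", domain="httpbin.org", path="/cookies")
jar.set("gross_cookie", "blech", domain="httpbin.org", path="/elsewhere")

# Prepare (but do not send) a request for the /cookies path
prepared = requests.Request(
    "GET", "http://httpbin.org/cookies", cookies=jar
).prepare()

# Only the cookie whose path matches the URL ends up in the header
cookie_header = prepared.headers.get("Cookie")

# The jar itself still holds both cookies and supports dict-style access
jar_size = len(jar)
tasty = jar["tasty_cookie"]

print(cookie_header, jar_size, tasty)
```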

Redirection and request history

By default, Requests follows redirects for all verbs except HEAD.

You can use the history property of the Response object to track redirection.

Response.history is a list of the Response objects that were created in order to complete the request. The list is sorted from the oldest to the most recent response.

For example, GitHub redirects all HTTP requests to HTTPS:

>>> r = requests.get('http://github.com')
>>> r.url
'https://github.com/'
>>> r.status_code
200
>>> r.history
[<Response [301]>]

If you are using GET, OPTIONS, POST, PUT, PATCH, or DELETE, you can disable redirection handling with the allow_redirects parameter:

>>> r = requests.get('http://github.com', allow_redirects=False)
>>> r.status_code
301
>>> r.history
[]

If you are using HEAD, you can enable redirection as well:

>>> r = requests.head('http://github.com', allow_redirects=True)
>>> r.url
'https://github.com/'
>>> r.history
[<Response [301]>]

Timeouts

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter. If you do not, your program may hang indefinitely:

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)

Note:

timeout is not a time limit on the entire response download. Rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.
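
A sketch of handling a timeout in code, under stated assumptions: rather than contacting a real site, it opens a local socket (silent_server, a stand-in invented for this example) that accepts connections at the operating-system level but never answers, so the request is expected to give up after the timeout. It assumes the loopback interface is usable, and trust_env is turned off only so that proxy settings in the environment do not interfere with this local demo.

```python
import socket

import requests

# Local socket that lets clients connect but never sends a response
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
silent_server.listen(1)
port = silent_server.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this demo

try:
    session.get(f"http://127.0.0.1:{port}/", timeout=0.5)
    outcome = "response received"
except requests.exceptions.Timeout as exc:
    # ConnectTimeout and ReadTimeout are both subclasses of Timeout
    outcome = type(exc).__name__
finally:
    session.close()
    silent_server.close()

print(outcome)
```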

Errors and exceptions

In the event of a network problem (such as a DNS failure or a refused connection), Requests raises a ConnectionError exception.

If the HTTP request returned an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.

If the request times out, a Timeout exception is raised.

If the request exceeds the configured maximum number of redirects, a TooManyRedirects exception is raised.

All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.
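
Because of this shared base class, a single except clause for RequestException covers every case listed above. The sketch below only checks the class hierarchy, so it runs without any network access.

```python
import requests

specific_errors = [
    requests.exceptions.ConnectionError,
    requests.exceptions.HTTPError,
    requests.exceptions.Timeout,
    requests.exceptions.TooManyRedirects,
]

# Each specific exception is a subclass of the common base class
all_inherit = all(
    issubclass(error, requests.exceptions.RequestException)
    for error in specific_errors
)

print(all_inherit)
```

In application code this usually means catching the specific exceptions you can handle first and falling back to requests.exceptions.RequestException for everything else.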
