1. Send the request:
import requests
# Get data
# r is the Response object; it holds the content returned by the request
r = requests.get('https://github.com/timeline.json')
print(r.content)
Printed result:
b'{"message":"Hello there, wayfaring stranger. If you\xe2\x80\x99re reading this then you probably didn\xe2\x80\x99t see our blog post a couple of years back announcing that this API would go away: GitHub API v2: End of Life Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"Events | GitHub Developer Guide"}'
Four more ways to send a request correspond to four other methods in the HTTP protocol:
r = requests.put("http://httpbin.org/put")
r = requests.delete("http://httpbin.org/delete")
r = requests.head("http://httpbin.org/get")
r = requests.options("http://httpbin.org/get")
2. Passing URL parameters
Both examples below pass parameters in the URL query string through the params argument, given here as a dictionary (a list of tuples also works).
import requests
payload1 = {'key1': 'value1', 'key2': 'value2'}
r1 = requests.get("http://httpbin.org/get", params=payload1)
print(r1.url)
# A list value sends the same key several times
payload2 = {'key1': 'value1', 'key2': ['value2', 'value3']}
r2 = requests.get('http://httpbin.org/get', params=payload2)
print(r2.url)
Corresponding results:
http://httpbin.org/get?key1=value1&key2=value2
http://httpbin.org/get?key1=value1&key2=value2&key2=value3
Note the difference: a list value repeats the key once for each item.
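One way to check how params ends up in the URL, without sending anything, is to prepare the request locally. The following is a minimal sketch: the helper name build_url is made up for illustration, and it relies on requests.Request(...).prepare(), which builds the request but does not touch the network.

```python
import requests


def build_url(base_url, params):
    """Return the URL requests would use, without sending the request."""
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


single = build_url("http://httpbin.org/get", {"key1": "value1", "key2": "value2"})
multi = build_url("http://httpbin.org/get", {"key1": "value1", "key2": ["value2", "value3"]})
# A key whose value is None is left out of the query string
skipped = build_url("http://httpbin.org/get", {"key1": "value1", "key2": None})

print(single)
print(multi)
print(skipped)
```

Preparing a request this way is handy in tests, since the URL can be inspected before any traffic is generated.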
3. Response Content
import requests
r = requests.get('https://github.com/timeline.json')
# Get the response body as text
print(r.text)
# Get the content encoding
print(r.encoding)
# Change the encoding; later accesses to r.text decode with the new encoding
r.encoding = 'iso-8859-1'
Note that the same bytes can decode to different symbols under a different encoding.
Getting the content as binary:
from io import BytesIO
i = BytesIO(r.content)
Decoding the content as JSON:
print(r.json())
Note: A successful call to r.json() does not mean the request succeeded. Some servers return a JSON object in a failed response (for example, error details with HTTP 500), and that JSON is decoded and returned all the same. To check whether the request succeeded, call r.raise_for_status() or check that r.status_code is what you expect.
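The note above can be illustrated offline with a hand-built response. This is only a sketch under stated assumptions: fetch_json is a hypothetical helper, and the failed response is constructed by setting the private _content attribute, which is a testing shortcut rather than a public API.

```python
import requests


def fetch_json(response):
    """Return the decoded JSON body only if the status code signals success."""
    response.raise_for_status()  # raises requests.exceptions.HTTPError for 4xx/5xx
    return response.json()


# Hand-built failed response, for illustration only
failed = requests.Response()
failed.status_code = 500
failed._content = b'{"error": "internal failure"}'

# The body still decodes, even though the request failed
decoded_anyway = failed.json()

try:
    fetch_json(failed)
    outcome = "ok"
except requests.exceptions.HTTPError:
    outcome = "http error"

print(decoded_anyway, outcome)
```

Checking the status before decoding keeps error payloads from being mistaken for real data.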
Raw response content
What is the raw content? It is the response as read from the socket connection between client and server. To access it, set stream=True in the request; r.raw is then a urllib3 response object.
r = requests.get('https://github.com/timeline.json', stream=True)
# Read the first 100 bytes from the stream
r.raw.read(100)
However, if you want to save the returned data to a file, you should generally stream it as follows (filename and chunk_size are placeholders you define yourself):
with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)
In other words, use Response.iter_content in place of r.raw.
4. Customizing the request Header
url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1'}
# Put simply, the headers are just passed along with the URL
r = requests.get(url, headers=headers)
Points to note:
Custom headers have lower priority than some more specific sources of information. For example:
If credentials are set in .netrc, an Authorization header set with headers= does not take effect. If the auth= parameter is set, the .netrc credentials are in turn ignored.
If the request is redirected to another host, the Authorization header is removed.
The Proxy-Authorization header is overridden by proxy credentials provided in the URL.
The Content-Length header is overridden whenever the length of the content can be determined.
Furthermore, requests does not change its behavior based on which custom headers are specified. The headers are simply passed on in the final request.
Note: All header values must be a string, bytestring, or unicode.
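The Content-Length point can be observed locally. In this sketch the bogus Content-Length value of "999" is an assumption chosen for illustration; the request is only prepared, never sent.

```python
import requests

request = requests.Request(
    "POST",
    "http://httpbin.org/post",
    headers={"user-agent": "my-app/0.0.1", "Content-Length": "999"},
    data={"key1": "value1"},
)
prepared = request.prepare()  # builds the request locally; nothing is sent

# Header names are matched case-insensitively
user_agent = prepared.headers["User-Agent"]
# requests recomputes Content-Length from the encoded body
content_length = prepared.headers["Content-Length"]
body_length = len(prepared.body)

print(user_agent, content_length, body_length)
```

The custom User-Agent survives untouched, while the hand-written Content-Length is replaced by the real body length.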
5. More complex POST requests
import json
import requests
# Pass a tuple of (key, value) pairs; useful when one key has several values
payload1 = (('key1', 'value1'), ('key1', 'value2'))
r1 = requests.post('http://httpbin.org/post', data=payload1)
# Pass a dictionary; it is form-encoded
payload2 = {'key1': 'value1', 'key2': 'value2'}
r2 = requests.post("http://httpbin.org/post", data=payload2)
# Pass a JSON string that you encode yourself
url1 = 'https://api.github.com/some/endpoint'
payload3 = {'some': 'data'}
r3 = requests.post(url1, data=json.dumps(payload3))
# Pass a dictionary via json= and let requests encode it
url2 = 'https://api.github.com/some/endpoint'
payload4 = {'some': 'data'}
r4 = requests.post(url2, json=payload4)
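To see how these variants differ on the wire, the requests can be prepared locally and inspected. This is a rough sketch; prepare_post is a hypothetical helper, and nothing is sent to httpbin.org.

```python
import json

import requests

URL = "http://httpbin.org/post"


def prepare_post(**kwargs):
    """Build a POST request locally (nothing is sent) and return it."""
    return requests.Request("POST", URL, **kwargs).prepare()


form_tuples = prepare_post(data=(("key1", "value1"), ("key1", "value2")))
json_string = prepare_post(data=json.dumps({"some": "data"}))
json_object = prepare_post(json={"some": "data"})

for name, prepared in [
    ("tuples", form_tuples),
    ("json string", json_string),
    ("json=", json_object),
]:
    # Show which Content-Type (if any) requests chose, and the encoded body
    print(name, prepared.headers.get("Content-Type"), prepared.body)
```

Note that a pre-encoded JSON string passed through data= gets no Content-Type header, so you would set one yourself, whereas json= sets application/json automatically.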
6. File transfer
import requests
url = 'http://httpbin.org/post'
# Option 1: upload a file opened in binary mode
# files = {'file': open('report.xls', 'rb')}
# Option 2: explicitly set the filename, content type and extra headers
# files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
# Option 3: send a string as a file
files = {'file': ('report.xls', 'some,data,to,send\nanother,row,to,send\n')}
r = requests.post(url, files=files)
print(r.text)
The response printed is for the third approach (sending a string as a file).
Note: The official documentation recommends requests-toolbelt for streaming very large multipart uploads. We will come back to this later.
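What the files= argument produces can be checked without a server by preparing the request. This is a minimal sketch using the same string-as-file payload; the request is never sent.

```python
import requests

files = {"file": ("report.xls", "some,data,to,send\nanother,row,to,send\n")}
prepared = requests.Request("POST", "http://httpbin.org/post", files=files).prepare()

# requests switches to multipart/form-data and generates a boundary
content_type = prepared.headers["Content-Type"]
# The body is the encoded multipart payload, as bytes
body = prepared.body

print(content_type)
print(body.decode("utf-8"))
```

The printed body shows the part headers (field name and filename) followed by the file content.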
7. Response Status Code
import requests
r = requests.get('http://httpbin.org/get')
print(r.status_code)
# requests.codes is a lookup object for status codes
print(r.status_code == requests.codes.ok)
bad_r = requests.get('http://httpbin.org/status/404')
print(bad_r.status_code)
# When the request failed (a 4xx or 5xx status), raise_for_status() raises the corresponding exception
bad_r.raise_for_status()
Execution result: assuming httpbin.org responds normally, this prints 200, True and 404, and the final call raises requests.exceptions.HTTPError.
8. Response header
import requests
r = requests.get('http://httpbin.org/get')
print(r.status_code)
# Get the response headers; they behave like a dictionary with case-insensitive keys
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))
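The dictionary-like behavior of r.headers comes from requests.structures.CaseInsensitiveDict. The sketch below builds one by hand as a stand-in for real response headers (the header values are made up), so it runs without network access.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for r.headers, built by hand for illustration
headers = CaseInsensitiveDict({"Content-Type": "application/json", "Content-Length": "42"})

by_exact_name = headers["Content-Type"]
by_lower_name = headers["content-type"]  # same entry, different casing
missing = headers.get("x-not-there")     # .get() returns None instead of raising KeyError

print(by_exact_name, by_lower_name, missing)
```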
9. Cookies
import requests
url = 'http://example.com/some/cookie/setting/url'
r = requests.get(url)
# Read a cookie returned by the response
r.cookies['example_cookie_name']
url = 'http://httpbin.org/cookies'
# Send your own cookies with the request; this is often used after a simulated login
cookies = dict(cookies_are='working')  # example value
r = requests.get(url, cookies=cookies)
r.text
# Cookies are returned in a RequestsCookieJar, which behaves like a dictionary and is suitable for use across multiple domains or paths
# Honestly, is this really cross-domain? It is more like simulating a session to skip the login
jar = requests.cookies.RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
url = 'http://httpbin.org/cookies'
r = requests.get(url, cookies=jar)
r.text
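Which cookies from a jar actually get sent depends on the domain and path of the URL. The sketch below checks this locally by preparing requests and reading the resulting Cookie header; cookie_header_for is a hypothetical helper and example.com is used only as a non-matching host.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set("tasty_cookie", "yum", domain="httpbin.org", path="/cookies")
jar.set("gross_cookie", "blech", domain="httpbin.org", path="/elsewhere")


def cookie_header_for(url):
    """Return the Cookie header requests would send to url, or None."""
    prepared = requests.Request("GET", url, cookies=jar).prepare()
    return prepared.headers.get("Cookie")


for_cookies = cookie_header_for("http://httpbin.org/cookies")
for_elsewhere = cookie_header_for("http://httpbin.org/elsewhere")
for_other_host = cookie_header_for("http://example.com/cookies")

print(for_cookies, for_elsewhere, for_other_host)
```

Only the cookie whose domain and path match the URL is attached, which is why the request to /cookies carries tasty_cookie but not gross_cookie.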
10. Redirection and request history
By default, requests automatically handles redirects for all methods except HEAD. You can use the history attribute of the response object to track redirection.
What is redirection? You request address A but are automatically sent to address B.
In the following example, the returned 301 status means a permanent redirect. Don't dwell on it too much; just remember it.
You may wonder why a request to a single address gets redirected at all. Here the server itself answers the plain HTTP request with a 301 that points to the HTTPS URL. DNS only resolves the domain name to the server's address; it does not perform the redirect.
Response.history is a list of the Response objects that were created in order to complete the request. The list is sorted from the oldest to the most recent response.
r = requests.get('http://github.com')
print(r.url)
print(r.history)
Disabling redirection:
With GET, OPTIONS, POST, PUT, PATCH, or DELETE, you can disable redirect handling with the allow_redirects parameter:
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code)
print(r.history)
Enabling redirection with HEAD:
r = requests.head('http://github.com', allow_redirects=True)
print(r.history)
11. Timeout
r = requests.get('http://github.com', timeout=0.001)
The timeout parameter is very useful. Without it, a request that never gets an answer blocks the program indefinitely. A single timeout value applies to establishing the connection and to each wait for data from the server; it is not a limit on the total download time of the response body. An exception is raised if the server does not answer within timeout seconds (more precisely, if no bytes are received on the underlying socket for timeout seconds).
12. Errors and exceptions
requests raises a ConnectionError exception when it runs into a network problem such as a DNS failure or a refused connection.
If an HTTP request returns an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.
If the request times out, a Timeout exception is raised.
If the request exceeds the configured maximum number of redirects, a TooManyRedirects exception is raised.
All exceptions that requests explicitly raises inherit from requests.exceptions.RequestException.
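The hierarchy above suggests catching the specific exceptions first and RequestException last. The following is a minimal sketch: safe_get is a hypothetical helper, the 5-second default timeout is an arbitrary choice, and the demo call uses a URL without a scheme so that it fails locally without any network access.

```python
import requests
from requests import exceptions


def safe_get(url, timeout=5):
    """Return (response, None) on success or (None, description) on failure."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response, None
    except exceptions.Timeout as err:
        return None, f"timed out: {err}"
    except exceptions.TooManyRedirects as err:
        return None, f"too many redirects: {err}"
    except exceptions.HTTPError as err:
        return None, f"bad status: {err}"
    except exceptions.ConnectionError as err:
        return None, f"network problem: {err}"
    except exceptions.RequestException as err:
        # Base class: catches anything else requests raises, e.g. an invalid URL
        return None, f"request failed: {err}"


# A URL without a scheme is rejected before any connection is attempted
response, error = safe_get("not-a-url")
print(response, error)
```

Because every branch derives from RequestException, a caller that does not care about the distinction could catch only that base class.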
By now we have a basic understanding of requests. Tomorrow we will go further into advanced requests techniques.
I just hope that the company's new colleagues, Niu Mei, can take a moment to read this carefully, run the code, and see what it does.
Python crawler series (ii): Requests Basics