The Requests library provides many convenient methods. For example, fetching a web page is done with the get() method:
```python
import requests

r = requests.get('https://www.baidu.com')
print(type(r))
print(r.status_code)
print(type(r.text))
print(r.text)
print(r.cookies)
```
This is equivalent to urllib's urlopen(): we get a Response object and then print its type, its status code, the type of the response body, the body itself, and the cookies.
Requests also has methods for the other HTTP verbs, such as post(), put(), delete(), head(), and options(), each named after the type of request it sends.
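Each of these follows the same calling convention as get(). A minimal sketch against httpbin.org (network access assumed; the endpoints are httpbin's standard echo routes):

```python
import requests

# Each HTTP verb has a corresponding module-level method in requests.
r = requests.post('http://httpbin.org/post')      # POST request
r = requests.put('http://httpbin.org/put')        # PUT request
r = requests.delete('http://httpbin.org/delete')  # DELETE request
r = requests.head('http://httpbin.org/get')       # HEAD request (headers only, no body)
r = requests.options('http://httpbin.org/get')    # OPTIONS request
```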
Since GET is the most common request in HTTP, let's first build a GET request that carries parameters:
```python
import requests

data = {
    'name': 'germey',
    'age': 22
}
# You could also append "?name=germey&age=22" to the URL by hand,
# but building a data dictionary and passing it as params is much cleaner.
r = requests.get('http://httpbin.org/get', params=data)
print(r.text)
# The page returns a str in JSON format; r.json() converts it into a dict.
print(r.json())
```

Running it prints:

```
{
  "args": {
    "age": "22",
    "name": "germey"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Connection": "close",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.18.4"
  },
  "origin": "182.108.3.27",
  "url": "http://httpbin.org/get?name=germey&age=22"
}
{'args': {'age': '22', 'name': 'germey'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.18.4'}, 'origin': '182.108.3.27', 'url': 'http://httpbin.org/get?name=germey&age=22'}
```
Note, however, that if the response body is not in JSON format, calling json() raises a json.decoder.JSONDecodeError.
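For example, calling json() on a page that returns plain HTML triggers that error. One way to guard against it is a try/except around the call (a sketch, assuming network access; /html is httpbin's HTML sample page):

```python
import json
import requests

r = requests.get('http://httpbin.org/html')  # returns HTML, not JSON
try:
    data = r.json()
except json.decoder.JSONDecodeError:
    # The body was not valid JSON, so fall back to the raw text.
    data = r.text
```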
So far we have crawled a page, that is, an HTML file. To crawl pictures, audio, video, and other files, we need to fetch their binary content and save it as-is.
```python
import requests

r = requests.get("https://github.com/favicon.ico")
print(r.text)
print(r.content)
```
Running it, we can compare the two attributes. r.text is a str (Unicode): the English letters in the data still display as letters, but the rest of the binary content decodes into garbled characters. r.content is the raw binary content; the b prefix at the start of the printed output indicates that it is a bytes object.
Next we add code to save the file:
```python
with open('favicon.ico', 'wb') as f:
    f.write(r.content)
```
The first argument of open() is the file name, and the second, 'wb', opens the file for writing in binary mode. After running this, an icon named favicon.ico appears in the current folder.
One small doubt here: why not store the binary data as a .txt text file? My guess is that when reading and writing files, the computer converts to and from binary encoding anyway; giving the file the .ico extension tells the system what kind of file those bytes represent, whereas saving the string form as .txt would only preserve the text, not that file-type information.
Sometimes a website will refuse our requests. In that case it is often enough to add a User-Agent header, passed as requests.get(url, headers=headers).
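A minimal sketch of this (the User-Agent string below is just an illustrative browser value; httpbin echoes request headers back so we can confirm it was sent):

```python
import requests

# Pretend to be a regular browser by sending a custom User-Agent header.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
r = requests.get('http://httpbin.org/get', headers=headers)
# httpbin returns the request headers in its JSON body.
print(r.json()['headers']['User-Agent'])
```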
2. POST Request
```python
import requests

data = {
    'name': 'germy',
    'age': '22'
}
r = requests.post("https://httpbin.org/post", data=data)
print(r.text)
print(r.content)
```
```
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "age": "22",
    "name": "germy"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Connection": "close",
    "Content-Length": "17",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.18.4"
  },
  "json": null,
  "origin": "218.64.33.30",
  "url": "https://httpbin.org/post"
}
b'{\n  "args": {}, \n  "data": "", \n  "files": {}, \n  "form": {\n    "age": "22", \n    "name": "germy"\n  }, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Connection": "close", \n    "Content-Length": "17", \n    "Content-Type": "application/x-www-form-urlencoded", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.18.4"\n  }, \n  "json": null, \n  "origin": "218.64.33.30", \n  "url": "https://httpbin.org/post"\n}\n'
```
In the returned result, the form field contains the data we submitted, which confirms that the POST request succeeded.
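Because httpbin echoes the request back, we can also check this programmatically instead of by eye (a sketch, assuming network access):

```python
import requests

data = {'name': 'germy', 'age': '22'}
r = requests.post('https://httpbin.org/post', data=data)
# httpbin places the submitted form fields under the "form" key of its JSON body.
assert r.json()['form'] == data
print('form data echoed back correctly')
```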
You can read the status code via r.status_code, and the Response object also exposes history, url, cookies, and headers attributes.
```python
import requests

r = requests.get("http://www.jianshu.com")
if not r.status_code == requests.codes.ok:
    exit()
else:
    print("Request successfully")
```
This prints "Request successfully".
Here requests.codes.ok is the named condition code corresponding to status 200, but it is far from the only one: requests.codes defines many other named status codes as well.
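A few of the other names from requests' status-code lookup table, as a sketch:

```python
import requests

# requests.codes maps human-readable names to numeric status codes.
print(requests.codes.ok)                     # 200
print(requests.codes.not_found)              # 404
print(requests.codes.forbidden)              # 403
print(requests.codes.internal_server_error)  # 500
```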
Python3 Web Crawler Learning: Using requests (1)