Dataquest is a good website: its courses are short and focused, well suited to helping beginners get started quickly and build a sound knowledge base.
I bought a monthly Dataquest subscription on Taobao; one month costs only 90 yuan. The link is: https://item.taobao.com/item.htm?id=564528265057
This article is my notes from studying Working with APIs and Web Scraping.
A website's API lives on the server. When a client sends a request to the API, the server responds with data, usually in JSON format. In Python, we generally use the requests library to make these requests.
After a request is sent, the server returns a status code. The common status codes are:
200 - Everything went well, and the server returned the result (if any).
301 - The server is redirecting you to a different endpoint. This can happen when a company switches domain names or an endpoint name is changed.
401 - The server thinks you are not authenticated. This happens when you do not send the correct credentials to access an API (authentication is covered later in this article).
400 - The server thinks you made a bad request. This can happen when you do not send the information the API needs to process the request.
403 - The resource you are trying to access is forbidden; you do not have the correct permissions to view it.
404 - The server did not find the resource you are trying to access.
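As a small illustration of the list above, the sketch below uses the standard library's http.HTTPStatus to turn a numeric code into its standard phrase. The helper names describe_status and is_client_error are made up for this example and are not part of requests.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short label such as '404 Not Found' for a status code."""
    status = HTTPStatus(status_code)
    return f"{status.value} {status.phrase}"


def is_client_error(status_code):
    """4xx codes mean the problem is on the request side."""
    return 400 <= status_code < 500


# Print the codes discussed in the list above.
for code in (200, 301, 400, 401, 403, 404):
    print(describe_status(code), "| client error:", is_client_error(code))
```

With requests, the value to pass in would be response.status_code.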
Some API endpoints require parameters, and the server returns a 400 if the required parameters are missing.
There are two ways to pass parameters from the client:
* the params argument of requests.get
* directly in the URL, as a query string
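The two approaches end up sending the same URL, because requests encodes the params dictionary into a query string for you. The sketch below reproduces roughly that step with the standard library's urlencode, just to make the equivalence visible; build_url is a hypothetical helper, not a requests function.

```python
from urllib.parse import urlencode

base_url = "http://api.open-notify.org/iss-pass.json"
params = {"lat": 29.35, "lon": 106.33}


def build_url(base, query):
    """Append a dictionary of parameters to a URL as a query string."""
    return f"{base}?{urlencode(query)}"


full_url = build_url(base_url, params)
print(full_url)
```

Passing a dictionary is usually easier to read and change than editing the URL string by hand.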
If the data returned by the backend is JSON, it can be parsed in two ways:
* response.json()
* the json module: json.loads(string) converts a JSON string into a Python object, and json.dumps(dict) converts a Python object back into a JSON string
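A minimal round trip with the json module looks like the sketch below. The JSON string is a made-up sample for illustration, not a real server response.

```python
import json

# Hypothetical response body, as the text a server might send back.
raw = '{"message": "success", "position": {"latitude": "29.35", "longitude": "106.33"}}'

# loads: JSON string -> Python dict
data = json.loads(raw)
print(data["position"]["latitude"])

# dumps: Python dict -> JSON string
text = json.dumps(data, sort_keys=True)
print(text)
```

response.json() does the loads step for you on the body of a response.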
The backend also returns header data, which is available through response.headers. It behaves like a dictionary, for example: headers['Content-type']
import requests
import json

# response = requests.get("http://api.open-notify.org/iss-now.json")

# This endpoint needs parameters: a dictionary containing the latitude and longitude
params = {
    'lat': 29.35,
    'lon': 106.33,
}

# Passing the parameters with params
response = requests.get("http://api.open-notify.org/iss-pass.json", params=params)

# Passing the parameters in the URL
response = requests.get("http://api.open-notify.org/iss-pass.json?lat=29.35&lon=106.33")

content = response.content.decode()
status_code = response.status_code

# Use json.loads to convert the server's JSON data into a dict
# content_dict = json.loads(content)

# Or get it directly with response.json()
content_dict = response.json()
print(content_dict)

# The headers describe how the data was generated and how to decode it; headers is a dictionary
headers = response.headers
print(headers['Content-type'])
Some APIs do not require authentication, but more often, accessing an API requires it.
APIs also use rate limiting, which prevents a client from sending requests too quickly.
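One way to stay under a rate limit is to read the quota the server reports. GitHub's API, for instance, sends the remaining quota in the X-RateLimit-Remaining header and the reset time (in Unix seconds) in X-RateLimit-Reset. The sketch below is only an illustration: seconds_until_reset is a hypothetical helper, and the sample headers are invented.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before sending the next request."""
    # Header values arrive as strings, so convert them to integers.
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    now = time.time() if now is None else now
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    return max(0, reset_at - now)


# Invented sample values: quota used up, reset 10 seconds after "now".
sample = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}
print(seconds_until_reset(sample, now=990))
```

With requests you could pass response.headers directly; its header lookup is case-insensitive, while the plain dictionary used here needs the exact key.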
This article will do a walkthrough using GitHub's API.
--Authorization--access token
GitHub uses access tokens. An access token is a string that can be generated on the GitHub website.
We can put the access token in the Authorization header and pass that header along with the request.
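Rather than hard-coding the token in a script, it can be read from the environment. The sketch below assumes an environment variable named GITHUB_TOKEN (a name chosen for this example) and builds the header in the "token <value>" form used in these notes; auth_headers is a hypothetical helper.

```python
import os


def auth_headers(token=None):
    """Build the Authorization header from a token or the environment."""
    # GITHUB_TOKEN is an assumed variable name, not something GitHub requires.
    token = token or os.environ.get("GITHUB_TOKEN")
    if not token:
        raise ValueError("no access token supplied")
    return {"Authorization": f"token {token}"}


print(auth_headers("my-example-token"))
```

The returned dictionary can be passed as headers=... to requests.get and the other request functions.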
--Paginate
Sometimes a request would return too much information and take a long time, so the backend typically uses pagination.
In that case the backend returns only one page of data at a time instead of all the data, so to get everything,
the requests need to be made inside a for loop, one page per iteration.
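The loop described above can be sketched as follows. This is an illustration under assumptions: fetch_all_pages is a hypothetical helper, it takes the request function as an argument (for example requests.get), it uses the per_page and page parameters that appear in the GitHub example in these notes, and it assumes the server returns an empty list once the pages run out.

```python
def fetch_all_pages(get, url, headers=None, per_page=50, max_pages=100):
    """Collect the items from every page of a paginated list endpoint."""
    results = []
    for page in range(1, max_pages + 1):
        params = {"per_page": per_page, "page": page}
        response = get(url, headers=headers, params=params)
        items = response.json()
        if not items:
            # An empty page means there is nothing left to fetch.
            break
        results.extend(items)
    return results
```

A call might look like fetch_all_pages(requests.get, "https://api.github.com/users/VikParuchuri/starred", headers=headers). The max_pages cap is there so a misbehaving endpoint cannot keep the loop running forever.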
--POST
Besides GET, there are other kinds of requests, such as POST, which sends data to the server.
On GitHub, a POST request can be used to create a repository. Read the API documentation to see which endpoints accept POST requests.
After a POST succeeds, the server returns a 201 status code.
--PATCH
PATCH: use it when you want to change some properties of an object without sending the entire object.
On success, the server returns a success status code.
--PUT
PUT replaces all the information about an object, so we send the entire object.
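The difference between PATCH and PUT can be modelled with plain dictionaries. The sketch below only imitates what a server does with the payload; apply_patch and apply_put are hypothetical helpers, and the repository data is invented.

```python
def apply_patch(resource, changes):
    """PATCH: update only the fields that were sent, keep the rest."""
    updated = dict(resource)
    updated.update(changes)
    return updated


def apply_put(resource, replacement):
    """PUT: the stored object becomes exactly what was sent."""
    return dict(replacement)


# Invented example object.
repo = {"name": "learning-about-apis", "description": "", "private": False}

patched = apply_patch(repo, {"description": "Learning about requests!"})
replaced = apply_put(repo, {"name": "test"})

print(patched)
print(replaced)
```

After the PATCH the name and private fields are still present, while after the PUT only the fields in the new payload remain.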
--DELETE
DELETE removes an object on the web server.
On success, it returns a 204 status code.
import requests

# 1. Add the authentication information (the access token) to the header
headers = {'Authorization': "token 1f36137fbbe1602f779300dad26e4c1b7fbab631"}
response = requests.get("https://api.github.com/users/VikParuchuri/orgs", headers=headers)
resp = response.json()
print(resp)

# 2. Use pagination to get a different page
# (the per_page value was lost in the original notes; 50 is an assumed example)
params = {'per_page': 50, 'page': 2}
response = requests.get('https://api.github.com/users/VikParuchuri/starred', headers=headers, params=params)
page2_repos = response.json()

# 3. Use POST to send data to the web server
payload = {"name": "learning-about-apis"}
response = requests.post("https://api.github.com/user/repos", json=payload, headers=headers)
status_code = response.status_code
print(status_code)

# 4. Use a PATCH request to change part of the information (a PUT would send the whole object)
payload = {"description": "Learning about Requests!", "name": "test"}
response = requests.patch("https://api.github.com/repos/VikParuchuri/learning-about-apis", json=payload, headers=headers)
status = response.status_code
print(status)

# 5. DELETE requests
response = requests.delete("https://api.github.com/repos/VikParuchuri/test", headers=headers)
print(response.status_code)
response = requests.delete("https://api.github.com/repos/VikParuchuri/learning-about-apis", headers=headers)
status = response.status_code
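Instead of printing each status code and checking it by eye, the codes noted above (201 after a successful POST, 204 after a successful DELETE) can be checked in code. This is a minimal sketch: succeeded and EXPECTED_STATUS are names invented for the example, and for any other method it simply accepts any 2xx code.

```python
# Success codes taken from the notes above.
EXPECTED_STATUS = {"POST": 201, "DELETE": 204}


def succeeded(method, status_code):
    """Return True if the status code signals success for this method."""
    expected = EXPECTED_STATUS.get(method.upper())
    if expected is None:
        # For other methods, treat any 2xx code as success.
        return 200 <= status_code < 300
    return status_code == expected


print(succeeded("POST", 201))
print(succeeded("DELETE", 404))
```

In the script above, the call would be succeeded("POST", response.status_code) right after requests.post.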