Seven main methods of the requests library
1. requests.request(method, url, **kwargs)
Constructs and sends a request; this is the base method that the six methods below are built on
method: the request method, one of the HTTP methods such as GET, PUT, or POST (seven in all);
url: the URL of the page to fetch;
**kwargs: parameters that control access, 13 in total;
method: the request method
GET: requests the resource at the URL location;
HEAD: requests the header information of the resource at the URL location;
POST: requests that new data be appended to the resource at the URL location;
PUT: requests that a resource be stored at the URL location, overwriting the resource originally there;
PATCH: requests a partial update of the resource at the URL location, changing only part of its content;
DELETE: requests deletion of the resource stored at the URL location;
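The difference between PUT and PATCH is easiest to see on a small record. The sketch below is a toy in-memory model, not requests code; the `stored` record and the `put` and `patch` helpers are hypothetical names used only for illustration.

```python
# Toy model of a resource stored at one URL (hypothetical data).
stored = {"name": "alice", "email": "alice@example.com", "age": 30}


def put(resource, new_resource):
    """PUT semantics: the submitted data replaces the whole resource."""
    return dict(new_resource)


def patch(resource, changes):
    """PATCH semantics: only the submitted fields are changed."""
    updated = dict(resource)
    updated.update(changes)
    return updated


after_put = put(stored, {"name": "bob"})
after_patch = patch(stored, {"name": "bob"})

print(after_put)    # {'name': 'bob'}
print(after_patch)  # {'name': 'bob', 'email': 'alice@example.com', 'age': 30}
```

With PUT, any field that is not resubmitted is lost; with PATCH, only the changed field has to travel over the network.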
**kwargs: parameters that control access; all are optional
params: dictionary or byte sequence, added to the URL as query parameters;
data: dictionary, byte sequence, or file object, sent as the content of the request;
json: JSON-format data, sent as the content of the request;
headers: dictionary, custom HTTP headers;
cookies: dictionary or CookieJar, the cookies to send with the request;
auth: tuple, supports HTTP authentication;
files: dictionary, for transferring files;
timeout: sets the timeout, in seconds;
proxies: dictionary, sets the proxy servers used for access; login credentials can be included;
allow_redirects: True/False, default True, redirect switch;
stream: True/False, default False; when False the response content is downloaded immediately, when True the download is deferred;
verify: True/False, default True, switch for SSL certificate verification;
cert: path to a local SSL certificate;
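A minimal sketch of how a few of these parameters shape a request. It uses `requests.Request` and its `prepare()` method, which exist in the library but are not covered in the notes above, so that the result can be inspected without touching the network. The URL and the header value are placeholders chosen for illustration.

```python
import requests

# Build the request without sending it, so no network access is needed.
req = requests.Request(
    method="GET",
    url="http://example.com/search",            # placeholder URL
    params={"q": "python", "page": 2},          # becomes the query string
    headers={"User-Agent": "demo-client/0.1"},  # hypothetical custom header
)
prepared = req.prepare()

print(prepared.method)                 # GET
print(prepared.url)                    # http://example.com/search?q=python&page=2
print(prepared.headers["User-Agent"])  # demo-client/0.1
```

To actually send it, the same arguments would go to requests.request("GET", url, params=..., headers=..., timeout=10).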
2. requests.get(url, params=None, **kwargs)
The main method for fetching an HTML page, corresponding to HTTP GET;
params: extra parameters for the URL, in dictionary or byte-stream format, optional;
**kwargs: the other 12 access-control parameters;
3. requests.head(url, **kwargs)
Gets the header information of an HTML page, corresponding to HTTP HEAD;
**kwargs: the 13 access-control parameters;
4. requests.post(url, data=None, json=None, **kwargs)
Submits a POST request to an HTML page, corresponding to HTTP POST;
data: dictionary, byte sequence, or file, sent as the content of the request;
json: JSON-format data, sent as the content of the request;
**kwargs: the other 11 access-control parameters;
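The difference between data and json can be sketched by preparing two POST requests and comparing what would be sent. As an assumption for illustration, this uses `requests.Request(...).prepare()` so that nothing leaves the machine; the URL and the payload are placeholders.

```python
import json

import requests

payload = {"key1": "value1"}
url = "http://example.com/post"  # placeholder URL, nothing is sent

as_form = requests.Request("POST", url, data=payload).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

# data=dict is form-encoded; json=dict is serialized as a JSON document.
print(as_form.headers["Content-Type"])  # application/x-www-form-urlencoded
print(as_form.body)                     # key1=value1
print(as_json.headers["Content-Type"])  # application/json
print(json.loads(as_json.body))         # {'key1': 'value1'}
```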
5. requests.put(url, data=None, **kwargs)
Submits a PUT request to an HTML page, corresponding to HTTP PUT;
data: dictionary, byte sequence, or file, sent as the content of the request;
**kwargs: the other 12 access-control parameters;
6. requests.patch(url, data=None, **kwargs)
Submits a partial-update request to an HTML page, corresponding to HTTP PATCH;
data: dictionary, byte sequence, or file, sent as the content of the request;
**kwargs: the other 12 access-control parameters;
7. requests.delete(url, **kwargs)
Submits a delete request to an HTML page, corresponding to HTTP DELETE;
**kwargs: the 13 access-control parameters;
Two important objects of the requests library
Properties of the Response object
1. r.status_code
The return status of the HTTP request; 200 indicates the connection succeeded, 404 or any other code indicates failure;
2. r.text
The string form of the HTTP response content, that is, the content of the page at the URL;
3. r.encoding
The encoding of the response content guessed from the HTTP header;
If charset is not present in the header, the encoding is assumed to be ISO-8859-1; r.text decodes the page content according to r.encoding;
4. r.apparent_encoding
The encoding of the response content inferred from the content itself (a fallback encoding);
5. r.content
The binary form of the HTTP response content;
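How r.encoding, r.text, and r.content interact can be sketched without a network connection by filling in a Response object by hand. This is an illustration under assumptions: a real Response comes back from requests.get, the page text here is made up, and `requests.utils.get_encoding_from_headers` is called directly to reproduce the header-based guess.

```python
import io

import requests
from requests.utils import get_encoding_from_headers

page = "<html><body>你好，世界</body></html>"  # made-up page content

# Hand-built Response standing in for a real download (no network involved).
r = requests.Response()
r.status_code = 200
r.headers["Content-Type"] = "text/html"   # note: no charset given
r.raw = io.BytesIO(page.encode("utf-8"))  # the bytes the server "sent"
r.encoding = get_encoding_from_headers(r.headers)

print(r.encoding)            # ISO-8859-1, the fallback when charset is absent
print(r.text == page)        # False: the bytes were decoded with the wrong encoding
print(r.apparent_encoding)   # encoding guessed from the bytes themselves

# The true encoding is known here; with a real page, r.apparent_encoding
# is the usual value to assign.
r.encoding = "utf-8"
print(r.text == page)        # True
```

r.content holds the raw bytes throughout; only r.text changes when r.encoding is reassigned.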
Exceptions of the requests library
1. requests.ConnectionError
Network connection error, such as a DNS lookup failure or a refused connection;
2. requests.HTTPError
HTTP error exception;
3. requests.URLRequired
URL missing exception;
4. requests.TooManyRedirects
Raised when the maximum number of redirects is exceeded;
5. requests.ConnectTimeout
Timeout while connecting to the remote server;
6. requests.Timeout
Timeout of the URL request as a whole;
7. r.raise_for_status()
Raises requests.HTTPError if the status code indicates an error (4xx or 5xx) rather than success;
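The exceptions above form a hierarchy, so the order of the checks matters. The sketch below uses a hypothetical helper, `describe`, together with a hand-built Response (an assumption that stands in for a real server reply) to show raise_for_status() turning a 404 into an HTTPError.

```python
import requests


def describe(exc):
    """Hypothetical helper: map a requests exception to a short label."""
    # Most specific class first: ConnectTimeout is both a ConnectionError
    # and a Timeout, so it has to be tested before either of them.
    if isinstance(exc, requests.ConnectTimeout):
        return "connect timeout"
    if isinstance(exc, requests.Timeout):
        return "timeout"
    if isinstance(exc, requests.ConnectionError):
        return "connection error"
    if isinstance(exc, requests.HTTPError):
        return "http error"
    return "other requests error"


# Hand-built Response standing in for a server reply (no network involved).
r = requests.Response()
r.status_code = 404
r.reason = "Not Found"
r.url = "http://example.com/missing"  # placeholder URL

try:
    r.raise_for_status()
    outcome = "ok"
except requests.RequestException as exc:  # base class of all of the above
    outcome = describe(exc)

print(outcome)                              # http error
print(describe(requests.ConnectTimeout()))  # connect timeout
```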
Common code framework for crawling web pages
import requests


def getHTMLText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raises HTTPError on an error status (4xx/5xx)
        r.encoding = r.apparent_encoding  # decode with the encoding inferred from the content
        return r.text
    except requests.RequestException:
        return "Generate Exception"


if __name__ == "__main__":
    url = "http://www.baidu.com"  # the scheme is required, otherwise requests rejects the URL
    print(getHTMLText(url))