This section covers exception handling.
When we send a request, the network may be unreliable and an exception may be raised. If the exception is not handled, the program terminates with an error, so we need to catch it.
The error module in urllib defines the exceptions raised by the request module. Its contents are described below:
URLError: this class inherits from OSError and is the base class of the error module. Exceptions raised by the request module can generally be caught through it. It has a reason attribute that gives the cause of the error.
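The inheritance relationship and the reason attribute can be checked without touching the network. The sketch below builds a URLError by hand purely for illustration; the reason text is made up, and in real use urlopen raises the exception for you.

```python
import urllib.error

# URLError derives from OSError, so an `except OSError` clause also catches it.
is_os_error = issubclass(urllib.error.URLError, OSError)

# Raise a URLError by hand (illustration only; the reason text is invented).
try:
    raise urllib.error.URLError("connection refused (made-up reason)")
except OSError as e:
    caught_reason = e.reason

print(is_os_error)
print(caught_reason)
```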
import urllib.error
import urllib.request

# Request a page that does not exist and print the reason for the failure
try:
    response = urllib.request.urlopen('https://cuiqingcai.com/index.htm')
except urllib.error.URLError as e:
    print(e.reason)
Normally the program would terminate with an error here, but because the exception is caught, it prints Not Found instead.
Note that the way the module is imported determines how the names are written. With import urllib.error, use the full name urllib.error.URLError. With from urllib import request, error, you can write request.urlopen, error.URLError, and so on directly.
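As a quick check of the two import styles, the following sketch confirms that both spellings refer to the same objects, so the choice only affects the prefix you type.

```python
import urllib.error
from urllib import error, request

# Both spellings name the same class and the same function.
same_exception = urllib.error.URLError is error.URLError
same_function = urllib.request.urlopen is request.urlopen

print(same_exception, same_function)
```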
HTTPError: this is a subclass of URLError dedicated to HTTP request errors, such as a failed authentication request. It has three attributes:
code: the HTTP status code, for example 404 means the page does not exist and 500 means an internal server error
reason: the cause of the error
headers: the response headers
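To see the three attributes without depending on a live server, an HTTPError can be constructed by hand. This is only a sketch: the URL and the Server header value are hypothetical, and the empty BytesIO stands in for a response body.

```python
import io
from email.message import Message
from http import HTTPStatus
from urllib import error

# Hand-built response headers (hypothetical value, for illustration only).
hdrs = Message()
hdrs["Server"] = "example-server"

# HTTPError(url, code, msg, hdrs, fp), built by hand instead of coming from urlopen.
err = error.HTTPError(
    "https://example.invalid/missing",  # hypothetical URL
    404,
    HTTPStatus(404).phrase,             # standard reason phrase for the status code
    hdrs,
    io.BytesIO(b""),                    # empty stand-in for the response body
)

print(err.code)
print(err.reason)
print(err.headers["Server"])
```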
For example:
from urllib import request, error

try:
    response = request.urlopen('https://cuiqingcai.com/index.htm')
except error.HTTPError as e:
    # Print the reason phrase, the status code and the response headers
    print(e.reason, e.code, e.headers)
===================== RESTART: F:\python\exercise\ok.py ===================
Not Found 404 Server: nginx/1.10.3 (Ubuntu)
Date: Fri, 2018 03:14:05 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: close
Vary: Cookie
Expires: Wed, Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Link:
Because HTTPError is a subclass of URLError, you can catch the subclass first and then the parent class:
from urllib import request, error

try:
    response = request.urlopen('https://cuiqingcai.com/index.htm')
except error.HTTPError as e:
    # HTTP-level failure: a status code and headers are available
    print(e.reason, e.code, e.headers)
except error.URLError as e:
    # Any other request failure
    print(e.reason)
else:
    print('Request successfully')
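The effect of the clause order can be tried offline. In the sketch below, classify, raise_http and raise_url are hypothetical helpers introduced only for this illustration; the exceptions are raised by hand rather than by urlopen.

```python
import io
from urllib import error

def classify(action):
    """Run `action` and report which branch handled it (hypothetical helper)."""
    try:
        action()
    except error.HTTPError as e:   # subclass first
        return f"HTTPError {e.code}"
    except error.URLError as e:    # then the parent class
        return f"URLError {e.reason}"
    else:
        return "success"

def raise_http():
    # Hand-built HTTP error with a hypothetical URL and an empty body.
    raise error.HTTPError("https://example.invalid/", 404, "Not Found", None, io.BytesIO(b""))

def raise_url():
    # Hand-built non-HTTP failure with a made-up reason.
    raise error.URLError("name resolution failed")

print(classify(raise_http))
print(classify(raise_url))
print(classify(lambda: None))
```

If the two except clauses were swapped, the URLError clause would also match every HTTPError, and the HTTPError branch would never run.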
The reason attribute is not always a string; sometimes it is an object. In that case the socket library can be used to check its type.
A socket is an endpoint of a network connection. For example, when your web browser requests the home page of www.jb51.net, it creates a socket and tells it to connect to the www.jb51.net web server host. The web server likewise listens for incoming requests on a socket. Both ends use their own sockets to send and receive information.
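The idea of two endpoints exchanging data can be sketched locally with socket.socketpair(), which returns two sockets that are already connected to each other. The names client and server and the byte strings are illustrative assumptions; no real web server is involved.

```python
import socket

# socketpair() returns two connected sockets, standing in for the
# browser end and the server end of a connection.
client, server = socket.socketpair()
try:
    client.sendall(b"GET /")           # one end sends...
    request_bytes = server.recv(1024)  # ...and the other end receives
    server.sendall(b"OK")
    reply_bytes = client.recv(1024)
finally:
    client.close()
    server.close()

print(request_bytes, reply_bytes)
```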
import socket
import urllib.error
import urllib.request

try:
    # A very short timeout, so the request is expected to time out
    response = urllib.request.urlopen('https://www.baidu.com', timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    if isinstance(e.reason, socket.timeout):
        print('Time Out')
===================== RESTART: F:\python\exercise\ok.py ===================
<class 'socket.timeout'>
Time Out
As shown, setting a very short timeout forces the program to raise a timeout exception. (The type function does not treat an instance of a subclass as an instance of the parent class, whereas isinstance does.)
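The difference between type and isinstance can be shown offline. This sketch wraps a socket.timeout object in a URLError by hand, which mimics the situation in the timeout example; the message text is made up.

```python
import socket
from urllib import error

# Wrap a timeout object in a URLError by hand (illustration only).
e = error.URLError(socket.timeout("timed out"))

# socket.timeout is a subclass of OSError.
exact_match = type(e.reason) is OSError            # False: type() compares the exact class
instance_match = isinstance(e.reason, OSError)     # True: isinstance() follows inheritance
is_timeout = isinstance(e.reason, socket.timeout)  # True: the reason is a timeout object

print(exact_match, instance_match, is_timeout)
```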
Python 3 web crawler learning: basic library usage (3)