Python3 Web Crawler Learning: Basic Library Usage (3)

Source: Internet
Author: User

This section covers exception handling.

When we send a request, the network may be unreliable and an exception may be raised. If the exception is not handled, the program terminates with an error, so we need to deal with it.

The error module in urllib defines the exceptions raised by the request module. Its classes are described below:

    • URLError

This class inherits from OSError and is the base class of the error module. Any exception raised by the request module can be caught through it. It has a reason attribute that gives the cause of the error.

    import urllib.error
    import urllib.request

    try:
        response = urllib.request.urlopen('https://cuiqingcai.com/index.htm')
    except urllib.error.URLError as e:
        # reason explains why the request failed
        print(e.reason)

Without the try/except block the program would terminate with a traceback. With the handler in place, it prints Not Found instead.
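The inheritance from OSError and the reason attribute can also be checked without touching the network. The following is a minimal sketch that builds the exception by hand; the reason text "connection refused" is an illustrative assumption, not output from a real request.

```python
import urllib.error

# Build the exception by hand so that no network access is needed.
# "connection refused" is an illustrative reason, not real output.
err = urllib.error.URLError("connection refused")

# URLError derives from OSError, so a handler for OSError also catches it.
is_os_error = issubclass(urllib.error.URLError, OSError)

try:
    raise err
except OSError as e:
    caught = e

print(is_os_error)     # True
print(caught.reason)   # connection refused
```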

Note that the names you use depend on how you import. If you write import urllib.error, you must use the fully qualified name urllib.error.URLError. If you write from urllib import request, error, you can use request.urlopen, error.URLError and so on directly.
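A short sketch showing that both import styles reach the same module and the same class, so the choice only affects how the name is spelled:

```python
import urllib.error          # style 1: fully qualified access
from urllib import error     # style 2: short module name

# Both names are bound to the same module object...
same_module = urllib.error is error
# ...and therefore to the same exception class.
same_class = urllib.error.URLError is error.URLError

print(same_module, same_class)   # True True
```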

    • HTTPError

This is a subclass of URLError dedicated to HTTP request errors, such as a failed authentication request. It has three attributes:

code: the HTTP status code, for example 404 when the page does not exist and 500 for an internal server error

reason: the cause of the error

headers: the response headers

For example:

    from urllib import request, error

    try:
        response = request.urlopen('https://cuiqingcai.com/index.htm')
    except error.HTTPError as e:
        # reason, status code and response headers of the failed request
        print(e.reason, e.code, e.headers)
    ===================== RESTART: F:\python\exercise\ok.py ===================
    Not Found 404 Server: nginx/1.10.3 (Ubuntu)
    Date: Fri, 2018 03:14:05 GMT
    Content-Type: text/html; charset=utf-8
    Transfer-Encoding: chunked
    Connection: close
    Vary: Cookie
    Expires: Wed, Jan 1984 05:00:00 GMT
    Cache-Control: no-cache, must-revalidate, max-age=0
    Link:
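The three attributes can also be inspected offline by constructing an HTTPError by hand. This is only a sketch: the URL and the header value are hypothetical, chosen for illustration, and the empty io.BytesIO stands in for the response body.

```python
import io
from email.message import Message
from urllib import error

# Hypothetical response headers for the illustration.
headers = Message()
headers["Content-Type"] = "text/html; charset=utf-8"

# HTTPError(url, code, msg, hdrs, fp); the URL below is made up.
err = error.HTTPError(
    "http://example.invalid/missing",
    404,
    "Not Found",
    headers,
    io.BytesIO(b""),
)

print(err.code)                      # 404
print(err.reason)                    # Not Found
print(err.headers["Content-Type"])   # text/html; charset=utf-8
print(isinstance(err, error.URLError))  # True: HTTPError is a URLError
```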

Because HTTPError is a subclass of URLError, you can catch the subclass first and the parent class afterwards. HTTP errors then get the more specific handling, and every other request error falls through to the URLError handler:

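    from urllib import request, error

    try:
        response = request.urlopen('https://cuiqingcai.com/index.htm')
    except error.HTTPError as e:
        # HTTP-level failure: status code and response headers are available
        print(e.reason, e.code, e.headers)
    except error.URLError as e:
        # Any other request failure, for example no network connection
        print(e.reason)
    else:
        # Runs only when no exception was raised
        print('Request successfully')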
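The effect of the handler order can be seen offline with a small sketch. The helper describe() and the hand-built exceptions are hypothetical, introduced only for this illustration; they raise the errors directly instead of sending a request.

```python
import io
from urllib import error


def describe(exc):
    """Raise exc and report which except clause handled it (hypothetical helper)."""
    try:
        raise exc
    except error.HTTPError as e:
        # Subclass first: reached only for HTTP errors.
        return f"HTTPError {e.code}: {e.reason}"
    except error.URLError as e:
        # Parent class second: reached for every other request error.
        return f"URLError: {e.reason}"


# Hand-built exceptions; the URL and messages are made up for the example.
http_err = error.HTTPError("http://example.invalid/", 404, "Not Found", {}, io.BytesIO(b""))
url_err = error.URLError("no route to host")

http_result = describe(http_err)
url_result = describe(url_err)
print(http_result)   # HTTPError 404: Not Found
print(url_result)    # URLError: no route to host
```

If the two except clauses were swapped, the URLError handler would also catch HTTPError and the more specific branch would never run.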

The reason attribute is not always a string. Sometimes it is an object, such as socket.timeout, and in that case the socket library can be used to check its type.

A socket is the endpoint of a network connection. For example, when your web browser requests the home page of www.jb51.net, the browser creates a socket and tells it to connect to the www.jb51.net web server host. The web server likewise listens on a socket for incoming requests. Both ends use their own sockets to send and receive information.
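To make the idea of two endpoints concrete, here is a minimal local sketch using socket.socketpair(), which returns two already-connected sockets in the same process. The names client and server and the message contents are assumptions for the illustration; no real web server is involved. The last part shows where socket.timeout comes from: a receive that finds no data before the timeout expires.

```python
import socket

# Two connected endpoints inside one process; no network access needed.
client, server = socket.socketpair()

client.sendall(b"GET /")        # one end sends...
request = server.recv(1024)     # ...and the other end receives
server.sendall(b"200 OK")
reply = client.recv(1024)

# Nothing is left to read, so a short timeout makes recv() raise socket.timeout.
client.settimeout(0.01)
try:
    client.recv(1024)
    timed_out = False
except socket.timeout:
    timed_out = True

client.close()
server.close()

print(request, reply, timed_out)
```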

    import socket
    import urllib.request
    import urllib.error

    try:
        # A deliberately tiny timeout so that the request cannot finish in time
        response = urllib.request.urlopen('https://www.baidu.com', timeout=0.01)
    except urllib.error.URLError as e:
        print(type(e.reason))
        if isinstance(e.reason, socket.timeout):
            print('Time Out')
    ===================== RESTART: F:\python\exercise\ok.py ===================
    <class 'socket.timeout'>
    Time Out

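As the output shows, setting a very short timeout forces a timeout exception, and e.reason is then a socket.timeout object rather than a string. Note that type() reports only the exact class of an object and does not treat a subclass instance as an instance of the parent class, whereas isinstance() does.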
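The difference between type() and isinstance() can be checked offline with a short sketch. The URLError here is built by hand around a socket.timeout object, which is an assumption made to avoid a real request.

```python
import socket
import urllib.error

# Wrap a timeout object in a URLError by hand; no request is sent.
err = urllib.error.URLError(socket.timeout("timed out"))

# type() compares the exact class only, so the parent class does not match.
exact_match_parent = type(err.reason) is OSError
# isinstance() also accepts subclasses of the given class.
instance_of_parent = isinstance(err.reason, OSError)
instance_of_timeout = isinstance(err.reason, socket.timeout)

print(exact_match_parent)    # False
print(instance_of_parent)    # True
print(instance_of_timeout)   # True
```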
