Python socket error: [Errno 10054] An existing connection was forcibly closed by the remote host.



A few days ago I was using Python to read web pages. Because my script hit one site with a large number of urlopen calls, the site identified them as an attack and sometimes refused further downloads. As a result, request.read() would hang after urlopen and eventually throw Errno 10054.

This error means the connection was reset by the peer, i.e. the remote host reset the connection. The cause may be that the socket timeout is too long; that after request = urllib.request.urlopen(url) there is no matching request.close(); or simply that the site takes a few seconds to decide the behavior is an attack.
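As an aside, in Python 3 this reset may surface as ConnectionResetError (an OSError subclass, errno 10054 on Windows), and a with statement guarantees the close() mentioned above even when read() raises. A minimal sketch, with a placeholder URL:

import urllib.request

url = 'http://example.com/'  # placeholder URL

try:
    # the with-block closes the connection even if read() raises
    with urllib.request.urlopen(url) as request:
        content = request.read()
except ConnectionResetError as e:  # Errno 10054 on Windows
    print('connection reset by peer:', e)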

The specific solution is as follows:

import socket
import time
import urllib.request

timeout = 20
socket.setdefaulttimeout(timeout)  # set the timeout for the entire socket layer; later socket use needs no separate setting
sleep_download_time = 10
time.sleep(sleep_download_time)    # set the pause time here
request = urllib.request.urlopen(url)  # url is the address of the content to read
content = request.read()               # read; the exception is usually thrown here
request.close()                        # remember to close

Because the read() that follows urlopen ultimately calls functions at the socket layer, setting the default socket timeout means a dead connection is given up automatically and you no longer wait forever inside read().
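If you prefer not to change the process-wide default, urlopen also accepts a per-call timeout argument; when the limit is exceeded, socket.timeout is raised. A minimal sketch, again with a placeholder URL:

import socket
import urllib.request

url = 'http://example.com/'  # placeholder URL

try:
    # this timeout applies to this request only; the global default is untouched
    request = urllib.request.urlopen(url, timeout=20)
    content = request.read()
    request.close()
except socket.timeout:
    print('-----socket timeout:', url)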

Of course, you can also wrap a few try/except clauses around the outside, for example:

try:
    time.sleep(self.sleep_download_time)
    request = urllib.request.urlopen(url)
    content = request.read()
    request.close()
except UnicodeDecodeError as e:
    print('-----UnicodeDecodeError url:', url)
except urllib.error.URLError as e:
    print('-----urlError url:', url)
except socket.timeout as e:
    print('-----socket timeout:', url)

Generally this causes no problems. I tested downloads of thousands of web pages before saying so. However, when downloading tens of thousands of files, my tests show that the exception still slips through occasionally. It may be that time.sleep() is too short, or that the network is briefly interrupted. I also tested urllib.request.urlretrieve() and found that continuous data downloads always fail sooner or later.

A simple solution is as follows. First, refer to my article "Python checkpoint simple implementation" and make a checkpoint. Then wrap the exception-prone code in a while True loop. See the following pseudocode:

def download_auto(downloadlist, fun, sleep_time=15):
    while True:  # wrap an outer retry loop
        try:
            value = fun(downloadlist, sleep_time)  # fun is your download function, passed in as a function object
            # exit only after a normal run
            if value == util.SUCCESS:
                break
        except:  # if 10054 or IOError or some other error occurs
            sleep_time += 5  # sleep 5 seconds longer and rerun the download; because of the checkpoint, the run resumes from where the exception was thrown, so an unstable network no longer interrupts the program
            print('enlarge sleep time:', sleep_time)
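The checkpoint itself is in the referenced article and is not reproduced here. For readers without it, here is a minimal sketch of the idea under my own assumptions (the file name and constant are hypothetical, not the article's): persist the index of the last finished item, and resume from it on restart.

import os
import time
import urllib.request

CHECKPOINT_FILE = 'checkpoint.txt'  # hypothetical file name
SUCCESS = 0                         # stands in for util.SUCCESS above

def load_checkpoint():
    # index of the next item to download; 0 when starting fresh
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return int(f.read().strip() or '0')
    return 0

def save_checkpoint(index):
    # record progress after each successful download
    with open(CHECKPOINT_FILE, 'w') as f:
        f.write(str(index))

def download_list(downloadlist, sleep_time):
    # matches the fun(downloadlist, sleep_time) signature expected above
    for i in range(load_checkpoint(), len(downloadlist)):
        time.sleep(sleep_time)
        with urllib.request.urlopen(downloadlist[i]) as request:
            content = request.read()
        save_checkpoint(i + 1)  # an exception leaves the checkpoint pointing at the failed item
    return SUCCESS

Calling download_auto(url_list, download_list) then retries until download_list returns SUCCESS, re-entering at the saved index on each attempt.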

 

However, if the requested web page simply cannot be found, you need to handle one more case:

import urllib.request

# print download progress
def reporthook(blocks_read, block_size, total_size):
    if not blocks_read:
        print('Connection opened')
    if total_size < 0:
        # if the page is not found, total_size may be unknown and no percentage can be calculated
        print('Read %d blocks' % blocks_read)
    else:
        print('Downloading: %d MB, total size: %d MB' % (blocks_read * block_size / 1048576.0, total_size / 1048576.0))

def download(path, url):
    # url = 'http://downloads.sourceforge.net/sourceforge/alliancep2p/alliance-v1.0.6.jar'
    # filename = url.rsplit("/")[-1]
    try:
        # download function provided by Python
        urllib.request.urlretrieve(url, path, reporthook)
    except IOError as e:  # a page that cannot be found may raise IOError
        print("Download", url, "\nerror:", e)
    print("Done: %s\nCopy to: %s" % (url, path))
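A hypothetical call, reusing the SourceForge URL from the comment above:

url = 'http://downloads.sourceforge.net/sourceforge/alliancep2p/alliance-v1.0.6.jar'
download('alliance-v1.0.6.jar', url)  # reporthook prints progress as blocks arrive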

If you still have problems... please share other solutions in the comments.
