Solving Python socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host
A few days ago I was reading web pages with Python. Because the script issued a large number of urlopen() calls against one site, the site identified the traffic as an attack and sometimes stopped allowing downloads, so urlopen() and request.read() would hang and eventually raise Errno 10054.
This error means "connection reset by peer": the remote host has reset the connection. The cause may be that the socket timeout is too long, that request.close() is never called after request = urllib.request.urlopen(url), or that there is no sleep between requests, which leads the site to treat the behavior as an attack.
A concrete fix looks like the following code:
import socket
import time
import urllib.request

timeout = 20  # seconds; example value, tune it to your situation
socket.setdefaulttimeout(timeout)  # sets the timeout for the whole socket layer;
# any later socket use in this file inherits it, no per-call setting needed
sleep_download_time = 10  # seconds; example value, pick a delay the site tolerates
time.sleep(sleep_download_time)  # pause between requests
request = urllib.request.urlopen(url)  # url is the page you want to read
content = request.read()  # this read() is usually where the exception is raised
request.close()  # remember to close the connection
Because read() after urlopen() ultimately calls into the socket layer, setting the socket default timeout lets the connection give up by itself instead of blocking forever inside read().
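A quick way to verify the default-timeout mechanism (a minimal sketch; the 20-second value is only an example):

```python
import socket

# socket.setdefaulttimeout() applies to every socket created after the
# call, including the ones urllib opens internally
socket.setdefaulttimeout(20)       # 20 s is an example value
t = socket.getdefaulttimeout()     # read the value back
print('default timeout:', t)

# urllib.request.urlopen() also accepts a per-call timeout argument that
# overrides the global default for that single request:
#   response = urllib.request.urlopen(url, timeout=10)

socket.setdefaulttimeout(None)     # restore the library default
```

The per-call `timeout=` argument is often preferable because it does not change global state for unrelated code.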
Of course, you can also wrap the calls in a few try/except clauses, for example:
try:
    time.sleep(self.sleep_download_time)
    request = urllib.request.urlopen(url)
    content = request.read()
    request.close()
except UnicodeDecodeError as e:
    print('-----UnicodeDecodeError url:', url)
except urllib.error.URLError as e:
    print('-----URLError url:', url)
except socket.timeout as e:
    print('-----socket timeout:', url)
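As an aside, on Python 3 the object returned by urlopen() is a context manager, so the close() step can be made automatic even when read() raises. A minimal sketch, using a local file:// URL (an illustrative stand-in for a web page, so it runs without network access):

```python
import os
import tempfile
import urllib.request

# write a small local file to stand in for a web page
path = os.path.join(tempfile.mkdtemp(), 'page.html')
with open(path, 'wb') as f:
    f.write(b'<html>hello</html>')

# the with-statement guarantees close() even if read() raises,
# which covers the "missing request.close()" cause discussed above
with urllib.request.urlopen('file://' + path) as request:
    content = request.read()

print(content)
```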
In general this is enough. I tested downloads of thousands of pages before saying so. But when downloading tens of thousands, my next round of testing showed that the exception can still appear. Perhaps the time.sleep() interval is too short, or perhaps the network drops for a moment. I also tested with urllib.request.urlretrieve() and found that over long continuous downloads there are always occasional failures.
The simple way to handle this is to first see my earlier article on a simple Python checkpoint implementation: create a checkpoint, then retry the code that throws the exception. See the following pseudocode:
def download_auto(downloadlist, fun, sleep_time=15):
    while True:
        try:  # wrap everything in one more layer of try
            # fun is your download function, passed in as a function object
            value = fun(downloadlist, sleep_time)
            # only a normal run is allowed to exit the loop
            if value == util.SUCCESS:
                break
        except:  # if Errno 10054, IOError, or some other error occurs
            sleep_time += 5  # sleep 5 seconds longer, then redo the download;
            # because of the checkpoint, the program resumes from where the
            # exception was thrown, so an unstable network cannot abort it
            print('Enlarge sleep time:', sleep_time)
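The retry-with-backoff idea above can be made runnable. This is a sketch, not the original code: `SUCCESS` and the flaky download function are stand-ins (the original uses `util.SUCCESS` and a real downloader), and Errno 10054 is caught as OSError, which is how it surfaces in Python 3:

```python
SUCCESS = 0  # stand-in for util.SUCCESS in the pseudocode above

def download_auto(downloadlist, fun, sleep_time=15):
    """Keep calling fun until it reports success; back off 5 s per failure."""
    while True:
        try:
            value = fun(downloadlist, sleep_time)
            if value == SUCCESS:
                return sleep_time  # only a normal run exits the loop
        except OSError:  # Errno 10054 surfaces as OSError in Python 3
            sleep_time += 5
            print('Enlarge sleep time:', sleep_time)

# a fake download function that fails twice, then succeeds
attempts = []
def flaky_download(downloadlist, sleep_time):
    attempts.append(sleep_time)
    if len(attempts) < 3:
        raise OSError(10054, 'Connection reset by peer')
    return SUCCESS

result = download_auto(['page1'], flaky_download)
print(result)  # 25: started at 15, backed off twice by 5
```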
However, to handle pages that cannot be found, one more piece is needed:
# print download progress
def reporthook(blocks_read, block_size, total_size):
    if not blocks_read:
        print('Connection opened')
    if total_size < 0:
        # total size unknown
        print('Read %d blocks' % blocks_read)
    else:
        # if the page is not found, total_size may be 0 and the
        # percentage cannot be computed
        print('downloading: %d MB, totalsize: %d MB' %
              (blocks_read * block_size / 1048576.0, total_size / 1048576.0))
def download(path, url):
    #url = 'http://downloads.sourceforge.net/sourceforge/alliancep2p/alliance-v1.0.6.jar'
    #filename = url.rsplit("/")[-1]
    try:
        # Python's built-in download function
        urllib.request.urlretrieve(url, path, reporthook)
    except IOError as e:  # a page that is not found seems to raise IOError
        print("download", url, "\nerror:", e)
    print("Done: %s\nCopy to: %s" % (url, path))
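To see how urlretrieve() drives a reporthook without touching the network, here is a small self-check (the file:// URL, file name, and 1024-byte size are illustrative, not from the original):

```python
import os
import tempfile
import urllib.request

progress = []
def reporthook(blocks_read, block_size, total_size):
    # urlretrieve calls this once up front and again after each block
    progress.append((blocks_read, block_size, total_size))

# fetch a local file through urlretrieve so the demo runs offline
src = os.path.join(tempfile.mkdtemp(), 'src.bin')
with open(src, 'wb') as f:
    f.write(b'x' * 1024)
dst = src + '.copy'

urllib.request.urlretrieve('file://' + src, dst, reporthook)
print('copied bytes:', os.path.getsize(dst))
print('hook calls  :', len(progress))
```

For local files the handler supplies a Content-Length header, so total_size is the real size; for a missing page it can be 0 or -1, which is exactly the case the reporthook above guards against.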
If you still run into problems, please share other solutions in the comments.