Python socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host. How to solve it:
A few days ago I was reading web pages with Python. Because I ran a large number of urlopen() calls against one site, the site identified the traffic as an attack and sometimes stopped allowing downloads, so urlopen() / request.read() would hang there forever and eventually throw errno 10054.
This error means the connection was reset by the peer, i.e. the remote host reset the connection. The cause may be that the socket timeout is too long, that there is no request.close() after request = urllib.request.urlopen(url), or that there is no sleep of a few seconds between requests, so the site decides the behavior is an attack.
A concrete fix looks like the following code:
import socket
import time
import urllib.request

timeout = 20                        # set a timeout for the whole socket layer;
socket.setdefaulttimeout(timeout)   # sockets used later in the file inherit it,
                                    # so it does not need to be set again

sleep_download_time = 10            # pick the sleep time yourself
time.sleep(sleep_download_time)

request = urllib.request.urlopen(url)   # url is the page whose content you want to read
content = request.read()                # the read; the exception is usually raised here
request.close()                         # remember to close the connection
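A small note of my own, not part of the original recipe: in Python 3 the object returned by urlopen() can also be used as a context manager, which guarantees that the connection is closed even when read() raises:

import urllib.request

with urllib.request.urlopen(url) as request:   # url as above
    content = request.read()                   # close() runs even if this raises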
Because the read() after urlopen() ultimately calls down into the socket layer, setting the default socket timeout lets the connection give up on its own, so you no longer sit blocked inside read().
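If you would rather not change the process-wide default, urlopen() also accepts a timeout argument, so the timeout can be set per request. A minimal sketch (the 20-second value is only an example):

import socket
import urllib.request

try:
    request = urllib.request.urlopen(url, timeout=20)  # timeout for this request only
    content = request.read()
    request.close()
except socket.timeout:
    print('-----socket timeout:', url)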
Of course, you can also add a few try/except clauses, for example:
try:
    time.sleep(self.sleep_download_time)
    request = urllib.request.urlopen(url)
    content = request.read()
    request.close()
except UnicodeDecodeError as e:
    print('-----UnicodeDecodeError url:', url)
except urllib.error.URLError as e:
    print('-----URLError url:', url)
except socket.timeout as e:
    print('-----socket timeout:', url)
In general that is enough; I can say so after testing downloads of thousands of pages. But when I went on to download tens of thousands, the same exception would still pop up now and then. Maybe time.sleep() is too short, or maybe the network drops out suddenly. I also tested urllib.request.urlretrieve() and found that with long continuous downloads there are always some failures.
The simple way to deal with this: first read my earlier article on a simple Python checkpoint implementation and set up a checkpoint, then wrap the code that may throw the exception in a retry loop. See the following pseudocode:
def download_auto(downloadlist, fun, sleep_time=15):
    while True:  # wrap everything in one more layer of try
        try:
            # fun is your download function, passed in like a function pointer
            value = fun(downloadlist, sleep_time)
            # only a normal run is allowed to exit the loop
            if value == Util.SUCCESS:
                break
        except:
            # if 10054, IOError or some other error occurs, sleep 5 more
            # seconds and redo the download above; thanks to the checkpoint,
            # execution continues from where the exception was thrown, so an
            # unstable network connection no longer interrupts the program
            sleep_time += 5
            print('enlarge sleep time:', sleep_time)
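The checkpoint itself is only referenced above, not shown. As one possible implementation, here is a minimal sketch of my own: the file name progress.txt and the helpers load_checkpoint()/save_checkpoint()/download_all() are made up for illustration, and Util.SUCCESS stands for whatever success constant your own code defines:

import os
import time
import urllib.request

CHECKPOINT_FILE = 'progress.txt'   # hypothetical file name

def load_checkpoint():
    # index of the next URL to download; 0 if no checkpoint exists yet
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return int(f.read().strip() or 0)
    return 0

def save_checkpoint(index):
    # remember that every URL before `index` has already been downloaded
    with open(CHECKPOINT_FILE, 'w') as f:
        f.write(str(index))

def download_all(downloadlist, sleep_time):
    # a possible `fun` for download_auto(): it resumes from the checkpoint,
    # saves progress after every page, and lets network errors propagate
    # so the outer while-loop can retry
    for i in range(load_checkpoint(), len(downloadlist)):
        time.sleep(sleep_time)
        request = urllib.request.urlopen(downloadlist[i])
        content = request.read()
        request.close()
        save_checkpoint(i + 1)
    return Util.SUCCESS   # whatever success constant your code defines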
However, the case where a page cannot be found at all needs separate handling:
# print download progress information
def reporthook(blocks_read, block_size, total_size):
    if not blocks_read:
        print('Connection opened')
    if total_size < 0:
        # if the page is not found it does not exist, total_size may be 0,
        # so the percentage cannot be computed
        print('Read %d blocks' % blocks_read)
    else:
        print('downloading: %d MB, totalsize: %d MB'
              % (blocks_read * block_size / 1048576.0, total_size / 1048576.0))

def Download(path, url):
    # url = 'http://downloads.sourceforge.net/sourceforge/alliancep2p/alliance-v1.0.6.jar'
    # filename = url.rsplit("/")[-1]
    try:
        # Python's built-in download function
        urllib.request.urlretrieve(url, path, reporthook)
    except IOError as e:
        # a page that cannot be found seems to raise IOError
        print("download", url, "\nerror:", e)
    print("Done: %s\nCopy to: %s" % (url, path))
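For example, reusing the SourceForge URL from the commented-out line above, a direct call would look like this (the output file name is simply derived from the URL):

url = 'http://downloads.sourceforge.net/sourceforge/alliancep2p/alliance-v1.0.6.jar'
Download(url.rsplit('/', 1)[-1], url)   # saves alliance-v1.0.6.jar next to the script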
If you still run into problems after this... please share other solutions in the comments.