A checkpoint is really a record of past history; you can think of it as a log, although here it is simplified. For example, suppose I have a text file containing a pile of links, and my task is to download the content at each address. Because of network problems or problems with the website, a download may fail: the connection may break, or a socket exception may be raised. Whatever the error, I want the program to keep going, or at least to be able to stop and then resume from the link where it left off rather than from the beginning. The problem is simple because the links are independent of one another (scenarios where each item depends on its context need to be analyzed separately). So I only need to record the last line processed before the program stopped, and the next run can continue from there. The implementation here records the original link itself; you could also record a counter instead. The code is as follows:
import os

# Raised when the content recorded in the checkpoint does not appear
# anywhere in the original text file.
class CheckPointMissContentError(Exception):
    pass

# Move the file read pointer fd to the position that corresponds to the checkpoint.
# The checkpoint rule: read one or more lines from the file, process them, then
# write those lines to the checkpoint file check_point. When the program is run
# again later, it continues from that checkpoint.
def GoCheckPoint(fd, check_point):
    # Create an empty checkpoint file on the first run.
    if not os.path.isfile(check_point):
        f_check = open(check_point, 'w')
        f_check.close()
    f_check = open(check_point, 'r')
    lines = f_check.readlines()
    f_check.close()
    if len(lines) > 0:
        check_content = lines[-1]  # the last line of the checkpoint file
        check_content = check_content.strip('\n\r ')
        # go to check point
        while True:
            content = fd.readline()
            if content == '':  # EOF
                raise CheckPointMissContentError
            if content.strip('\n\r ') == check_content:
                break
The code above is not enough on its own. You also need something like the following:
# pseudo code
import os
import time
import Util  # my own module: holds GoCheckPoint and the success/failure constants

def Download(downloadlist, sleep_time):
    if os.path.isfile(downloadlist):
        f = open(downloadlist)
        # Checkpoint file name, generated automatically from the list file name
        check_point = downloadlist[0:downloadlist.rfind('.')] + '_check.txt'
        Util.GoCheckPoint(f, check_point)  # the GoCheckPoint function from the code above
        f_check = open(check_point, 'a')  # open in append mode
        try:
            while True:
                content = f.readline()
                if content == '':  # EOF
                    break
                content = content.strip('\n\r ')
                if content != '':  # there is a URL to download
                    time.sleep(sleep_time)
                    # Pseudocode: path is where the file is saved. Think of this call as
                    # urllib.request.urlretrieve(), or urllib.request.urlopen() plus
                    # handling of the response.
                    DownloadOper(path, content)
                    # Only then write the line to the checkpoint file
                    f_check.write(content + '\n')
                    f_check.flush()  # required, otherwise the line stays buffered and is not written to disk
        except Exception:
            # An exception is nothing to be afraid of: just press F5 and run again
            return Util.failure  # a constant I defined; think of it as 0 or 1
        finally:
            f.close()
            f_check.close()  # close the files
        print('Downloading is done...')
        return Util.success
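The `DownloadOper` call in the pseudocode is left abstract. One way it might be filled in with `urllib.request.urlretrieve` is sketched below. The name `download_oper`, the `dest_dir` parameter, and the rule for choosing the local file name are all assumptions made for illustration, not the original implementation.

```python
import os
import urllib.parse
import urllib.request


def download_oper(dest_dir, url):
    """Download url into dest_dir and return the local file path.

    Hypothetical helper: the file name is taken from the last path
    segment of the URL, falling back to 'index.html' when it is empty.
    """
    os.makedirs(dest_dir, exist_ok=True)
    name = os.path.basename(urllib.parse.urlparse(url).path) or "index.html"
    target = os.path.join(dest_dir, name)
    # urlretrieve raises an exception (for example urllib.error.URLError)
    # when the download fails, so the caller reaches its checkpoint write
    # only if this call returns normally.
    urllib.request.urlretrieve(url, target)
    return target
```

Because a failure surfaces as an exception, the link is never recorded in the checkpoint file unless its download finished.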
Each link is written to the checkpoint file only after its download has completed. If the program dies, you can pick up the previous work as long as the checkpoint file is still there. That said, this checkpoint is very simple compared with the transaction checkpoints used in databases.
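The counter method mentioned earlier can be sketched in the same spirit. Instead of storing the link text, the checkpoint file stores a single integer: the number of lines already handled. This is only a minimal sketch; the names `read_counter`, `process_with_counter`, and `handle` are hypothetical, and `handle` stands in for whatever download routine you use.

```python
import os


def read_counter(check_point):
    """Return how many lines were already processed (0 if no checkpoint)."""
    if not os.path.isfile(check_point):
        return 0
    with open(check_point) as f:
        text = f.read().strip()
    return int(text) if text else 0


def process_with_counter(list_file, check_point, handle):
    """Call handle(item) for every line of list_file not yet processed.

    The checkpoint file holds one integer: the count of finished lines.
    """
    done = read_counter(check_point)
    with open(list_file) as f:
        for index, line in enumerate(f):
            if index < done:
                continue  # handled in an earlier run
            item = line.strip()
            if item:
                handle(item)  # may raise; the counter then stays unchanged
            # Record progress only after the line was handled. Writing to a
            # temporary file and renaming it avoids a half-written counter.
            tmp = check_point + ".tmp"
            with open(tmp, "w") as c:
                c.write(str(index + 1))
            os.replace(tmp, check_point)
```

One difference from recording the link itself: the counter still works when the same link appears more than once in the list, whereas matching by content resumes after the first matching line.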