Differential backups can be implemented by comparing MD5 checksums, but the MD5 approach has the following problems:
• md5sum has trouble producing an MD5 value for some symbolic links
• Empty directories cannot be backed up, because md5sum cannot compute an MD5 value for an empty directory
• Permission changes cannot be detected by md5sum
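The last point is easy to confirm. The following is a minimal Python 3 sketch (the helper names `md5_of` and `digest_before_and_after_chmod` and the temporary file are illustrative, not part of the original script) showing that a content checksum stays the same after a permission change, so a checksum-based comparison cannot see it:

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the hex MD5 digest of a file's contents (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_before_and_after_chmod():
    """Create a temp file, chmod it, and return the digests before and after."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"some data\n")
        os.close(fd)
        before = md5_of(path)
        os.chmod(path, 0o644)  # metadata-only change; contents untouched
        after = md5_of(path)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after = digest_before_and_after_chmod()
    print(before == after)  # the digest depends only on the file contents
```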
Solution:
Use the file's mtime and ctime instead.
mtime (modification time) changes when the contents of the file are written.
ctime (change time, not creation time) changes whenever the inode changes: when the file is written, or when its owner, permissions, or link settings are changed.
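To see the two timestamps behave differently, here is a small Python 3 sketch for a Unix-like system (the function name `times_around_chmod` is made up for illustration). It changes only the permissions of a temporary file and reads `st_mtime_ns` and `st_ctime_ns` before and after. The mtime should stay the same, while the ctime is expected to move forward. On Windows, `st_ctime` has historically meant creation time, so this check would not carry over.

```python
import os
import tempfile
import time


def times_around_chmod():
    """Return ((mtime, ctime) before, (mtime, ctime) after) a chmod, in ns."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"payload\n")
        os.close(fd)
        st = os.stat(path)
        before = (st.st_mtime_ns, st.st_ctime_ns)
        time.sleep(0.05)        # give the filesystem clock time to advance
        os.chmod(path, 0o644)   # touches the inode, not the contents
        st = os.stat(path)
        after = (st.st_mtime_ns, st.st_ctime_ns)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after = times_around_chmod()
    print("mtime changed:", before[0] != after[0])
    print("ctime changed:", before[1] != after[1])
```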
Without further ado, here is the code:
#!/usr/bin/env python
# Python 2 script: full and differential backups based on mtime/ctime.
import time, os, sys, cPickle

fileinfo = {}   # path -> (mtime, ctime) for the current scan

def logger(timestamp, filename, status, filenum):
    # Append one line per backup run to backup.log.
    f = open('backup.log', 'a')
    f.write("%s\t%s\t%s\t\t%s\n" % (timestamp, filename, status, filenum))
    f.close()

def tar(sdir, ddir, filenum):
    # Archive sdir into ddir.tar.gz and log whether tar succeeded.
    command = "tar zcf %s %s >/dev/null 2>&1" % (ddir + ".tar.gz", sdir)
    if os.system(command) == 0:
        logger(time.strftime('%F %X'), ddir + ".tar.gz", 'success', filenum)
    else:
        logger(time.strftime('%F %X'), ddir + ".tar.gz", 'failed', filenum)

def fullbak(path):
    # Record the timestamps of every file, then archive the whole directory.
    filenum = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            file = os.path.join(root, name)
            mtime = os.path.getmtime(file)
            ctime = os.path.getctime(file)
            fileinfo[file] = (mtime, ctime)
            filenum += 1
    f = open(p, 'w')
    cPickle.dump(fileinfo, f)
    f.close()
    tar(s, d, filenum)

def diffbak(path):
    # Scan the tree, compare with the saved snapshot, archive what differs.
    for root, dirs, files in os.walk(path):
        for name in files:
            file = os.path.join(root, name)
            mtime = os.path.getmtime(file)
            ctime = os.path.getctime(file)
            fileinfo[file] = (mtime, ctime)
    if not os.path.isfile(p):
        f = open(p, 'w')
        f.close()
    if os.stat(p).st_size == 0:
        # No previous snapshot: fall back to a full backup.
        f = open(p, 'w')
        cPickle.dump(fileinfo, f)
        filenum = len(fileinfo.keys())
        f.close()
        print filenum
        tar(s, d, filenum)
    else:
        f = open(p)
        old_fileinfo = cPickle.load(f)
        f.close()
        # Entries that exist in only one snapshot or whose timestamps differ.
        difference = dict(set(fileinfo.items()) ^ set(old_fileinfo.items()))
        filenum = len(difference)
        print filenum
        difference_file = ' '.join(difference.keys())
        print difference_file
        tar(difference_file, d, filenum)
        f = open(p, 'w')
        cPickle.dump(fileinfo, f)
        f.close()

def usage():
    print '''
Syntax: python file_backup.py pickle_file model source_dir filename_bk
model: 1: full backup  2: differential backup
Example: python file_backup.py fileinfo.pk 2 /etc etc_$(date +%F)
Note: the '.tar.gz' suffix is added automatically
'''
    sys.exit()

if len(sys.argv) != 5:
    usage()
p = sys.argv[1]
m = int(sys.argv[2])
s = sys.argv[3]
d = sys.argv[4]
if m == 1:
    fullbak(s)
elif m == 2:
    diffbak(s)
else:
    print "\033[;31mDoes not support this mode\033[0m"
    usage()
Test:
$ python file_backup.py data.pk 1 data data_$(date +%F)    # full backup
$ > data/www.jb51.net    # test: create a file, then modify a file's permissions
$ chmod 777 data/py/eshop_bk/data.db
$ python file_backup.py data.pk 2 data data_$(date +%F)_1    # back up the changed files
2
data/py/eshop_bk/data.db data/www.jb51.net
I read the blogger's code and found it very enlightening, but there is one problem. If I complete a full backup, delete one of the files, and then run a differential backup, the deleted file is detected, but tar fails because that file no longer exists. So before running tar, it is best to use os.path.exists() to check whether each path in the difference still exists; if it does not, leave it out of the tar step and report that the file was deleted.
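One way to act on this suggestion is sketched below in Python 3. It assumes the same snapshot format as the script (a dict mapping each path to its `(mtime, ctime)` tuple); the function name `split_changes` and its `exists` parameter are hypothetical. Paths that appear in the difference but are no longer on disk are reported as deleted rather than handed to tar.

```python
import os


def split_changes(old_info, new_info, exists=os.path.exists):
    """Split differing paths into those to archive and those deleted.

    old_info / new_info map path -> (mtime, ctime), as in the script.
    `exists` is injectable so the logic can be tried without real files.
    """
    # Same symmetric difference as the script: entries present in only one
    # snapshot, or present in both with different timestamps.
    changed = {path for path, _ in set(new_info.items()) ^ set(old_info.items())}
    to_archive = sorted(p for p in changed if exists(p))
    deleted = sorted(p for p in changed if not exists(p))
    return to_archive, deleted


if __name__ == "__main__":
    old = {"data/a.txt": (1.0, 1.0), "data/b.txt": (2.0, 2.0)}
    new = {"data/a.txt": (1.0, 5.0), "data/c.txt": (3.0, 3.0)}
    on_disk = set(new)  # pretend only the files in the new snapshot exist
    to_archive, deleted = split_changes(old, new, exists=on_disk.__contains__)
    print(to_archive)  # ['data/a.txt', 'data/c.txt']
    print(deleted)     # ['data/b.txt']
```

The `to_archive` list can then be handed to tar, and the `deleted` list written to the log. Passing the paths as separate arguments, rather than as one space-joined string, would also keep file names containing spaces intact.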