Goal:
1. Take three parameters: the source directory path, the destination directory path, and the MD5 file path.
2. Perform a full backup every Monday and an incremental backup on every other day.
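The Monday-versus-other-days decision above can be sketched as a tiny dispatch helper (Python 3; `backup_mode` is a hypothetical name used only for illustration, not part of the final script):

```python
import time

def backup_mode(weekday=None):
    """Decide the backup mode: 'full' on Monday, 'incr' otherwise.

    `weekday` defaults to today's abbreviated weekday name, the same
    value time.strftime('%a') returns in the final script.
    """
    day = weekday if weekday is not None else time.strftime('%a')
    return 'full' if day == 'Mon' else 'incr'
```

Passing the weekday explicitly makes the decision testable without waiting for a Monday.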
1. Recursively get all directories and files under the given path
Method one: use os.listdir
The code is as follows:
#!/usr/bin/env python
# coding: utf8
import os
import sys

def lsdir(folder):
    contents = os.listdir(folder)
    print "%s\n%s\n" % (folder, contents)
    for path in contents:
        full_path = os.path.join(folder, path)
        if os.path.isdir(full_path):
            lsdir(full_path)

if __name__ == "__main__":
    lsdir(sys.argv[1])
Run the code; the effect is as follows:
[[email protected] python]# python listdir.py /a
/a
['b', 'a.txt']

/a/b
['c', 'b.txt']

/a/b/c
['c.txt']
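For comparison, the same recursion can be written in Python 3 as a function that returns its results instead of printing them. This is a sketch, not part of the original script; here `lsdir` returns a dict mapping each directory to its entries:

```python
import os

def lsdir(folder):
    """Recursively map each directory under `folder` to its entries (Python 3)."""
    result = {folder: os.listdir(folder)}
    for name in result[folder]:
        full_path = os.path.join(folder, name)
        if os.path.isdir(full_path):
            # descend into subdirectories, merging their listings
            result.update(lsdir(full_path))
    return result
```

Returning data rather than printing it makes the traversal easy to reuse and test.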
Method two: use os.walk
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys

def lsdir(folder):
    contents = os.walk(folder)
    for path, folders, files in contents:
        print "%s\n%s\n" % (path, folders + files)

if __name__ == "__main__":
    lsdir(sys.argv[1])
Run the code and test the effect:
[[email protected] python]# python listdir1.py /a
/a
['b', 'a.txt']

/a/b
['c', 'b.txt']

/a/b/c
['c.txt']
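The os.walk traversal above can likewise be captured as data rather than printed; a minimal Python 3 sketch (`walk_listing` is an illustrative name, not from the original script):

```python
import os

def walk_listing(folder):
    """Collect (path, folders + files) pairs, the same data the script prints.

    os.walk yields a (path, subdirs, files) triple for every directory,
    top-down, so no explicit recursion is needed.
    """
    return [(path, folders + files) for path, folders, files in os.walk(folder)]
```

This is why os.walk is usually preferred over hand-written recursion: one loop visits every directory.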
2. Calculate the MD5 value of a file (read 4 KB at a time until the whole file has been read, then return the hexadecimal MD5 digest)
The code is as follows:
[[email protected] python]# cat md5.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import hashlib
import sys

def md5(fname):
    m = hashlib.md5()
    # open in binary mode so the digest matches md5sum for any file
    with open(fname, 'rb') as fobj:
        while True:
            data = fobj.read(4096)
            if not data:
                break
            m.update(data)
    return m.hexdigest()

if __name__ == "__main__":
    print md5(sys.argv[1])
Run the code and test the effect:
[[email protected] python]# python md5.py a.txt
c33da92372e700f98b006dfa5325cf0d
[[email protected] python]# md5sum a.txt
c33da92372e700f98b006dfa5325cf0d  a.txt
* Hint: use the Linux md5sum command to cross-check the MD5 value calculated by your own Python script.
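In Python 3 the same chunked read is often written with iter() and a sentinel instead of an explicit while loop; a sketch (again opening in binary mode so the digest matches md5sum):

```python
import hashlib

def md5(fname, chunk_size=4096):
    """Return the hex MD5 digest of a file, reading `chunk_size` bytes at a time."""
    m = hashlib.md5()
    with open(fname, 'rb') as fobj:
        # iter() calls the lambda until it returns the sentinel b'' (end of file)
        for chunk in iter(lambda: fobj.read(chunk_size), b''):
            m.update(chunk)
    return m.hexdigest()
```

Reading in fixed-size chunks keeps memory usage constant even for very large files.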
3. Write the full and incremental backup script
The code is as follows:
#!/usr/bin/env python
# coding: utf8
import time
import os
import tarfile
import cPickle as p
import hashlib

def md5check(fname):
    m = hashlib.md5()
    # binary mode so digests are byte-accurate for any file type
    with open(fname, 'rb') as fobj:
        while True:
            data = fobj.read(4096)
            if not data:
                break
            m.update(data)
    return m.hexdigest()

def full_backup(src_dir, dst_dir, md5file):
    par_dir, base_dir = os.path.split(src_dir.rstrip('/'))
    back_name = '%s_full_%s.tar.gz' % (base_dir, time.strftime('%y%m%d'))
    full_name = os.path.join(dst_dir, back_name)
    md5dict = {}
    # archive the whole source directory
    tar = tarfile.open(full_name, 'w:gz')
    tar.add(src_dir)
    tar.close()
    # record the MD5 of every file for later incremental comparison
    for path, folders, files in os.walk(src_dir):
        for fname in files:
            full_path = os.path.join(path, fname)
            md5dict[full_path] = md5check(full_path)
    with open(md5file, 'w') as fobj:
        p.dump(md5dict, fobj)

def incr_backup(src_dir, dst_dir, md5file):
    par_dir, base_dir = os.path.split(src_dir.rstrip('/'))
    back_name = '%s_incr_%s.tar.gz' % (base_dir, time.strftime('%y%m%d'))
    full_name = os.path.join(dst_dir, back_name)
    md5new = {}
    for path, folders, files in os.walk(src_dir):
        for fname in files:
            full_path = os.path.join(path, fname)
            md5new[full_path] = md5check(full_path)
    # load the previous snapshot, then overwrite it with the current one
    with open(md5file) as fobj:
        md5old = p.load(fobj)
    with open(md5file, 'w') as fobj:
        p.dump(md5new, fobj)
    # archive only files that are new or whose MD5 has changed
    tar = tarfile.open(full_name, 'w:gz')
    for key in md5new:
        if md5old.get(key) != md5new[key]:
            tar.add(key)
    tar.close()

if __name__ == '__main__':
    src_dir = '/users/xkops/gxb/'
    dst_dir = '/tmp/'
    md5file = '/users/xkops/md5.data'
    if time.strftime('%a') == 'Mon':
        full_backup(src_dir, dst_dir, md5file)
    else:
        incr_backup(src_dir, dst_dir, md5file)
Run the code and test the effect (before executing, change src_dir, dst_dir, and md5file to the files and paths you want to back up), then check whether the day's backup file was generated under /tmp.
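The heart of the incremental step is the dictionary comparison: any path whose current MD5 differs from, or is missing in, the saved snapshot gets added to the archive. That comparison can be isolated and tested on its own; `changed_files` is a hypothetical helper name, a Python 3 sketch rather than part of the script above:

```python
def changed_files(md5old, md5new):
    """Return paths whose MD5 is new or different since the last snapshot.

    md5old.get(path) is None for files not seen before, so new files
    always compare unequal and are included.
    """
    return [path for path, digest in md5new.items()
            if md5old.get(path) != digest]
```

Note that files deleted since the last snapshot are simply dropped from the new MD5 dictionary; this scheme never records deletions.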
Python: implementing full and incremental backup of directory files