Full and incremental backup of a directory in Python

Source: Internet
Author: User

Goals:

1. Accept three parameters: the source path, the destination path, and the MD5 file.

2. Run a full backup every Monday and an incremental backup on all other days.
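The two goals can be sketched as a pair of small helpers. This is only an illustration under assumptions: the names `choose_mode` and `parse_args` are hypothetical and are not part of the scripts in this article, which hard-code their paths in the final step.

```python
import datetime


def choose_mode(today=None):
    """Return 'full' on Mondays and 'incr' on every other day."""
    today = today or datetime.date.today()
    # date.weekday() returns 0 for Monday and 6 for Sunday.
    return 'full' if today.weekday() == 0 else 'incr'


def parse_args(argv):
    """Split the three expected arguments: source, destination, MD5 file."""
    if len(argv) != 4:
        raise SystemExit('usage: %s SRC_DIR DST_DIR MD5_FILE' % argv[0])
    src_dir, dst_dir, md5file = argv[1:]
    return src_dir, dst_dir, md5file
```

A script could call `parse_args(sys.argv)` at startup and then dispatch on the value returned by `choose_mode()`.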

1. Recursively list all directories and files under the path that is passed in

Method one: use os.listdir

The code is as follows:

#!/usr/bin/env python
# coding: utf8
import os
import sys


def lsdir(folder):
    # List the entries directly under folder, then recurse into subdirectories.
    contents = os.listdir(folder)
    print("%s\n%s\n" % (folder, contents))
    for path in contents:
        full_path = os.path.join(folder, path)
        if os.path.isdir(full_path):
            lsdir(full_path)


if __name__ == "__main__":
    lsdir(sys.argv[1])

Running the code produces the following output:

# python listdir.py /a
/a
['b', 'a.txt']

/a/b
['c', 'b.txt']

/a/b/c
['c.txt']

Method two: use os.walk

The code is as follows:

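#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys


def lsdir(folder):
    # os.walk yields (path, subdirectories, files) for every directory below folder.
    contents = os.walk(folder)
    for path, folders, files in contents:
        print("%s\n%s\n" % (path, folders + files))


if __name__ == "__main__":
    lsdir(sys.argv[1])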

Running the code produces the following output:

# python listdir1.py /a
/a
['b', 'a.txt']

/a/b
['c', 'b.txt']

/a/b/c
['c.txt']
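The backup script later in this article needs the full path of every file rather than a printed listing. A minimal sketch of that variation, assuming a hypothetical generator named `iter_files` that is not part of the original scripts:

```python
import os


def iter_files(folder):
    """Yield the full path of every file found below folder."""
    for path, folders, files in os.walk(folder):
        for name in files:
            # Join the directory being visited with the bare file name.
            yield os.path.join(path, name)
```

Because this is a generator, the paths are produced one directory at a time as os.walk descends, so a caller can hash or archive each file without first building the whole list.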

2. Calculate the MD5 value of a file (read 4 KB at a time until the whole file has been read, then return the MD5 digest as a hexadecimal string)

The code is as follows:

# cat md5.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import hashlib
import sys


def md5(fname):
    m = hashlib.md5()
    # Open in binary mode so the digest matches md5sum for any file type.
    with open(fname, 'rb') as fobj:
        while True:
            data = fobj.read(4096)
            if not data:
                break
            m.update(data)
    return m.hexdigest()


if __name__ == "__main__":
    print(md5(sys.argv[1]))

Running the code produces the following output:

# python md5.py a.txt
c33da92372e700f98b006dfa5325cf0d
# md5sum a.txt
c33da92372e700f98b006dfa5325cf0d  a.txt

* Hint: use the Linux md5sum command to check the MD5 values calculated by your own Python code.
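Where md5sum is not available, the chunked digest can also be cross-checked inside Python against a digest computed over the whole file in one call. The sketch below assumes two hypothetical helpers, `md5_chunked` and `md5_whole`; only `hashlib.md5`, `update`, and `hexdigest` come from the standard library.

```python
import hashlib


def md5_chunked(fname, chunk_size=4096):
    """Hash the file chunk_size bytes at a time, as in md5.py above."""
    m = hashlib.md5()
    with open(fname, 'rb') as fobj:
        # iter() with a sentinel keeps calling read() until it returns b''.
        for data in iter(lambda: fobj.read(chunk_size), b''):
            m.update(data)
    return m.hexdigest()


def md5_whole(fname):
    """Hash the file in one call; simple, but loads it fully into memory."""
    with open(fname, 'rb') as fobj:
        return hashlib.md5(fobj.read()).hexdigest()
```

Both functions should return the same string for any file, since feeding data to `update` in pieces is equivalent to hashing the concatenated bytes. The chunked form is the one to prefer for large files.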

3. Write the full and incremental backup script

The code is as follows:

#!/usr/bin/env python
# coding: utf8
import hashlib
import os
import pickle as p
import tarfile
import time


def md5check(fname):
    m = hashlib.md5()
    with open(fname, 'rb') as fobj:
        while True:
            data = fobj.read(4096)
            if not data:
                break
            m.update(data)
    return m.hexdigest()


def full_backup(src_dir, dst_dir, md5file):
    par_dir, base_dir = os.path.split(src_dir.rstrip('/'))
    back_name = '%s_full_%s.tar.gz' % (base_dir, time.strftime('%y%m%d'))
    full_name = os.path.join(dst_dir, back_name)
    md5dict = {}

    # Archive the whole source directory.
    tar = tarfile.open(full_name, 'w:gz')
    tar.add(src_dir)
    tar.close()

    # Record the MD5 of every file as the baseline for later incremental runs.
    for path, folders, files in os.walk(src_dir):
        for fname in files:
            full_path = os.path.join(path, fname)
            md5dict[full_path] = md5check(full_path)

    with open(md5file, 'wb') as fobj:
        p.dump(md5dict, fobj)


def incr_backup(src_dir, dst_dir, md5file):
    par_dir, base_dir = os.path.split(src_dir.rstrip('/'))
    back_name = '%s_incr_%s.tar.gz' % (base_dir, time.strftime('%y%m%d'))
    full_name = os.path.join(dst_dir, back_name)
    md5new = {}

    # Compute the current MD5 of every file.
    for path, folders, files in os.walk(src_dir):
        for fname in files:
            full_path = os.path.join(path, fname)
            md5new[full_path] = md5check(full_path)

    # Load the previous digests, then store the new ones for the next run.
    with open(md5file, 'rb') as fobj:
        md5old = p.load(fobj)
    with open(md5file, 'wb') as fobj:
        p.dump(md5new, fobj)

    # Archive only the files that are new or whose MD5 has changed.
    tar = tarfile.open(full_name, 'w:gz')
    for key in md5new:
        if md5old.get(key) != md5new[key]:
            tar.add(key)
    tar.close()


if __name__ == '__main__':
    src_dir = '/users/xkops/gxb/'
    dst_dir = '/tmp/'
    md5file = '/users/xkops/md5.data'
    if time.strftime('%a') == 'Mon':
        full_backup(src_dir, dst_dir, md5file)
    else:
        incr_backup(src_dir, dst_dir, md5file)

Before running the script, change the source directory, destination directory, and MD5 file path to match your system. After it runs, check that the day's backup archive was created under /tmp.
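The incremental step selects every file whose stored digest differs from its current one, which also covers files that did not exist at the previous run. The comparison can be sketched in isolation as follows. The function name `diff_manifests` is hypothetical, and reporting deleted files is an extra shown only for illustration: the script above does not record deletions.

```python
def diff_manifests(md5old, md5new):
    """Compare two {path: md5} dictionaries.

    Returns (changed, deleted): paths that are new or modified, and paths
    that were present in the old manifest but are missing from the new one.
    """
    # dict.get returns None for unknown paths, so new files count as changed.
    changed = sorted(k for k in md5new if md5old.get(k) != md5new[k])
    deleted = sorted(k for k in md5old if k not in md5new)
    return changed, deleted


old = {'/a/a.txt': '111', '/a/b/b.txt': '222', '/a/b/c/c.txt': '333'}
new = {'/a/a.txt': '111', '/a/b/b.txt': '999', '/a/new.txt': '444'}
changed, deleted = diff_manifests(old, new)
print(changed)  # ['/a/b/b.txt', '/a/new.txt']
print(deleted)  # ['/a/b/c/c.txt']
```

Note that an incremental run also needs the MD5 file to exist already, so a full backup has to be taken at least once before the first incremental one.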
