The server produces a full backup once a week, about 100 GB in size, so old backups need to be cleaned up regularly.
During working hours site traffic is heavy and server I/O is high, so deleting a large amount of data at that time would hurt the server. I therefore want to automate the cleanup with a scheduled task.
Under my backup directory /bakcup, each backup is stored in a directory named after its date:
2013-12-23 2014-01-06 2014-01-20 2014-02-03
2013-12-30 2014-01-13 2014-01-27 2014-02-10
To delete some backups while keeping others, you can use the find command. For example, the backups are seven days apart and I want to keep those made in the last four weeks:
# find /bakcup/ -maxdepth 1 -type d -mtime +28
/bakcup/2014-01-06
/bakcup/2014-01-13
/bakcup/2013-12-23
/bakcup/2013-12-30
-maxdepth 1: limit the search depth to 1, so find looks only in the /bakcup directory itself; without this parameter, the contents of its subdirectories are listed as well
-type d: match directories only
-mtime +28: match directories last modified more than 28 days ago
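The effect of the three options can be tried safely in a throwaway directory before touching the real backups. The following is a minimal sketch, assuming GNU find and GNU touch (for the `-d '40 days ago'` syntax); the sandbox path and directory names are made up for illustration.

```shell
#!/bin/sh
# Throwaway sandbox; nothing under the real /bakcup is touched.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/2013-12-23/data" "$sandbox/2014-02-10"

# Backdate one backup directory and its subdirectory (GNU touch syntax).
touch -d '40 days ago' "$sandbox/2013-12-23/data" "$sandbox/2013-12-23"

# -maxdepth 1 keeps find from descending into 2013-12-23/data,
# -type d matches directories only,
# -mtime +28 matches only entries modified more than 28 days ago.
old=$(find "$sandbox" -maxdepth 1 -type d -mtime +28)
echo "$old"
```

Only the backdated top-level directory should be printed: the nested data directory is too deep, and the freshly created 2014-02-10 is too new.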
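Use the -exec parameter to attach the delete command, which is run on each directory that find returns:
rsync --delete-before -d /data/test/ {} \;
Here /data/test/ should be an empty directory: rsync makes each matched directory identical to it, which removes everything the matched directory contains.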
So the whole command is:
# find /bakcup/ -maxdepth 1 -type d -mtime +28 -exec rsync --delete-before -d /data/test/ {} \;
Finally, you can put the command into a script and have crontab run it automatically.
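A crontab entry for this might look like the fragment below. It is only a sketch: the script path /usr/local/bin/clean_backups.sh and the log file are hypothetical names, and the schedule (03:30 every Monday) is just one example of an off-peak time.

```crontab
# min  hour  day-of-month  month  day-of-week  command
30     3     *             *      1            /usr/local/bin/clean_backups.sh >> /var/log/clean_backups.log 2>&1
```

Redirecting both output streams to a log file keeps a record of what each run did, which helps when checking later that only the intended directories were cleaned.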
Reminder:
Before using the command, first run only the find part on the server, and continue only if it lists exactly the directories you want to clean up.
On some systems find may also return the ./ directory itself, so check the output carefully to prevent accidents.
In addition, -exec can be replaced with -ok; the effect is the same, except that the user is asked to confirm before each deletion.
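One way to act on these reminders is sketched below, assuming GNU find and GNU touch. It adds `-mindepth 1`, which is not part of the original command, so that the starting directory can never be matched, and it puts `echo` in front of rsync so that the commands are printed rather than executed. The sandbox stands in for /bakcup.

```shell
#!/bin/sh
# Sandbox standing in for /bakcup; the base directory itself is backdated
# so that it would satisfy -mtime +28 too.
base=$(mktemp -d)
mkdir "$base/2013-12-23" "$base/2014-02-10"
touch -d '40 days ago' "$base/2013-12-23" "$base"

# Without -mindepth 1, the starting directory is a candidate as well.
with_base=$(find "$base" -maxdepth 1 -type d -mtime +28 | sort)

# With -mindepth 1, only entries below the starting directory are tested.
# Prefixing the command with echo prints what would run instead of running it.
preview=$(find "$base" -mindepth 1 -maxdepth 1 -type d -mtime +28 \
    -exec echo rsync --delete-before -d /data/test/ {} \;)

printf '%s\n' "$with_base" "---" "$preview"
```

The first listing includes the sandbox root, which is exactly the accident to avoid; the preview contains a single rsync command for the old backup directory only. Once the preview looks right, removing `echo` turns it into the real command.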
PS: Efficiency comparison of the rm command and the rsync command
rm
The rm command makes a large number of lstat64 and unlink calls, so presumably an lstat is performed on the file system before each file is deleted.
The number of lstat64 calls is lower than the total number of files; there is a separate reason for this, which will be described in another article.
The getdirentries64 call is the more critical one.
Procedure: in the first phase, before the actual deletion, rm calls getdirentries64 to read the directory in batches (approximately 4K each time) and builds its file list in memory; in the second phase, lstat64 determines the status of every file; in the third phase, the actual deletion is performed through unlink. All three phases involve a large number of system calls and file system operations.
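The three phases can be mimicked by hand in shell to make the per-file work visible. This is an illustration of the sequence only, not how rm is actually implemented; the scratch directory and the file names a, b and c are invented for the example.

```shell
#!/bin/sh
# Scratch directory with three files to delete.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# Phase 1: read the directory and hold the list of names in memory.
# (Word splitting is safe here only because the names contain no spaces.)
entries=$(ls -A "$dir")

removed=0
for name in $entries; do
    # Phase 2: check the status of the entry (a stat-family call).
    if [ -f "$dir/$name" ]; then
        # Phase 3: remove it, one unlink call per file.
        unlink "$dir/$name"
        removed=$((removed + 1))
    fi
done
echo "removed $removed files, one status check and one unlink each"
```

With a directory holding millions of files, the status check and the unlink are each repeated once per file, which is where the large call counts described above come from.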
rsync
rsync makes very few system calls by comparison.
It does not perform lstat and unlink operations on individual files.
Before the command runs, rsync sets up shared memory and loads the directory information with mmap.
It only synchronizes directories and does not need to unlink individual files.
In addition, according to other people's evaluations, rm causes more context switches, which results in higher system CPU consumption. For file system operations, simply increasing concurrency does not always improve speed.