The server produces a full backup once a week, about 100 GB in size, so old backups need to be cleaned up regularly.
During working hours the site gets heavy traffic and server I/O is already high, so deleting a large amount of data at that time would hurt the server. I therefore want to automate the cleanup with a scheduled task.
Under my backup directory /bakcup, each backup is stored in a directory named after its date:
The code is as follows:
# ls
2013-12-23 2014-01-06 2014-01-20 2014-02-03
2013-12-30 2014-01-13 2014-01-27 2014-02-10
To delete some of the backups while keeping the rest, you can use the find command. For example, backups are taken every seven days and I want to keep those from the last four weeks:
The code is as follows:
# find /bakcup/ -maxdepth 1 -type d -mtime +28
/bakcup/2014-01-06
/bakcup/2014-01-13
/bakcup/2013-12-23
/bakcup/2013-12-30
-maxdepth 1: limit the search depth to 1, so that only entries directly under /bakcup are examined; without this parameter, files and directories in the subdirectories are listed as well
-type d: match directories only
-mtime +28: match directories last modified more than 28 days ago
Use the -exec parameter to run a delete command on each directory that find matches:
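To see how the three options interact without touching real backups, the following sketch runs the same find expression against a throwaway tree. The temporary directory, the entry names, and the use of GNU `touch -d` to backdate entries are assumptions made for illustration.

```shell
#!/bin/sh
# Build a scratch tree that mimics the backup layout (names are illustrative).
demo=$(mktemp -d)
mkdir "$demo/old-backup" "$demo/new-backup" "$demo/old-backup/nested"
touch "$demo/old-file"

# Backdate some entries to 40 days ago (GNU touch syntax; an assumption).
touch -d '40 days ago' "$demo/old-backup/nested" "$demo/old-backup" "$demo/old-file"

# Same expression as in the article, pointed at the scratch tree:
#   nested      is skipped by -maxdepth 1
#   old-file    is skipped by -type d
#   new-backup  is skipped by -mtime +28
expired=$(find "$demo" -maxdepth 1 -type d -mtime +28)
echo "$expired"

rm -rf "$demo"
```

Only the old-backup directory should be printed, since it is the one entry that passes all three tests.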
The code is as follows:
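rsync --delete-before -d /data/test/ {} \;
Here /data/test/ must be an empty directory: synchronizing it onto each matched directory deletes that directory's contents.
So the whole command is: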
The code is as follows:
# find /bakcup/ -maxdepth 1 -type d -mtime +28 -exec rsync --delete-before -d /data/test/ {} \;
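The rsync part of the command works by synchronizing an empty source directory (presumably /data/test/ on this server) onto the target with deletion enabled. Below is a minimal sketch of that idea, run only against scratch directories. The function name `empty_with_rsync` is hypothetical, and `-a --delete` is used here instead of the article's `--delete-before -d` because it is the more commonly seen spelling of the same trick.

```shell
#!/bin/sh
# empty_with_rsync DIR: hypothetical helper that empties DIR by syncing an
# empty directory onto it, then removes the now-empty DIR itself.
empty_with_rsync() {
    target=$1
    empty=$(mktemp -d) || return 1
    # Trailing slashes matter: sync the *contents* of $empty into $target.
    rsync -a --delete "$empty/" "$target/"
    status=$?
    rmdir "$empty"
    # rsync leaves the emptied directory behind; rmdir succeeds only if it is empty.
    [ "$status" -eq 0 ] && rmdir "$target"
}

# Demonstration on a scratch directory (skipped if rsync is not installed).
if command -v rsync >/dev/null 2>&1; then
    scratch=$(mktemp -d)
    mkdir -p "$scratch/2013-12-23/sub"
    touch "$scratch/2013-12-23/a" "$scratch/2013-12-23/sub/b"
    empty_with_rsync "$scratch/2013-12-23"
    # Removing the parent works only if the backup directory is really gone.
    rmdir "$scratch" && echo "scratch tree removed"
fi
```

Note that rsync only empties the directory; the final rmdir in the helper is an addition for removing the leftover empty directory.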
Finally, you can put the command into a script and have crontab run it automatically.
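As a rough illustration, a crontab entry along the following lines would run the cleanup early on Monday morning, outside working hours. The script path /usr/local/sbin/clean_backups.sh, the log path, and the 03:30 schedule are all assumptions, not taken from the original setup.

```text
# m   h   dom mon dow   command
30    3   *   *   1     /usr/local/sbin/clean_backups.sh >> /var/log/clean_backups.log 2>&1
```

Redirecting output to a log file keeps a record of what each run did.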
Reminder:
Before using the full command, run only the find part on the server first, and continue only if it lists exactly the directories you want to clean up.
On some systems find may also match the starting directory itself (such as ./), so check the output carefully to prevent accidents.
In addition, -exec can be replaced with -ok. The effect is the same, except that find asks the user to confirm before each deletion.
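The warning about the starting directory can be reproduced on a scratch tree. The sketch below compares the original expression with a variant that adds `-mindepth 1`, an option that is not part of the original command but tells find to skip the starting directory. The temporary directory and GNU `touch -d` syntax are assumptions for illustration.

```shell
#!/bin/sh
# Scratch tree whose top-level directory is itself older than 28 days.
root=$(mktemp -d)
mkdir "$root/2013-12-23"
touch -d '40 days ago' "$root/2013-12-23" "$root"   # GNU touch syntax (assumption)

# Without -mindepth, the starting directory is a candidate too.
without_guard=$(find "$root" -maxdepth 1 -type d -mtime +28 | sort)
# With -mindepth 1, only entries below the starting directory are considered.
with_guard=$(find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +28)

printf 'without -mindepth:\n%s\n' "$without_guard"
printf 'with -mindepth 1:\n%s\n' "$with_guard"

rm -rf "$root"
```

The first listing includes the scratch root itself, while the second contains only the dated subdirectory.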
PS: Comparison of efficiency between the rm command and the rsync command
rm
The rm command calls lstat64 and unlink in large numbers, so presumably an lstat operation is performed on each file before it is deleted.
The number of lstat64 calls is lower than the total number of files. There are other reasons for this, which will be explained in another article.
The getdirentries64 call is the more critical one.
Process: in the first stage, before the actual deletion, rm uses getdirentries64 calls to read the directory in batches (about 4K each time) and builds its file list in memory. In the second stage, lstat64 determines the status of every file. In the third stage, unlink performs the actual deletion. All three stages involve many system calls and file system operations.
rsync
rsync makes very few system calls.
According to this analysis, it does not perform lstat and unlink operations for each individual file.
Early in execution, rsync sets up a shared memory region and loads the directory information through mmap.
It only synchronizes directories and does not need to unlink individual files.
In addition, according to other people's evaluations, rm causes more context switches, which increases system CPU consumption. For file system operations, simply increasing concurrency does not always improve speed.