Linux tips: the fastest way to delete 1 million files at a time

Source: Internet
Author: User

Initial Evaluation

Yesterday, I saw a very interesting method for deleting a massive number of files in a directory. The method comes from Zhenyu Lee at http://www.quora.com/how-can-someone-rapidly-delete-400-000-files.

He does not use find or xargs. Instead, he makes creative use of rsync: he runs rsync with --delete to replace the contents of the target folder with those of an empty folder. I then ran an experiment to compare various methods. To my surprise, Lee's approach was much faster than the others. The following is my evaluation.

Environment:

  • CPU: Intel (R) Core (TM) 2 Duo CPU E8400 @ 3.00 GHz
  • MEM: 4G
  • HD: ST3250318AS: 250G/7200 RPM

With --delete and --exclude, you can choose which files are deleted according to a pattern. Another point is that this method empties the directory without removing it, so it is well suited to cases where you need to keep the directory itself for other purposes.

Reevaluate

A few days ago, Keith Winstein replied to this post on Quora, saying that my previous evaluation could not be reproduced because the operations took far too long. I explained that the timings were probably so large because my computer had been used heavily over the past few years and the file system may have contained some errors during the evaluation, although I am not sure that is the reason. I now have a new computer, so I ran the evaluation again. This time I used /usr/bin/time to provide more detailed information. The following is the new result.

(Each run deletes 1,000,000 files. The size of each file is 0.)
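The post does not show how the test files were created. One possible way to build such a set of zero-byte files is sketched below; the variable N, the default of 1000, and the use of a temporary directory are assumptions for illustration (set N=1000000 to match the scale of the evaluation).

```shell
#!/bin/sh
# Sketch: create N zero-byte files in a scratch directory.
set -eu

n=${N:-1000}                      # number of files (hypothetical default)
dir=$(mktemp -d)

# seq prints the file names 1..n; xargs batches them into as few
# touch invocations as the argument-length limit allows.
( cd "$dir" && seq 1 "$n" | xargs touch )

count=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "created $count files in $dir"
```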

Raw output

# method 1
~/test $ /usr/bin/time -v rsync -a --delete empty/ a/
        Command being timed: "rsync -a --delete empty/ a/"
        User time (seconds): 1.31
        System time (seconds): 10.60
        Percent of CPU this job got: 95%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.42
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 24378
        Voluntary context switches: 106
        Involuntary context switches: 22
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

# method 2
        Command being timed: "find b/ -type f -delete"
        User time (seconds): 0.41
        System time (seconds): 14.46
        Percent of CPU this job got: 52%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:28.51
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 11749
        Voluntary context switches: 14849
        Involuntary context switches: 11
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

# method 3
find c/ -type f | xargs -L 100 rm
~/test $ /usr/bin/time -v ./delete.sh
        Command being timed: "./delete.sh"
        User time (seconds): 2.06
        System time (seconds): 20.60
        Percent of CPU this job got: 54%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:41.69
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 1764225
        Voluntary context switches: 37048
        Involuntary context switches: 15074
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

# method 4
find d/ -type f | xargs -L 100 -P 100 rm
~/test $ /usr/bin/time -v ./delete.sh
        Command being timed: "./delete.sh"
        User time (seconds): 2.86
        System time (seconds): 27.82
        Percent of CPU this job got: 89%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.32
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 1764278
        Voluntary context switches: 929897
        Involuntary context switches: 21720
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

# method 5
~/test $ /usr/bin/time -v rm -rf f
        Command being timed: "rm -rf f"
        User time (seconds): 0.20
        System time (seconds): 14.80
        Percent of CPU this job got: 47%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.29
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 176
        Voluntary context switches: 15134
        Involuntary context switches: 11
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
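To make the raw output easier to compare, the elapsed wall-clock times can be converted to seconds and expressed relative to the rsync run. The sketch below does this with awk; the times are copied from the raw output above, while the short labels and the comma-separated layout are introduced here only for the example.

```shell
#!/bin/sh
# Sketch: convert m:ss elapsed times to seconds and compare them to rsync.
set -eu

results=$(awk -F',' '
    {
        split($2, t, ":")              # t[1] = minutes, t[2] = seconds
        secs = t[1] * 60 + t[2]
        if (NR == 1) base = secs       # first row (rsync) is the baseline
        printf "%-30s %6.2f s  %.2fx\n", $1, secs, secs / base
    }
' <<'EOF'
rsync -a --delete,0:12.42
find -delete,0:28.51
find | xargs -L 100 rm,0:41.69
find | xargs -L 100 -P 100 rm,0:34.32
rm -rf,0:31.29
EOF
)
echo "$results"
```

In this run, every other method took more than twice as long as rsync in wall-clock time.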

I'm really curious why Lee's approach is faster than the others, even faster than rm -rf. If anyone knows, please write it below. Thank you very much.

[A faster way to delete millions of files in a directory]
