This post compares three ways of traversing a folder tree in Perl:
1. Use File::Find;
2. Recursive traversal (the traversal function is lsr);
3. Traversal with a queue or stack (the traversal function is lsr_s).
1. Use File::Find
Code:
#!/usr/bin/perl -w
#
# File: find.pl
# Author: Lu Xiaojia
# License: GPL-2
use strict;
use warnings;
use File::Find;
my ($size, $dircnt, $filecnt) = (0, 0, 0);
sub process {
    my $file = $File::Find::name;
    # print $file, "\n";
    # find() chdir()s into each directory, so test $_ (the bare name), not $file
    if (-d $_) {
        $dircnt++;
    }
    else {
        $filecnt++;
        $size += -s $_;
    }
}
find(\&process, '.');
print "$filecnt files, $dircnt directory. $size bytes.\n";
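By default File::Find changes into each directory before calling the callback, so a relative $File::Find::name cannot be passed straight to file tests from inside it. One alternative is the no_chdir option, which keeps the working directory fixed. The sketch below is only an illustration: count_tree and the throwaway tree built with File::Temp are hypothetical names and test data, not part of the original scripts.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: count files, directories and bytes under $top.
sub count_tree {
    my $top = shift;
    my ($size, $dircnt, $filecnt) = (0, 0, 0);
    find({
        no_chdir => 1,    # stay in the starting directory
        wanted   => sub {
            my $file = $File::Find::name;    # path is usable as-is
            if (-d $file) { $dircnt++ }
            else          { $filecnt++; $size += (-s $file) || 0 }
        },
    }, $top);
    return ($filecnt, $dircnt, $size);
}

# Build a small throwaway tree so the result is predictable.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir: $!";
for my $path ("$top/a.txt", "$top/sub/b.txt") {
    open(my $fh, '>', $path) or die "open $path: $!";
    print {$fh} "hello";
    close($fh);
}
my ($files, $dirs, $bytes) = count_tree($top);
print "$files files, $dirs directory. $bytes bytes.\n";
# prints "2 files, 2 directory. 10 bytes."
```

The starting directory itself is counted as a directory here, which is how find() reports it.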
2. Recursive traversal: lsr
Code:
#!/usr/bin/perl -w
#
# File: lsr.pl
# Author: Lu Xiaojia
# License: GPL-2
use strict;
use warnings;
sub lsr {
    my $cwd = shift;
    local *DH;    # each recursion level gets its own directory handle
    if (!opendir(DH, $cwd)) {
        warn "cannot opendir $cwd: $! $^E";
        return;
    }
    foreach (readdir(DH)) {
        if ($_ eq '.' || $_ eq '..') {
            next;
        }
        my $file = $cwd . '/' . $_;
        # descend into real directories only, never through symlinks
        if (!-l $file && -d _) {
            $file .= '/';
            lsr($file);
        }
        process($file, $cwd);
    }
    closedir(DH);
}
my ($size, $dircnt, $filecnt) = (0, 0, 0);
sub process {
    my $file = shift;
    # print $file, "\n";
    # directories are marked with a trailing '/'
    if (substr($file, length($file) - 1, 1) eq '/') {
        $dircnt++;
    }
    else {
        $filecnt++;
        $size += -s $file;
    }
}
lsr('.');
print "$filecnt files, $dircnt directory. $size bytes.\n";
3. Stack traversal: lsr_s
Code:
#!/usr/bin/perl -w
#
# File: lsr_s.pl
# Author: Lu Xiaojia
# License: GPL-2
use strict;
use warnings;
sub lsr_s {
    my $cwd = shift;
    my @dirs = ($cwd . '/');    # stack of directories still to visit
    my ($dir, $file);
    while ($dir = pop(@dirs)) {
        local *DH;
        if (!opendir(DH, $dir)) {
            warn "cannot opendir $dir: $! $^E";
            next;
        }
        foreach (readdir(DH)) {
            if ($_ eq '.' || $_ eq '..') {
                next;
            }
            $file = $dir . $_;
            # queue real directories only, never symlinks
            if (!-l $file && -d _) {
                $file .= '/';
                push(@dirs, $file);
            }
            process($file, $dir);
        }
        closedir(DH);
    }
}
my ($size, $dircnt, $filecnt) = (0, 0, 0);
sub process {
    my $file = shift;
    # print $file, "\n";
    # directories are marked with a trailing '/'
    if (substr($file, length($file) - 1, 1) eq '/') {
        $dircnt++;
    }
    else {
        $filecnt++;
        $size += -s $file;
    }
}
lsr_s('.');
print "$filecnt files, $dircnt directory. $size bytes.\n";
Test results on my hard drive, /dev/hda6.
1: File::Find
Code:
26881 files, 1603 directory. 9052479946 bytes.
real    0m9.140s
user    0m3.124s
sys     0m5.811s
2: lsr
Code:
26881 files, 1603 directory. 9052479946 bytes.
real    0m8.266s
user    0m2.686s
sys     0m5.405s
3: lsr_s
Code:
26881 files, 1603 directory. 9052479946 bytes.
real    0m6.532s
user    0m2.124s
sys     0m3.952s
Because of the disk cache, each script must be run several times and the results averaged. File names should also not be printed during the test: the console is a slow device and can become the bottleneck.
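The measuring procedure described above can be sketched as a small harness. This is only an assumption about how one might automate it: average_time and count_entries are hypothetical names, and the workload is a stand-in that reads a single directory without printing anything.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: call $code $runs times (at least once) and return
# the mean wall-clock seconds, ignoring the first (cold-cache) run.
sub average_time {
    my ($code, $runs) = @_;
    my @elapsed;
    for my $i (1 .. $runs) {
        my $start = Time::HiRes::time();
        $code->();
        push @elapsed, Time::HiRes::time() - $start;
    }
    shift @elapsed if @elapsed > 1;    # drop the cold-cache run
    my $total = 0;
    $total += $_ for @elapsed;
    return $total / @elapsed;
}

# Stand-in workload: count directory entries without printing them.
sub count_entries {
    my $dir = shift;
    opendir(my $dh, $dir) or return 0;
    my $n = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return $n;
}

my $avg = average_time(sub { count_entries('.') }, 5);
printf "average over warm runs: %.6f s\n", $avg;
```

Passing one of the traversal functions in place of count_entries would time that traversal instead.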
lsr_s traverses with a stack rather than a queue because Perl's push, shift and pop all operate on arrays, and paired push/pop operations may be optimized. Memory and CPU usage follow the same order: 1 > 2 > 3.
Code:
              CPU load   Memory
File::Find    97%        4540 K
lsr           95%        3760 K
lsr_s         95%        3590 K
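The stack-versus-queue choice comes down to whether the pending list is consumed with pop or with shift. The sketch below uses a hypothetical in-memory tree (%tree) and a hypothetical walk function to show that both variants visit the same nodes and differ only in order. It does not measure speed.

```perl
use strict;
use warnings;

# Hypothetical in-memory directory tree: name => [children].
my %tree = (
    '/'  => ['a', 'b'],
    'a'  => ['a1', 'a2'],
    'b'  => ['b1'],
    'a1' => [], 'a2' => [], 'b1' => [],
);

# Visit every node reachable from $root.
# $take is 'pop' (stack) or 'shift' (queue).
sub walk {
    my ($root, $take) = @_;
    my @pending = ($root);
    my @order;
    while (@pending) {
        my $node = $take eq 'pop' ? pop(@pending) : shift(@pending);
        push @order, $node;
        push @pending, @{ $tree{$node} };
    }
    return @order;
}

print "stack: @{[ walk('/', 'pop') ]}\n";      # stack: / b b1 a a2 a1
print "queue: @{[ walk('/', 'shift') ]}\n";    # queue: / a b a1 a2 b1
```

Since the set of visited nodes is identical, file counts and total size are unaffected by the choice.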
Conclusion: we strongly recommend that you use lsr_s to traverse folders.
==============================================================
In terms of execution efficiency, find.pl differs from lsr.pl mainly in user time: the File::Find module has many options, and checking them takes a lot of time. lsr_s.pl is more efficient than lsr.pl mainly in sys time: during recursion the program also has to save the previous file handle and the information needed to restore the function's state, so sys takes longer. It is therefore not unreasonable that lsr_s wins on both sys and user.
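The user/sys split discussed above can also be read from inside a script with Perl's built-in times function, instead of the shell's time command. The following is a minimal sketch: cpu_split is a hypothetical helper and the workload is a stand-in that only stats the entries of the current directory.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code once and return the user and system
# CPU seconds it consumed, as reported by the built-in times().
sub cpu_split {
    my $code = shift;
    my ($user0, $sys0) = times;
    $code->();
    my ($user1, $sys1) = times;
    return ($user1 - $user0, $sys1 - $sys0);
}

# Stand-in workload: sum the sizes of the entries in the current directory.
my ($user, $sys) = cpu_split(sub {
    opendir(my $dh, '.') or return;
    my $total = 0;
    for my $name (readdir($dh)) {
        $total += (-s $name) || 0;
    }
    closedir($dh);
});
printf "user %.3fs  sys %.3fs\n", $user, $sys;
```

For a workload this small both figures will usually be close to zero. The traversal functions would have to be passed in to get numbers comparable to the ones above.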