This article introduces the well-known Schwartzian transform in Perl. It explains the sorting problem the transform addresses and gives implementation code for reference.
The Schwartzian transform is a famous Perl idiom, and its background is a sorting problem.
For example, suppose we want to sort files alphabetically by file name.
The code is as follows:
use strict;
use warnings;
my @files = glob "*.xml";        # glob expands the pattern much like a shell wildcard
my @sorted_files = sort @files;  # sort defaults to alphabetical (string) order
Or suppose we want to sort by the length of the file name.
The code is as follows:
use strict;
use warnings;
# The spaceship operator <=> compares $a and $b, the variables sort sets for each comparison.
# It returns -1, 0, or 1 when $a is less than, equal to, or greater than $b.
my @files = glob "*.xml";
my @sorted_length = sort { length($a) <=> length($b) } @files;
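To make the return values of the comparison operators concrete, here is a minimal sketch. It uses a hard-coded list of hypothetical file names instead of glob, so it runs without any files on disk.

```perl
use strict;
use warnings;

# <=> compares numerically, cmp compares as strings.
my $numeric = 2 <=> 10;        # 2 is less than 10
my $string  = "2" cmp "10";    # "2" sorts after "10" as a string
print "numeric: $numeric, string: $string\n";

# Hypothetical file names, hard-coded so that no disk access is needed.
my @names     = qw(cccc.xml a.xml bb.xml);
my @by_length = sort { length($a) <=> length($b) } @names;
print "@by_length\n";
```

This prints "numeric: -1, string: 1" followed by "a.xml bb.xml cccc.xml". The difference between the first two results is why numeric keys such as lengths and sizes are compared with <=> and not with cmp.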
Sorting by name or by name length is fast even for many files, because every comparison works on strings that are already in memory. The next case is different.
For example, to sort a large number of files by file size:
The code is as follows:
use strict;
use warnings;
my @files = glob "*.xml";
my @sort_size = sort { -s $a <=> -s $b } @files;  # compare by file size
The code above involves three operations:
1. Get the file sizes from the hard disk (-s $a and -s $b)
2. Compare the file sizes (the spaceship operator <=>)
3. Sort the list (the sort operation)
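Each comparison has to look up the size of both $a and $b, so it reads from the hard disk twice. A file takes part in many comparisons during a sort, so its size is fetched again and again. Sorting 10,000 files takes on the order of 10,000 × log2(10,000) ≈ 130,000 comparisons, which means a few hundred thousand size lookups instead of the 10,000 that are actually needed.
The algorithmic complexity of the sort is O(n log n). The last two operations (comparing the sizes and sorting) are unavoidable, but the first one can be reduced!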
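The repeated lookups can be observed directly. The following is a minimal sketch, not a measurement of real disk access: the %size_of table and the size_lookup function are hypothetical stand-ins for the files and for -s, and the counter records how often a size is requested during a plain sort.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for 1,000 files and their sizes.
my %size_of = map { ("file$_.xml" => ($_ * 7919) % 1000) } 1 .. 1000;
my @names   = keys %size_of;

my $lookups = 0;

# Stand-in for -s: returns the size and counts every call.
sub size_lookup {
    my ($name) = @_;
    $lookups++;
    return $size_of{$name};
}

# Plain sort: the sort block asks for two sizes on every comparison.
my @sorted = sort { size_lookup($a) <=> size_lookup($b) } @names;

printf "%d files, %d size lookups\n", scalar @names, $lookups;
```

The exact count depends on the input order, but it is always well above the number of files, which is exactly the repeated work that fetching each size once would avoid.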
The idea is to read every file size from the hard disk just once, up front, and keep it in memory next to the file name, so that the sort itself never touches the disk.
The code is as follows:
use strict;
use warnings;
my @files = glob "*.xml";
my @unsorted_pairs = map { [ $_, -s $_ ] } @files;                  # [name, size] pairs
my @sorted_pairs = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;    # sort by size
my @sorted_files = map { $_->[0] } @sorted_pairs;                   # keep only the names
It looks more complicated, so here is the explanation in three steps:
Step one: Iterate over the list of files and create an array reference for each file. Each array reference contains two elements:
the first is the file name ($_) and the second is the file size (-s $_). This way, the disk is accessed only once per file.
Step two: Sort the list of array references. To compare by file size, the sort block compares element [1] of each pair. The result is another list of array references.
Step three: Discard the file size element and build a list containing only the file names. Goal achieved!
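The three steps can be traced on a tiny example. This is a sketch under one assumption: the hypothetical %size_of table replaces the real files and the -s lookup, so the intermediate pairs can be printed without touching the disk.

```perl
use strict;
use warnings;

# Hypothetical name => size table standing in for real files and -s.
my %size_of = ('a.xml' => 300, 'b.xml' => 20, 'c.xml' => 150);
my @files   = sort keys %size_of;

# Step one: pair each name with its size.
my @unsorted_pairs = map { [ $_, $size_of{$_} ] } @files;

# Step two: sort the pairs by the size stored at index 1.
my @sorted_pairs = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;

# Step three: keep only the names.
my @sorted_files = map { $_->[0] } @sorted_pairs;

print "$_->[0] ($_->[1] bytes)\n" for @sorted_pairs;
print "@sorted_files\n";
```

The pairs print as b.xml (20 bytes), c.xml (150 bytes), a.xml (300 bytes), and the final list is "b.xml c.xml a.xml".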
The code above uses two temporary arrays, but they are not necessary. We can do all the work in a single statement. To achieve this, reverse the order of the statements, following the principle that data flows from right to left (here, from bottom to top). If each step goes on its own line with enough whitespace, the code remains readable.
The code is as follows:
my @quickly_sorted_files =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -s $_ ] }
    @files;
This is the Schwartzian transform, named after Randal L. Schwartz. For very large amounts of data, it is a lot faster than the previous approach!
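The same map, sort, map pipeline works for any key that is expensive to compute, not only file sizes. As an illustration, it can be wrapped in a small helper. The name sort_by_numeric_key and its calling convention are hypothetical, chosen for this sketch only.

```perl
use strict;
use warnings;

# Hypothetical helper: sort items numerically by a key that is computed
# exactly once per item, using the Schwartzian transform.
sub sort_by_numeric_key {
    my ($key_of, @items) = @_;
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] }
           map  { [ $_, $key_of->($_) ] }
           @items;
}

# Example: sort words by length (all lengths differ, so the order is unambiguous).
my @words = sort_by_numeric_key(sub { length $_[0] }, qw(pear fig banana to));
print "@words\n";
```

This prints "to fig pear banana". Sorting files by size would then read sort_by_numeric_key(sub { -s $_[0] }, glob "*.xml").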
The following small program generates 10,000 XML files and then times both approaches.
The code is as follows:
#!/usr/bin/perl -w
use strict;
use warnings;
use autodie;
use v5.10;
######################################
### Create 10,000 .xml files to compare ###
######################################
my $suffix = ".xml";
foreach my $num (1..10000) {
    open(my $fh, '>', $num . $suffix) || die "Cannot create the file: $!\n";
    print $fh "This is file size testing!";
}
print "All 10_000 files created!\n";
######################################
### Conventional sort: run 20 times ###
######################################
my $t1 = time();
foreach (1..20) {
    my @files = glob "*.xml";
    my @sorted = sort { -s $a <=> -s $b } @files;
}
say "Conventional algorithm takes time: => ", time() - $t1;
######################################
### Schwartzian transform: run 20 times ###
######################################
my $t2 = time();
foreach (1..20) {
    my @files = glob "*.xml";
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, -s $_ ] }
        @files;
}
say "Schwartzian algorithm takes time: => ", time() - $t2;
Output results:
All 10_000 files created!
Conventional algorithm takes time: => 185
Schwartzian algorithm takes time: => 115
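The timings above use time(), which has one-second resolution and measures wall-clock time. As an alternative, the core Benchmark module can compare the two approaches. The following is only a sketch under stated assumptions: it writes 200 small files of different sizes into a temporary directory created with File::Temp, and the labels conventional and schwartzian are arbitrary names. The numbers it reports will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempdir);

# Assumption: a throwaway directory holding 200 files of 1 to 200 bytes.
my $dir = tempdir(CLEANUP => 1);
my @files;
for my $num (1 .. 200) {
    my $path = "$dir/$num.xml";
    open(my $fh, '>', $path) or die "Cannot create $path: $!\n";
    print $fh 'x' x $num;
    close $fh;
    push @files, $path;
}

my %approach = (
    # Looks up two file sizes on every comparison.
    conventional => sub {
        my @sorted = sort { -s $a <=> -s $b } @files;
    },
    # Looks up each file size once, then sorts the cached pairs.
    schwartzian => sub {
        my @sorted = map  { $_->[0] }
                     sort { $a->[1] <=> $b->[1] }
                     map  { [ $_, -s $_ ] }
                     @files;
    },
);

# Run each approach for at least one CPU second and print a comparison table.
cmpthese(-1, \%approach);
```

cmpthese prints the rate of each approach and the relative difference between them, which is usually easier to interpret than two raw second counts.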