The well-known Schwartzian transform in Perl for solving sorting problems

Last Update:2016-06-10 Source: Internet

Author: User

Tags glob


The famous Schwartzian transform in Perl grew out of sorting problems.
For example, to sort files alphabetically by name, the code is as follows:

use strict;
use warnings;

my @files = glob "*.xml";         # Perl's glob operator expands wildcards the way the shell does
my @sorted_files = sort @files;   # sort() orders strings alphabetically by default

To sort by the length of the file name instead, the code is as follows:

use strict;
use warnings;

# length returns the length of a string. The spaceship operator <=> compares
# the sort variables $a and $b numerically and returns -1, 0, or 1 for
# less than, equal to, or greater than.
my @files = glob "*.xml";
my @sorted_length = sort { length($a) <=> length($b) } @files;
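The return values of the spaceship operator and the length-based sort can be checked without touching the file system. The following is a minimal sketch; the file names in @names are made up for illustration and stand in for a real glob result.

```perl
use strict;
use warnings;

# <=> compares two numbers and returns -1, 0, or 1.
my @cmp_results = (1 <=> 2, 2 <=> 2, 3 <=> 2);

# Hypothetical file names, used here instead of a real glob() result.
my @names = ("report.xml", "a.xml", "index.xml");

# Sort by name length, shortest first.
my @by_length = sort { length($a) <=> length($b) } @names;

print "@cmp_results\n";   # prints: -1 0 1
print "@by_length\n";     # prints: a.xml index.xml report.xml
```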

In the two cases above, sorting is not too slow even with many files. The following case is different.
For example, to compare file sizes in bulk, the code is as follows:

use strict;
use warnings;

my @files = glob "*.xml";
my @sort_size = sort { -s $a <=> -s $b } @files;   # compare file sizes

The code above involves three operations:
1. Get the file size from the hard disk (-s $a, -s $b)
2. Compare the file sizes (spaceship operator)
3. Sort the list (sort)
Every comparison of $a and $b fetches two file sizes from the hard disk, and sort makes many comparisons per file. Sorting n items takes on the order of n*log2(n) comparisons, so for 10,000 files that is roughly 130,000 comparisons and about 260,000 disk lookups, far more than the 10,000 that are strictly needed.
The latter two operations (comparing the sizes and sorting) are unavoidable, but the first one can be reduced!
The idea is to read every file size from the hard disk exactly once, keep the values in memory in ordinary Perl data structures, and then sort on the stored values:

use strict;
use warnings;

my @files = glob "*.xml";

my @unsorted_pairs = map  { [ $_, -s $_ ] } @files;                # [name, size] for each file
my @sorted_pairs   = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;  # order by the stored size
my @sorted_files   = map  { $_->[0] } @sorted_pairs;                # keep only the names

It looks more complex, so here it is explained in three steps:
Step one: Iterate through the list of files and create an array reference for each file. Each array reference holds two elements:
the first is the file name ($_), and the second is the file size (-s $_). This way, the disk is accessed only once per file.
Step two: Sort the array of pairs. Because file sizes are being compared, the sort block compares element [1] of each pair. The result is another array of pairs.
Step three: Drop the file size element and build a list containing only the file names. Goal achieved!
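The saving from the three steps can be made visible by counting lookups. The sketch below is an illustration under stated assumptions: size_lookup and %size_of are hypothetical stand-ins for -s and the file system, so the example runs without creating any files.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive lookup such as -s on a file:
# it returns a stored "size" and counts how often it is called.
my %size_of;
$size_of{"file$_.xml"} = ($_ * 37) % 101 for 1 .. 100;   # 100 distinct sizes
my $calls = 0;
sub size_lookup { $calls++; return $size_of{ $_[0] } }

my @names = sort keys %size_of;

# Plain sort: two lookups for every comparison.
$calls = 0;
my @naive = sort { size_lookup($a) <=> size_lookup($b) } @names;
my $naive_calls = $calls;

# Three-step version: one lookup per element, then sort on the stored value.
$calls = 0;
my @pairs        = map  { [ $_, size_lookup($_) ] } @names;
my @sorted_pairs = sort { $a->[1] <=> $b->[1] } @pairs;
my @transformed  = map  { $_->[0] } @sorted_pairs;
my $transform_calls = $calls;

print "plain sort lookups: $naive_calls\n";
print "three-step lookups: $transform_calls\n";   # one per name, so 100
```

Both versions produce the same order; only the number of lookups differs. The exact count for the plain sort depends on the sort implementation, which is why no figure is quoted for it here.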
The code above uses two temporary arrays, but they are not required. All the work can be done in a single statement. To achieve this, the order of the steps is reversed, following the principle that data flows from right to left (here, from bottom to top). The code stays readable if each step is put on its own line with enough whitespace:

my @quickly_sorted_files =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -s $_ ] }
    @files;
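To try the single-statement form end to end, one option is a scratch directory with a few files of known size. This is only a sketch: the file names and sizes are invented for the example, and File::Temp (a core module) is used so that nothing is left behind afterwards.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a scratch directory with three files of known, different sizes.
# The file names and sizes are made up for this illustration.
my $dir  = tempdir(CLEANUP => 1);
my %want = ("big.xml" => 300, "small.xml" => 10, "medium.xml" => 120);
for my $name (keys %want) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
    print {$fh} "x" x $want{$name};
    close($fh) or die "Cannot close $name: $!";
}

# Assumes the temporary path contains no whitespace, which glob would split on.
my @files = glob "$dir/*.xml";

# Read the chain bottom-up: attach the size, sort on it, strip it again.
my @by_size =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -s $_ ] }
    @files;

print "$_\n" for @by_size;   # small.xml, medium.xml, big.xml (each with the directory prefix)
```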

This is the Schwartzian transform, named after Randal L. Schwartz. When the amount of data is large, it can be much faster than the plain sort!
Below is a small program that generates 10,000 XML files and times both approaches. The complete code is as follows:

#!/usr/bin/perl -w
use strict;
use warnings;
use autodie;
use v5.10;

###########################################
### Create 10,000 .xml files to compare ###
###########################################
my $suffix = ".xml";

foreach my $num (1 .. 10000) {
    open(my $fh, '>', $num . $suffix) or die "Can not create the file: $!\n";
    print $fh "This is file size testing!";
}

print "All the 10_1000 files created!\n";

###########################################
### Plain sort: run 20 times            ###
###########################################
my $t1 = time();

foreach (1 .. 20) {
    my @files  = glob "*.xml";
    my @sorted = sort { -s $a <=> -s $b } @files;
}

say "Conventional algorithm takes time: ", time() - $t1;

###########################################
### Schwartzian transform: run 20 times ###
###########################################
my $t2 = time();

foreach (1 .. 20) {
    my @files  = glob "*.xml";
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, -s $_ ] }
        @files;
}

say "Schwartzian algorithm takes time: ", time() - $t2;

Output result:
All the 10_1000 files created!
Conventional algorithm takes time: 185
Schwartzian algorithm takes time: 115
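The built-in time() only has whole-second resolution, so on a smaller test set the difference may not show up at all. One possible variation, sketched below, uses the core Time::HiRes module for sub-second timing and a temporary directory instead of the current one. The file count (500), the varying sizes, and the helper name elapsed are assumptions made for this example; no timings are quoted because they depend on the machine and file system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Time::HiRes qw(time);   # replaces time() with a sub-second version

# Assumed, smaller test set: 500 files of varying size in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
for my $num (1 .. 500) {
    open(my $fh, '>', "$dir/$num.xml") or die "Cannot create file: $!";
    print {$fh} "x" x ($num % 50);
    close($fh) or die "Cannot close file: $!";
}

# Assumes the temporary path contains no whitespace, which glob would split on.
my @files = glob "$dir/*.xml";

# Run a code reference $rounds times and return the elapsed seconds.
sub elapsed {
    my ($rounds, $code) = @_;
    my $start = time();
    $code->() for 1 .. $rounds;
    return time() - $start;
}

my $plain = elapsed(20, sub {
    my @sorted = sort { -s $a <=> -s $b } @files;
});

my $schwartzian = elapsed(20, sub {
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, -s $_ ] }
        @files;
});

printf "plain sort:            %.4f s\n", $plain;
printf "Schwartzian transform: %.4f s\n", $schwartzian;
```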



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
