Implementing the Schwartzian transform in Perl

This article introduces the Schwartzian transform in Perl. It describes the sorting problem that the transform addresses and provides implementation code. The background is sorting a list of files.
For example, the code for sorting file names alphabetically is as follows:


use strict;
use warnings;

my @files = glob "*.xml";        # glob expands shell-style wildcards into a list of file names
my @sorted_files = sort @files;  # sort() orders strings alphabetically by default
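
As a small illustration of what "alphabetically by default" means, the sketch below sorts a hard-coded list of names. The names are made up and stand in for whatever glob would return, so the snippet runs without any files on disk. The default sort compares strings character by character, so "10.xml" lands before "2.xml" and uppercase letters land before lowercase ones.

```perl
use strict;
use warnings;

# Hypothetical file names; in the article these would come from glob "*.xml".
my @names = ('b.xml', '10.xml', 'a.xml', '2.xml', 'B.xml');

# Default sort: plain string comparison, not numeric comparison.
my @sorted_names = sort @names;

print "@sorted_names\n";   # 10.xml 2.xml B.xml a.xml b.xml
```

If numeric order is wanted, a comparison block has to be passed to sort.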


To sort by file name length instead, the code is as follows:


use strict;
use warnings;

# length() returns the length of a string. The spaceship operator <=> compares
# two numbers and returns -1, 0 or 1 (less than, equal to, greater than).
# Inside the sort block, $a and $b hold the two elements being compared.
my @files = glob "*.xml";
my @sorted_length = sort { length($a) <=> length($b) } @files;
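
To make the behaviour of the spaceship operator concrete, here is a minimal sketch. The file names are hypothetical and hard-coded in place of the glob call, so the snippet can be run anywhere.

```perl
use strict;
use warnings;

# <=> returns -1, 0 or 1 depending on how the left number compares to the right.
my @results = (1 <=> 2, 2 <=> 2, 3 <=> 2);
print "@results\n";        # -1 0 1

# Hypothetical file names standing in for the result of glob "*.xml".
my @files = ('report.xml', 'a.xml', 'index.xml');

# sort sets $a and $b to the two names being compared on each call of the block.
my @sorted_length = sort { length($a) <=> length($b) } @files;
print "@sorted_length\n";  # a.xml index.xml report.xml
```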


In the two cases above, sorting is fast even with many files, because only the names already held in memory are compared. The following case is different.
For example, to sort the files by size, the code is as follows:


use strict;
use warnings;

my @files = glob "*.xml";
my @sort_size = sort { -s $a <=> -s $b } @files;  # compare the file sizes


The code above involves three kinds of work:
1. Getting the file size from the hard disk (-s $a and -s $b)
2. Comparing the two sizes (the spaceship operator)
3. Ordering the list (the sort operation)
Every comparison looks at both $a and $b, so each comparison reads two file sizes from disk. A sort performs on the order of n * log(n) comparisons, so the size of the same file is fetched again and again. With 10,000 files, log2(10,000) is about 13, so this is roughly 2 * 10,000 * 13 lookups, a few hundred thousand, even though there are only 10,000 sizes to read.
The last two items (comparing the sizes and sorting) cannot be avoided, but the first one can be cut down.
The idea is to read each file size from disk exactly once, keep it in memory next to the file name, and let sort compare only the in-memory values.

The code is as follows:


use strict;
use warnings;

my @files = glob "*.xml";

my @unsorted_pairs = map  { [ $_, -s $_ ] } @files;                 # [name, size] for each file
my @sorted_pairs   = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;  # order the pairs by size
my @sorted_files   = map  { $_->[0] } @sorted_pairs;                # keep only the names


It looks complicated, but it breaks down into three steps:
Step 1: Walk through the file list and create an array reference for each file. Each array reference holds two elements:
the first is the file name ($_) and the second is the file size (-s $_). This way the size of each file is read from disk only once.
Step 2: Sort the list of array references. Because the file sizes are being compared, the sort block reads element [1] of each pair. The result is another list of array references, now ordered by size.
Step 3: Discard the file size element and build a list containing only the file names. This is the result we wanted.
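
The three steps can be traced on a tiny in-memory example. This is only a sketch: the %size_of hash and its values are invented for illustration and stand in for the sizes that -s would read from disk.

```perl
use strict;
use warnings;

# Hypothetical sizes in bytes, standing in for what -s would return.
my %size_of = ('big.xml' => 900, 'small.xml' => 10, 'medium.xml' => 250);
my @files = sort keys %size_of;   # big.xml medium.xml small.xml

# Step 1: pair each name with its size as [name, size].
my @unsorted_pairs = map { [ $_, $size_of{$_} ] } @files;

# Step 2: sort the pairs by element [1], the size.
my @sorted_pairs = sort { $a->[1] <=> $b->[1] } @unsorted_pairs;

# Show the intermediate structure before the sizes are discarded.
print "$_->[0] => $_->[1]\n" for @sorted_pairs;

# Step 3: keep only element [0], the name.
my @sorted_files = map { $_->[0] } @sorted_pairs;
print "@sorted_files\n";   # small.xml medium.xml big.xml
```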
The code above uses two temporary arrays, but they are not necessary; the whole job can be done in one statement. Because data flows from right to left through map and sort, the three statements have to be written in reverse order. If each call is placed on its own line and aligned consistently, the result is still highly readable.

The code is as follows:


my @quickly_sorted_files =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -s $_ ] }
    @files;


This map-sort-map chain is the Schwartzian transform, named after Randal L. Schwartz. With a large amount of data it can be considerably faster than the plain sort, because each file size is read from disk only once.
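
One rough way to see where the saving comes from is to count how often the sort key is computed. The sketch below does that without touching the disk. The expensive_key function and the generated names are hypothetical stand-ins for -s and for real files; the exact count for the plain sort depends on the data and on Perl's sort implementation, so no numbers are quoted here.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for an expensive lookup such as -s:
# it returns the string length and counts how often it is called.
sub expensive_key {
    my ($item) = @_;
    $calls++;
    return length $item;
}

# 1,000 made-up names of varying length.
my @items = map { ('x' x (1 + ($_ * 7919) % 50)) . ".xml" } 1 .. 1000;

# Plain sort: the key is recomputed on both sides of every comparison.
$calls = 0;
my @plain = sort { expensive_key($a) <=> expensive_key($b) } @items;
my $plain_calls = $calls;

# Schwartzian transform: the key is computed once per item.
$calls = 0;
my @transformed =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, expensive_key($_) ] }
    @items;
my $transform_calls = $calls;

print "plain sort:            $plain_calls key lookups\n";
print "Schwartzian transform: $transform_calls key lookups\n";
```

The transform always performs exactly one lookup per item, while the plain sort performs two per comparison.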
The following small program creates 10,000 XML files and then times both approaches. The complete code is as follows:


#!/usr/bin/perl -w
use strict;
use warnings;
use autodie;
use v5.10;

######################################
### Create the 10,000 .xml files to be compared ###
######################################
my $suffix = ".xml";

foreach my $num (1 .. 10000) {
    open(my $fh, '>', $num . $suffix) or die "Can not create the file: $!\n";
    print $fh "This is file size testing!";
    close $fh;
}

print "All 10,000 files created!\n";


######################################
### Plain sort: repeated 20 times ###
######################################
my $t1 = time();

foreach (1 .. 20) {
    my @files = glob "*.xml";
    my @sorted = sort { -s $a <=> -s $b } @files;
}

say "Regular algorithm takes time: => ", time() - $t1;


######################################
### Schwartzian transform: repeated 20 times ###
######################################
my $t2 = time();

foreach (1 .. 20) {
    my @files = glob "*.xml";
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [ $_, -s $_ ] }
        @files;
}

say "The Schwartzian algorithm takes time: => ", time() - $t2;

Output result (times in seconds):
All 10,000 files created!
Regular algorithm takes time: => 185
The Schwartzian algorithm takes time: => 115
