Perl sort Function: Usage Summary and Examples

Source: Internet
Author: User

I. Usage of the sort function

sort LIST
sort BLOCK LIST
sort SUBNAME LIST

sort can be called in three forms. It sorts LIST and returns the sorted list. If SUBNAME and BLOCK are both omitted, sort uses standard string comparison order (for example, ASCII order). If SUBNAME is specified, it is the name of a subroutine that compares two list elements and returns an integer less than, equal to, or greater than 0, depending on whether the first element should be placed before, at the same position as, or after the second. You can also supply a BLOCK as an anonymous inline subroutine in place of SUBNAME; the effect is the same.

The two elements being compared are temporarily made available in the variables $a and $b. They are passed by reference (they alias the original elements), so do not modify $a or $b. If you use a named subroutine, older Perl versions also require that it not be recursive.
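The three call forms can be sketched side by side as follows. This is only an illustration: the word list and the subroutine name by_length are made up for the example and do not come from the original article.

```perl
use strict;
use warnings;

my @words = qw(pear apple Fig);

# Form 1: sort LIST, default string comparison (uppercase sorts before lowercase)
my @by_default = sort @words;

# Form 2: sort BLOCK LIST, inline comparison, here case-insensitive
my @by_block = sort { lc($a) cmp lc($b) } @words;

# Form 3: sort SUBNAME LIST, named comparison subroutine (hypothetical name)
sub by_length { length($a) <=> length($b) }
my @by_sub = sort by_length @words;

print "@by_default\n";   # Fig apple pear
print "@by_block\n";     # apple Fig pear
print "@by_sub\n";       # Fig pear apple
```

Note that $a and $b need no declaration even under use strict, because sort treats them specially.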

II. Usage examples

1. Sort in numerical order

The code is as follows:

@array = (8, 2, 32, 1, 4, 16);
print join(' ', sort { $a <=> $b } @array), "\n";

This prints:

1 2 4 8 16 32

This is equivalent to:

sub numerically { $a <=> $b }
print join(' ', sort numerically @array), "\n";

This is easy to understand: the numbers are simply sorted in ascending numeric order, so I will not go into detail.

2.1 Sort in ASCII order (not dictionary order)

The code is as follows:

@languages = qw(fortran lisp c c++ Perl python java);
print join(' ', sort @languages), "\n";

This prints:

Perl c c++ fortran java lisp python

This is equivalent to:

print join(' ', sort { $a cmp $b } @languages), "\n";

The list is sorted in ASCII order, where uppercase letters come before lowercase ones; there is nothing more to it.

Note that if you sort numbers in ASCII order, the result may not be what you expect:

print join(' ', sort 1 .. 11), "\n";

This prints:

1 10 11 2 3 4 5 6 7 8 9

2.2 Sort in dictionary order

The code is as follows:

use locale;
@array = qw(ASCII ASCAP at_large atlarge A arp ARP);
@sorted = sort { ($da = lc $a) =~ s/[\W_]+//g;   # lowercase copy of $a without punctuation
                 ($db = lc $b) =~ s/[\W_]+//g;   # same for $b
                 $da cmp $db;
               } @array;
print "@sorted\n";

This prints:

A arp ARP ASCAP ASCII atlarge at_large

use locale is optional; it makes the code more portable when the data contains international characters. use locale affects the behavior of cmp, lt, le, ge, gt and some other functions; see the perllocale man page for details.

Note that atlarge and at_large appear in reversed order in the output, even though their sort keys are identical (the comparison block deletes the underscore in at_large). This happens because the example was run on Perl 5.005_02. Perl 5.6 and earlier used a quicksort that does not preserve the order of elements with equal keys; the mergesort used by default since Perl 5.8 does preserve it.

Note that whether you use map, grep, or sort, you should protect the value of the temporary variable ($_ in map and grep, $a and $b in sort) and not modify it.
In this code, $a and $b are first copied into $da and $db, and only then is the substitution s/[\W_]+//g applied, so the substitution does not modify the original elements.
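The aliasing behaviour behind this advice can be seen in a small sketch. The array names below are illustrative only; the point is that $_ inside map refers to the element itself, not a copy.

```perl
use strict;
use warnings;

my @original = qw(at_large atlarge);

# Safe: copy $_ into a lexical first, then substitute on the copy
my @cleaned = map { (my $copy = $_) =~ s/_//g; $copy } @original;

# Unsafe: s/// acts on $_ itself, which aliases the array element
my @victim       = @original;
my @also_cleaned = map { s/_//g; $_ } @victim;

print "@original\n";   # at_large atlarge  (untouched)
print "@victim\n";     # atlarge atlarge   (modified in place)
```

The same applies to $a and $b inside a sort block, which is why the dictionary-order example works on $da and $db instead.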

3. Sort in descending order

A descending sort is simple: swap the operands on either side of cmp or <=>.

sort { $b <=> $a } @array;

Or negate the return value of the block or subroutine:

sort { -($a <=> $b) } @array;

Or use the reverse function (which is slightly less efficient but perhaps easier to read):

reverse sort { $a <=> $b } @array;
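As a quick check, the three approaches can be run on the numeric array from section 1 and compared; this is a minimal sketch and the variable names are only for illustration.

```perl
use strict;
use warnings;

my @array = (8, 2, 32, 1, 4, 16);

my @swapped  = sort { $b <=> $a } @array;            # operands swapped
my @negated  = sort { -($a <=> $b) } @array;         # result negated
my @reversed = reverse sort { $a <=> $b } @array;    # ascending, then reversed

print "@swapped\n";    # 32 16 8 4 2 1
print "@negated\n";    # 32 16 8 4 2 1
print "@reversed\n";   # 32 16 8 4 2 1
```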

4. Sort with multiple keys

To sort on multiple keys, put all the comparisons in one subroutine and join them with or. Put the primary comparison first and the secondary ones after it.

The code is as follows:

# An array of references to anonymous hashes
@employees = (
    { FIRST => 'Bill',   LAST => 'Gates',
      SALARY => 600000,  AGE => 45 },
    { FIRST => 'George', LAST => 'Tester',
      SALARY => 55000,   AGE => 29 },
    { FIRST => 'Steve',  LAST => 'Ballmer',
      SALARY => 600000,  AGE => 41 },
    { FIRST => 'Sally',  LAST => 'Developer',
      SALARY => 55000,   AGE => 29 },
    { FIRST => 'Joe',    LAST => 'Tester',
      SALARY => 55000,   AGE => 29 },
);
sub seniority {
    $b->{SALARY} <=> $a->{SALARY}      # highest salary first
    or $b->{AGE} <=> $a->{AGE}         # then oldest first
    or $a->{LAST} cmp $b->{LAST}       # then last name, ascending
    or $a->{FIRST} cmp $b->{FIRST}     # then first name, ascending
}
@ranked = sort seniority @employees;
foreach $emp (@ranked) {
    print "$emp->{SALARY}\t$emp->{AGE}\t$emp->{FIRST} $emp->{LAST}\n";
}

This prints (columns are tab-separated):

600000  45  Bill Gates
600000  41  Steve Ballmer
55000   29  Sally Developer
55000   29  George Tester
55000   29  Joe Tester

The code looks complicated but is actually easy to follow. Each element of the @employees array is an anonymous hash. An anonymous hash is really a reference, which is accessed with the -> operator; for example, $employees[0]->{SALARY} gives the SALARY value of the first anonymous hash. So the comparison is clear: first compare SALARY, then AGE, then LAST, and finally FIRST. Note that the first two comparisons are descending and the last two are ascending; don't confuse them.
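The same or-chaining idea works on plain strings. The following is a small sketch with made-up data: longest words first (descending), ties broken alphabetically (ascending). Because <=> returns 0 for equal lengths, or falls through to the next comparison.

```perl
use strict;
use warnings;

my @words = qw(pear fig apple kiwi date);

# Primary key: length, descending. Secondary key: the word itself, ascending.
my @sorted = sort { length($b) <=> length($a) or $a cmp $b } @words;

print "@sorted\n";   # apple date kiwi pear fig
```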

5. Sort to produce a new array

The code is as follows:

@x = qw(Matt Elroy Jane Sally);
@rank[sort { $x[$a] cmp $x[$b] } 0 .. $#x] = 0 .. $#x;
print "@rank\n";

This prints:

2 0 1 3

Is this a bit confusing? It becomes clear if you read it carefully. 0 .. $#x is a list whose values are the subscripts of the @x array, here 0 1 2 3. $x[$a] cmp $x[$b] compares the elements of @x in ASCII order. So sort returns a list of @x's subscripts, ordered by the ASCII order of the @x elements they refer to.
Still not sure what sort returns? First print the elements of @x in ASCII order:

@x = qw(Matt Elroy Jane Sally);
print join ' ', sort { $a cmp $b } @x;

This prints:

Elroy Jane Matt Sally

Their subscripts in @x are 1 2 0 3, so sort returns the list (1, 2, 0, 3). @rank[1, 2, 0, 3] = 0 .. $#x is then just an ordinary array slice assignment,
so @rank ends up as (2 0 1 3).
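Splitting the one-liner into two steps may make the explanation above easier to verify. This is a sketch; the intermediate array @order is introduced here for illustration and is not part of the original code.

```perl
use strict;
use warnings;

my @x = qw(Matt Elroy Jane Sally);

# Step 1: the subscripts of @x, ordered by the elements they point to
my @order = sort { $x[$a] cmp $x[$b] } 0 .. $#x;

# Step 2: slice assignment, so $rank[$order[$i]] = $i for every $i
my @rank;
@rank[@order] = 0 .. $#x;

print "@order\n";   # 1 2 0 3
print "@rank\n";    # 2 0 1 3
```

Each entry of @rank is the position that the corresponding element of @x would take in the sorted list.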

6. Sort a hash by keys

The code is as follows:

%hash = (Donald => 'Knuth', Alan => 'Turing', John => 'Neumann');
# The + makes Perl parse the inner braces as an anonymous hash, not a block
@sorted = map { +{ $_ => $hash{$_} } } sort keys %hash;
foreach $hashref (@sorted) {
    ($key, $value) = each %$hashref;
    print "$key => $value\n";
}

This prints:

Alan => Turing
Donald => Knuth
John => Neumann

This code is not hard to follow. sort keys %hash returns the keys of %hash in ASCII order, and map then processes that list. Note the nested braces in the map block: the outer pair is the map block itself and the inner pair builds an anonymous hash (a leading + before the inner brace tells Perl unambiguously that it is a hash constructor rather than a nested block). The result of map is therefore a list of anonymous hashes.
So the elements of the @sorted array are references to anonymous hashes, which are dereferenced with %$hashref to reach their key/value pairs.
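If the list of anonymous hashes is not needed afterwards, the same output can be produced more directly by looping over the sorted keys. This is an alternative sketch, not the article's method; the array @lines is introduced only so the result can be inspected.

```perl
use strict;
use warnings;

my %hash = (Donald => 'Knuth', Alan => 'Turing', John => 'Neumann');

# Walk the keys in sorted order and look each value up in the hash
my @lines;
for my $key (sort keys %hash) {
    push @lines, "$key => $hash{$key}";
}

print "$_\n" for @lines;
# Alan => Turing
# Donald => Knuth
# John => Neumann
```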

7. Sort the hash by values

The code is as follows:

%hash = ( Elliot  => 'Babbage',
          Charles => 'Babbage',
          Grace   => 'Hopper',
          Herman  => 'Hollerith'
        );
# The + makes Perl parse the inner braces as an anonymous hash, not a block
@sorted = map { +{ $_ => $hash{$_} } }
          sort { $hash{$a} cmp $hash{$b}    # primary key: the value
                 or $a cmp $b               # secondary key: the hash key
               } keys %hash;
foreach $hashref (@sorted) {
    ($key, $value) = each %$hashref;
    print "$key => $value\n";
}

This prints:

Charles => Babbage
Elliot => Babbage
Herman => Hollerith
Grace => Hopper

Unlike hash keys, hash values are not guaranteed to be unique. If you sort the hash by values alone, the relative order of two elements with the same value may change when other entries are added or deleted. To get stable results, sort primarily by value and secondarily by key.

Here { $hash{$a} cmp $hash{$b} or $a cmp $b } sorts on two levels: first by value, then by key. sort returns a sorted list of keys, which is then handed to map, and map returns a list of anonymous hashes. The elements are accessed the same way as before, so I will not repeat it.

8. Sort the words in a file and remove duplicates

The code is as follows:

perl -0777ane '$, = "\n"; @uniq{@F} = (); print sort keys %uniq' file

This one is worth trying out, although I am not entirely clear on every detail of it myself.
@uniq{@F} = () uses a hash slice to create a hash whose keys are the unique words in the file;
it is semantically equivalent to @uniq{$F[0], $F[1], ... $F[$#F]} = ()

The options are described below:

-0777  Read the entire file at once instead of one line at a time
-a     Autosplit mode: split each input record into the @F array
-e     Read and run the script from the command line
-n     Loop over the input: while (<>) { ... }
$,     Output field separator for the print function
file   The file name
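Written out as a script, the one-liner does roughly the following. This is a sketch under stated assumptions: the string $text stands in for the slurped file contents, and split ' ' approximates what -a does to the input record.

```perl
use strict;
use warnings;

# Stand-in for the contents of the file (made-up sample text)
my $text = "the quick fox\njumps over the lazy fox\n";

# -a: split the record on whitespace into @F
my @F = split ' ', $text;

# Hash slice: every word becomes a key, so duplicates collapse
my %uniq;
@uniq{@F} = ();

my @words = sort keys %uniq;
print join("\n", @words), "\n";
# fox
# jumps
# lazy
# over
# quick
# the
```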

9. Efficient sorting: the Orcish maneuver and the Schwartzian transform

sort usually calls its comparison subroutine several times for each key. If sort run time matters to you, you can use the Orcish maneuver or the Schwartzian transform so that each key is computed only once.
Consider the following example, which sorts a list of files by modification date.

The code is as follows:

# Brute force: accesses the disk several times for each file
@sorted = sort { -M $a <=> -M $b } @filenames;

# Orcish maneuver: cache the keys in a hash
@sorted = sort { ($modtimes{$a} ||= -M $a) <=>
                 ($modtimes{$b} ||= -M $b)
               } @filenames;


Clever, isn't it? Because a file's modification date essentially does not change while the script runs, it is enough to run -M once per file and save the result.
The following uses the Schwartzian transform:
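The caching effect can be observed without touching the disk by counting how often the key function runs. In this sketch the subroutine expensive_key and the sample names are hypothetical stand-ins for -M and real file names, and //= (available from Perl 5.10) is used instead of ||= so that a key whose value is 0 is still cached.

```perl
use strict;
use warnings;

my @names = qw(delta alpha charlie bravo alpha);

my %cache;
my $calls = 0;

# Hypothetical stand-in for a costly key computation such as -M
sub expensive_key {
    my ($name) = @_;
    $calls++;
    return length $name;
}

# Orcish maneuver: compute each key on first use, then reuse the cached value
my @sorted = sort {
    ($cache{$a} //= expensive_key($a)) <=> ($cache{$b} //= expensive_key($b))
        or $a cmp $b
} @names;

print "@sorted\n";   # alpha alpha bravo delta charlie
print "$calls\n";    # 4, one call per distinct name
```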

@sorted = map( { $_->[0] }
               sort( { $a->[1] <=> $b->[1] }
                     map( { [$_, -M] } @filenames )
               )
          );

This code nests map and sort several layers deep, so read it from back to front. map( { [$_, -M] } @filenames ) returns a list of anonymous arrays, in which the first element of each anonymous array is the file name and the second is the value of -M for that file (its age in days since last modification).

sort( { $a->[1] <=> $b->[1] } ... ) then sorts the list of anonymous arrays generated above by that second element, that is, by modification date.
The result returned by sort is the sorted list of anonymous arrays.

The outermost map( { $_->[0] } ... ) is simple: it extracts the file name from each anonymous array returned by sort. The file names come out ordered by modification date, and -M is run only once for each file.
This is the famous Schwartzian transform, which is popular among Perl users.
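The same decorate, sort, undecorate pattern can be tried on data that needs no files. In the sketch below the names are made up, and the sort key is the number embedded in each name, extracted once per element.

```perl
use strict;
use warnings;

my @names = qw(file10 file2 file33 file1);

my @sorted =
    map  { $_->[0] }                            # 3. keep only the original name
    sort { $a->[1] <=> $b->[1] }                # 2. sort by the precomputed key
    map  { my ($n) = /(\d+)/; [ $_, $n ] }      # 1. pair each name with its key
    @names;

print "@sorted\n";   # file1 file2 file10 file33
```

A plain sort @names would give file1 file10 file2 file33, because the default comparison is by string.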
