Advanced sorting tips for Python's sort() and sorted()

Source: Internet
Author: User
Tags: python list

Python lists have a built-in list.sort() method that sorts the list in place. There is also a built-in sorted() function that builds a new sorted list from any iterable.

1) Sort Basics

A simple ascending sort is very easy: just call the sorted() function. It returns a new list whose elements are ordered using the less-than operator (__lt__).

>>> sorted([5, 2, 3, 1, 4])
[1, 2, 3, 4, 5]



You can also use the list.sort() method, which modifies the list in place (and returns None). It is usually less convenient than sorted(), but it is slightly more efficient if you do not need to keep the original list.

>>> a = [5, 2, 3, 1, 4]
>>> a.sort()
>>> a
[1, 2, 3, 4, 5]


Another difference is that the list.sort() method is defined only for lists, whereas the sorted() function accepts any iterable.

>>> sorted({1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'})
[1, 2, 3, 4, 5]
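
As a small illustration of the differences described above, the following sketch (the variable names are made up for this example) shows that sorted() leaves its input untouched and accepts any iterable, while list.sort() works in place and returns None:

```python
numbers = [5, 2, 3, 1, 4]

# sorted() builds a new list and leaves the input alone.
ascending = sorted(numbers)

# list.sort() reorders the list in place and returns None.
in_place = list(numbers)
result = in_place.sort()

# sorted() accepts any iterable: tuples, strings, generators, ...
from_tuple = sorted((3, 1, 2))
from_string = sorted("cab")
from_generator = sorted(n * n for n in (3, -1, 2))

print(ascending, numbers, result)
print(from_tuple, from_string, from_generator)
```

A common mistake is writing `a = a.sort()`, which rebinds `a` to None.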

2) Key functions

Starting with Python 2.4, both list.sort() and sorted() accept a key parameter that specifies a function to be called on each element before comparisons are made. For example, here is a case-insensitive string sort that uses str.lower as the key:

>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']


The value of the key parameter must be a function that takes a single argument and returns a value to use for comparison. This technique is fast because the key function is called exactly once for each element.
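
To make the "once per element" claim concrete, here is a minimal sketch that records every call to the key function. The helper name lowercase_key and the calls list are assumptions introduced only for this illustration:

```python
calls = []

def lowercase_key(word):
    # Record each call so the calls can be counted afterwards.
    calls.append(word)
    return word.lower()

words = "This is a test string from Andrew".split()
result = sorted(words, key=lowercase_key)

print(result)
print(len(calls), len(words))  # the two counts match
```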

A common pattern is to sort complex objects using one of their values as the key. For example:

>>> student_tuples = [
...     ('John', 'A', 15),
...     ('Jane', 'B', 12),
...     ('Dave', 'B', 10),
... ]
>>> sorted(student_tuples, key=lambda student: student[2])   # sort by age
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]

The same technique works for objects with named attributes. For example:

>>> class Student:
...     def __init__(self, name, grade, age):
...         self.name = name
...         self.grade = grade
...         self.age = age
...     def __repr__(self):
...         return repr((self.name, self.grade, self.age))
...
>>> student_objects = [
...     Student('John', 'A', 15),
...     Student('Jane', 'B', 12),
...     Student('Dave', 'B', 10),
... ]
>>> sorted(student_objects, key=lambda student: student.age)   # sort by age
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]

3) Operator module functions

The key-function patterns shown above are very common, so Python provides convenience functions that make accessor functions easier and faster. The operator module has itemgetter and attrgetter, and methodcaller was added in Python 2.6. Using these functions, the examples above become simpler and faster:

>>> from operator import itemgetter, attrgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
>>> sorted(student_objects, key=attrgetter('age'))
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]


The operator module functions also allow multiple levels of sorting. For example, to sort by grade and then by age:

>>> sorted(student_tuples, key=itemgetter(1, 2))
[('John', 'A', 15), ('Dave', 'B', 10), ('Jane', 'B', 12)]
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('John', 'A', 15), ('Dave', 'B', 10), ('Jane', 'B', 12)]
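
The methodcaller function mentioned above is not demonstrated in the examples so far. A short sketch follows; the messages list is invented for illustration. It sorts strings by how many exclamation marks they contain:

```python
from operator import methodcaller

messages = ["critical!!!", "hurry!", "standby", "immediate!!"]

# methodcaller("count", "!") builds a function f such that
# f(s) is the same as s.count("!").
by_urgency = sorted(messages, key=methodcaller("count", "!"))
print(by_urgency)
```

The equivalent lambda would be `lambda s: s.count("!")`; methodcaller simply packages the method name and arguments for reuse.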

4) Ascending and descending

Both list.sort() and sorted() accept a boolean reverse parameter that selects descending order when it is True. For example, to sort the student data above by age in descending order:

>>> sorted(student_tuples, key=itemgetter(2), reverse=True)
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]
>>> sorted(student_objects, key=attrgetter('age'), reverse=True)
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]

5) Sort stability and complex sorts

Starting with Python 2.2, sorts are guaranteed to be stable. This means that when multiple elements have the same key, their original relative order is preserved.

>>> data = [('Red', 1), ('Blue', 1), ('Red', 2), ('Blue', 2)]
>>> sorted(data, key=itemgetter(0))
[('Blue', 1), ('Blue', 2), ('Red', 1), ('Red', 2)]


Notice that the two 'Blue' records keep their original order, so ('Blue', 1) is guaranteed to come before ('Blue', 2).

This property lets you build complex sorts in a series of sorting steps. For example, to sort the student data by descending grade and then by ascending age, sort by age first and then sort again by grade:

>>> s = sorted(student_objects, key=attrgetter('age'))     # sort on secondary key
>>> sorted(s, key=attrgetter('grade'), reverse=True)       # now sort on primary key, descending
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
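
The multi-step approach can be wrapped in a small helper. The sketch below is one possible way to do it; the name multisort and the (attribute, reverse) spec format are assumptions made for this example, and the Student class is repeated so the block runs on its own:

```python
from operator import attrgetter

class Student:
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age
    def __repr__(self):
        return repr((self.name, self.grade, self.age))

def multisort(items, specs):
    """Sort items in place by several (attribute, reverse) pairs.

    specs lists the keys from most to least significant, so the passes
    run in reverse order and rely on sort stability.
    """
    for attribute, reverse in reversed(specs):
        items.sort(key=attrgetter(attribute), reverse=reverse)
    return items

students = [
    Student("John", "A", 15),
    Student("Jane", "B", 12),
    Student("Dave", "B", 10),
]
# Grade descending first, then age ascending within each grade.
ordered = multisort(students, [("grade", True), ("age", False)])
print(ordered)
```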

6) The old-fashioned way: DSU

This idiom is called DSU (decorate-sort-undecorate) after its three steps:
First: the original list is decorated with new values that control the sort order;
Second: the decorated list is sorted;
Third: the decorations are removed, producing a list that contains only the original values in the new order.

For example, use the DSU method to sort student data according to grade:
>>> decorated = [(student.grade, i, student) for i, student in enumerate(student_objects)]
>>> decorated.sort()
>>> [student for grade, i, student in decorated]               # undecorate
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]
This idiom works because tuples are compared lexicographically: the first items are compared; if they are equal, the second items are compared, and so on.

It is not strictly necessary to include the index in the decorated tuples, but including it has two benefits:
First: the sort is stable. If two elements have the same key, their original order is preserved;
Second: the original elements never have to be compared with each other, because the first two items of each tuple always settle the ordering.
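
The two benefits listed above can be seen in a short Python 3 sketch. The Record class is a hypothetical stand-in for any object that defines no ordering of its own:

```python
# Tuples compare item by item, left to right.
assert ("A", 2) < ("B", 1)      # decided by the first items
assert ("B", 1) < ("B", 2)      # first items tie, second items decide

class Record:
    """A hypothetical object with no ordering defined."""
    def __init__(self, label):
        self.label = label

first, second = Record("first"), Record("second")

# Without an index, a tie on the key falls through to comparing the
# Record objects themselves, which raises TypeError in Python 3.
try:
    sorted([("B", second), ("B", first)])
    tie_failed = False
except TypeError:
    tie_failed = True

# With an index in the middle, the tie is settled before the objects
# are reached, and the original order is preserved.
decorated = sorted([("B", 0, second), ("B", 1, first)])
labels = [record.label for _, _, record in decorated]
print(tie_failed, labels)
```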

Another name for this idiom is the Schwartzian transform, after Randal L. Schwartz, who popularized it among Perl programmers.

For large lists, or lists whose comparison information is expensive to calculate, DSU was probably the fastest way to sort before Python 2.4. Since 2.4, key functions provide the same functionality.
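
To show that the two approaches agree, here is a minimal sketch that sorts the same records by grade once with hand-written DSU and once with a key function. The records list mirrors the student data used earlier:

```python
records = [("John", "A", 15), ("Jane", "B", 12), ("Dave", "B", 10)]

# Decorate-sort-undecorate, written out by hand.
decorated = [(record[1], index, record) for index, record in enumerate(records)]
decorated.sort()
dsu_result = [record for _, _, record in decorated]

# The same ordering with a key function.
key_result = sorted(records, key=lambda record: record[1])

print(dsu_result == key_result)
```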

7) The approach common in other languages: the cmp function

Before Python 2.4 there was no sorted() built-in, and list.sort() took no key parameter. Instead, all Python 2.x versions supported a cmp parameter that lets the user supply a comparison function. This approach is also common in other languages.

In Python 3.0, the cmp parameter was removed entirely, as part of a larger effort to simplify and unify the language and to eliminate the conflict between rich comparisons and the __cmp__ method.

In Python 2.x, the function passed as cmp is used to compare pairs of elements. It takes two arguments and returns a negative number if the first is less than the second, zero if they are equal, and a positive number if the first is greater. For example:

>>> def numeric_compare(x, y):
...     return x - y
...
>>> sorted([5, 2, 4, 1, 3], cmp=numeric_compare)   # Python 2.x only
[1, 2, 3, 4, 5]


Or you can sort it in reverse order:

>>> def reverse_numeric(x, y):
...     return y - x
...
>>> sorted([5, 2, 4, 1, 3], cmp=reverse_numeric)   # Python 2.x only
[5, 4, 3, 2, 1]


When porting existing 2.x code to 3.x, you need to convert each cmp function into a key function. The following wrapper makes that easy:

def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K

To convert a cmp function to a key function, just wrap it:

>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]


In Python 2.7, the cmp_to_key() function was added to the functools module.
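
On Python 2.7 and 3.x the hand-written wrapper is therefore unnecessary. The sketch below uses functools.cmp_to_key directly; the by_length_then_alpha comparison function is a made-up example of an ordering that is naturally expressed as a comparison:

```python
from functools import cmp_to_key

def reverse_numeric(x, y):
    return y - x

def by_length_then_alpha(a, b):
    # Negative: a sorts first; positive: b sorts first; zero: tie.
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)

numbers = sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
words = sorted(["pear", "fig", "apple", "kiwi"],
               key=cmp_to_key(by_length_then_alpha))
print(numbers)
print(words)
```

In new code a plain key function, here `key=lambda w: (len(w), w)`, is usually simpler; cmp_to_key is mainly a porting aid.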

8) Other Precautions

* For locale-aware sorting, use locale.strxfrm() as the key function or locale.strcoll() as the cmp function.
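
A minimal sketch of both options follows. It selects the "C" locale only because that locale is always available; this is an assumption made so the example runs anywhere, and a real program would normally select the user's locale instead. The word list is invented for illustration:

```python
import locale
from functools import cmp_to_key

# The "C" locale always exists; substitute an installed locale name
# (or "" for the user's default) to get language-specific collation.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["banana", "Cherry", "apple"]

# strxfrm transforms each string once into a form that compares
# according to the current locale's collation rules.
by_strxfrm = sorted(words, key=locale.strxfrm)

# strcoll is a comparison function, so it needs cmp_to_key in Python 3.
by_strcoll = sorted(words, key=cmp_to_key(locale.strcoll))

print(by_strxfrm)
print(by_strxfrm == by_strcoll)
```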

* The reverse parameter still maintains sort stability, so records with equal keys keep their original order. Interestingly, the same effect can be achieved without the parameter by using the built-in reversed() function twice:

>>> data = [('Red', 1), ('Blue', 1), ('Red', 2), ('Blue', 2)]
>>> assert sorted(data, reverse=True) == list(reversed(sorted(reversed(data))))
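
The stability aspect is easier to see when records share a key. The following sketch sorts on the color only and compares three approaches; the variable names are chosen for this example:

```python
from operator import itemgetter

data = [("red", 1), ("blue", 1), ("red", 2), ("blue", 2)]

# reverse=True keeps records with equal keys in their original order.
standard_way = sorted(data, key=itemgetter(0), reverse=True)

# Reversing before and after an ascending sort gives the same result.
double_reversed = list(reversed(sorted(reversed(data), key=itemgetter(0))))

# Reversing only once flips the order of equal-key records as well.
single_reversed = list(reversed(sorted(data, key=itemgetter(0))))

print(standard_way)
print(standard_way == double_reversed, standard_way == single_reversed)
```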

* The sort routines use the __lt__ method when comparing two elements, so you can give a class a standard sort order by defining __lt__, for example:

>>> Student.__lt__ = lambda self, other: self.age < other.age
>>> sorted(student_objects)
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
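
In practice the method is more often written inside the class body than assigned afterwards. A sketch of that form, repeating the Student class so the block is self-contained:

```python
class Student:
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    def __repr__(self):
        return repr((self.name, self.grade, self.age))

    def __lt__(self, other):
        # Students order by age; sorted() and min() only need __lt__.
        return self.age < other.age

students = [
    Student("John", "A", 15),
    Student("Jane", "B", 12),
    Student("Dave", "B", 10),
]
by_age = sorted(students)
youngest = min(students)
print(by_age, youngest)
```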


* Key functions need not depend only on the elements being sorted; a key function can also consult external resources. For example, if student grades are stored in a dictionary, that dictionary can be used to sort a separate list of student names:

>>> students = ['Dave', 'John', 'Jane']
>>> newgrades = {'John': 'F', 'Jane': 'A', 'Dave': 'C'}
>>> sorted(students, key=newgrades.__getitem__)
['Jane', 'Dave', 'John']
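
One caveat with this pattern is that newgrades.__getitem__ raises KeyError for a name that is missing from the dictionary. The sketch below shows one possible workaround using dict.get with a default; the extra student "Zoe" and the "~" placeholder are assumptions made for this illustration:

```python
students = ["Dave", "John", "Jane", "Zoe"]   # "Zoe" has no grade yet
newgrades = {"John": "F", "Jane": "A", "Dave": "C"}

# "~" sorts after every uppercase letter, so ungraded students go last.
ordered = sorted(students, key=lambda name: newgrades.get(name, "~"))
print(ordered)
```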

* When data has to be kept in order while it is still being processed, for example when items arrive one at a time, repeatedly calling sort(), sorted(), or bisect.insort() is not the best approach. In that case, consider a heap, a red-black tree, or a treap.
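
Of the structures named above, only the heap is available in the standard library, through the heapq module. A minimal sketch, assuming the goal is to repeatedly fetch the smallest item while new items keep arriving:

```python
import heapq

heap = []
for value in [5, 2, 4, 1, 3]:
    heapq.heappush(heap, value)      # O(log n) per insertion

smallest = heapq.heappop(heap)       # removes and returns the minimum

heapq.heappush(heap, 0)              # new data can arrive at any time

# Popping until the heap is empty yields the items in ascending order.
drained = [heapq.heappop(heap) for _ in range(len(heap))]
print(smallest, drained)
```

A heap only guarantees cheap access to the smallest item; it does not keep the whole collection in sorted order the way a balanced tree would.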
