Python sort() and sorted(): advanced sorting tips


Python lists have a built-in sort() method that sorts the list in place. There is also a built-in sorted() function that builds a new sorted list from any iterable.

1) Sorting Basics

A simple ascending sort is very easy: just call the sorted() function. It returns a new list whose elements are ordered using the less-than operator (__lt__).

>>> sorted([5, 2, 3, 1, 4])
[1, 2, 3, 4, 5]


You can also use the list.sort() method, which modifies the list in place and returns None. It is usually less convenient than sorted(), but it is slightly more efficient if you do not need to keep the original list.

>>> a = [5, 2, 3, 1, 4]
>>> a.sort()
>>> a
[1, 2, 3, 4, 5]

Another difference is that the list.sort() method is defined only for lists, whereas the sorted() function accepts any iterable. Iterating over a dictionary yields its keys, so sorting a dictionary returns its sorted keys:

>>> sorted({1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'})
[1, 2, 3, 4, 5]
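
The difference between the two calls can be sketched as follows. This is only a minimal illustration written for Python 3, and the variable names are made up for the example: sorted() takes any iterable and returns a new list, while list.sort() works in place and returns None.

```python
# sorted() accepts any iterable and always returns a new list.
letters = sorted("python")                    # a string
numbers = sorted((3, 1, 2))                   # a tuple
squares = sorted(n * n for n in (3, -1, 2))   # a generator expression

# list.sort() sorts in place and returns None, so do not assign its result.
original = [5, 2, 3, 1, 4]
copy_sorted = sorted(original)   # original is left untouched here
result = original.sort()         # original is now sorted; result is None

print(letters)   # ['h', 'n', 'o', 'p', 't', 'y']
print(numbers)   # [1, 2, 3]
print(squares)   # [1, 4, 9]
print(result)    # None
```

A common mistake is writing `a = a.sort()`, which replaces the list with None.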

2) Key functions

Starting with Python 2.4, both list.sort() and sorted() accept a key parameter that specifies a function to be called on each element before comparisons are made. For example, here is a case-insensitive string sort:

>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']

The value of the key parameter must be a function that takes a single argument and returns the value to use for comparison. This technique is fast because the key function is called exactly once for each element.
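
The "exactly once per element" claim can be checked with a small experiment. The sketch below is written for Python 3; the lowercase_key helper and the calls list are hypothetical names introduced only to record how often the key function runs.

```python
calls = []

def lowercase_key(word):
    # Record every call so the number of invocations can be counted afterwards.
    calls.append(word)
    return word.lower()

words = "This is a test string from Andrew".split()
result = sorted(words, key=lowercase_key)

print(result)                  # ['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
print(len(calls), len(words))  # 7 7
```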

A more common use is to sort complex objects by one of their fields. For example:

>>> student_tuples = [
...     ('John', 'A', 15),
...     ('Jane', 'B', 12),
...     ('Dave', 'B', 10),
... ]
>>> sorted(student_tuples, key=lambda student: student[2])   # sort by age
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]

The same technique works for objects with named attributes. For example:

>>> class Student:
...     def __init__(self, name, grade, age):
...         self.name = name
...         self.grade = grade
...         self.age = age
...     def __repr__(self):
...         return repr((self.name, self.grade, self.age))
...
>>> student_objects = [
...     Student('John', 'A', 15),
...     Student('Jane', 'B', 12),
...     Student('Dave', 'B', 10),
... ]
>>> sorted(student_objects, key=lambda student: student.age)   # sort by age
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]

3) Operator module functions

The key-function patterns shown above are very common, so Python provides convenience functions that make accessor functions easier and faster. The operator module has itemgetter() and attrgetter(), and methodcaller() was added in Python 2.6. Using these functions, the examples above become simpler and faster:

>>> from operator import itemgetter, attrgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
>>> sorted(student_objects, key=attrgetter('age'))
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
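
The text mentions methodcaller() without showing it. As a small sketch (the messages list is made-up sample data), it builds a key function that calls a named method on each element, here str.count:

```python
from operator import methodcaller

messages = ['critical!!!', 'hurry!', 'standby', 'immediate!!']

# methodcaller('count', '!') builds a callable equivalent to
# lambda s: s.count('!')
by_urgency = sorted(messages, key=methodcaller('count', '!'))

print(by_urgency)  # ['standby', 'hurry!', 'immediate!!', 'critical!!!']
```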

The operator module functions also allow multiple levels of sorting. For example, to sort by grade and then by age:

>>> sorted(student_tuples, key=itemgetter(1, 2))
[('John', 'A', 15), ('Dave', 'B', 10), ('Jane', 'B', 12)]
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('John', 'A', 15), ('Dave', 'B', 10), ('Jane', 'B', 12)]
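
itemgetter(1, 2) sorts every level in the same direction. When the levels need different directions and the descending field is numeric, one possible approach (a sketch, not the only option) is a lambda that returns a tuple with that field negated:

```python
student_tuples = [
    ('John', 'A', 15),
    ('Jane', 'B', 12),
    ('Dave', 'B', 10),
]

# Grade ascending, then age descending: negate the numeric field.
result = sorted(student_tuples, key=lambda s: (s[1], -s[2]))

print(result)
# [('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]
```

Negation only works for numbers; for strings or other types, sorting in several passes is the more general route.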

4) Ascending and descending order

Both list.sort() and sorted() accept a boolean reverse parameter that selects a descending sort when True. For example, to get the student data in descending order of age:

>>> sorted(student_tuples, key=itemgetter(2), reverse=True)
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]
>>> sorted(student_objects, key=attrgetter('age'), reverse=True)
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]

5) Sort stability and complex sorts

Starting with Python 2.2, sorts are guaranteed to be stable. This means that when multiple elements have the same key, their original relative order is preserved.

>>> data = [('Red', 1), ('Blue', 1), ('Red', 2), ('Blue', 2)]
>>> sorted(data, key=itemgetter(0))
[('Blue', 1), ('Blue', 2), ('Red', 1), ('Red', 2)]

Notice that the two 'Blue' records keep their original order, so ('Blue', 1) is guaranteed to come before ('Blue', 2).

This property lets you build complex sorts in a series of steps. For example, to sort the student data by descending grade and then by ascending age, sort by age first and then sort again by grade:

>>> s = sorted(student_objects, key=attrgetter('age'))     # sort on secondary key
>>> sorted(s, key=attrgetter('grade'), reverse=True)       # now sort on primary key, descending
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
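
The multi-pass idea can be wrapped in a small helper. The sketch below is written for Python 3; multisort and its specs argument are hypothetical names for this illustration, not part of the standard library. It applies the least significant key first and relies on stability to keep earlier passes intact.

```python
from operator import attrgetter

class Student:
    def __init__(self, name, grade, age):
        self.name, self.grade, self.age = name, grade, age
    def __repr__(self):
        return repr((self.name, self.grade, self.age))

def multisort(items, specs):
    """Sort by several attributes; specs lists (attribute, reverse) pairs,
    most significant first."""
    result = list(items)
    # Apply the least significant key first. Because the sort is stable,
    # each later pass keeps the earlier order among items that tie.
    for attr, reverse in reversed(specs):
        result.sort(key=attrgetter(attr), reverse=reverse)
    return result

students = [Student('John', 'A', 15), Student('Jane', 'B', 12), Student('Dave', 'B', 10)]

# Grade descending, then age ascending
print(multisort(students, [('grade', True), ('age', False)]))
# [('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
```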

6) The old way: Decorate-Sort-Undecorate (DSU)

This idiom is called DSU (Decorate-Sort-Undecorate) after its three steps:
First: the original list is decorated with new values that control the sort order;
Second: the decorated list is sorted;
Third: the decorations are removed, leaving a list that contains only the original values in the new order.

For example, to sort the student data by grade using the DSU approach:

>>> decorated = [(student.grade, i, student) for i, student in enumerate(student_objects)]
>>> decorated.sort()
>>> [student for grade, i, student in decorated]   # undecorate
[('John', 'A', 15), ('Jane', 'B', 12), ('Dave', 'B', 10)]

This works because tuples are compared lexicographically: the first items are compared; if they are equal, the second items are compared, and so on.

It is not strictly necessary to include the index i in the decorated tuples, but including it gives two benefits:
First: the sort is stable, so if two elements have the same key, their original order is preserved;
Second: the original elements never need to be compared with each other, because the order of the decorated tuples is determined by at most the first two items.

This idiom is also known as the Schwartzian transform, after Randal L. Schwartz, who popularized it among Perl programmers.

For large lists, or lists whose comparison information is expensive to compute, DSU was probably the fastest way to sort before Python 2.4. Since 2.4, the key functions described above provide the same functionality.
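
The second benefit of the index can be seen with items that cannot be compared at all. The sketch below (Python 3, with made-up sample values) sorts complex numbers by magnitude: the DSU form never compares the complex values themselves, and the key= form gives the same result in one step.

```python
# Complex numbers cannot be ordered with <, so they show that the
# original items are never compared with each other.
values = [3 + 4j, 1 + 0j, 0 - 5j, 0 + 1j]

# Decorate: (sort key, original index, original item)
decorated = [(abs(v), i, v) for i, v in enumerate(values)]
# Sort: ties on abs() are broken by the index, never by the complex value
decorated.sort()
# Undecorate
by_dsu = [v for _, _, v in decorated]

# The key= form does the same thing in a single call
by_key = sorted(values, key=abs)

print(by_dsu == by_key)  # True
```

Without the index, the two pairs of values with equal magnitude would force a comparison between complex numbers, which raises TypeError in Python 3.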

7) The old way common in other languages: the cmp function

Before Python 2.4 there was no sorted() built-in, and list.sort() took no key parameter. Instead, all Python 2.x versions support a cmp parameter that lets the user specify a comparison function. This approach is also common in other languages.

In Python 3.0, the cmp parameter was removed entirely, as part of a larger effort to simplify and unify the language by eliminating the conflict between rich comparisons and the __cmp__ method.

In Python 2.x, the function passed as cmp is used to compare pairs of elements. It takes two arguments and returns a negative number if the first is less than the second, zero if they are equal, and a positive number if the first is greater. For example:

>>> def numeric_compare(x, y):
...     return x - y
...
>>> sorted([5, 2, 4, 1, 3], cmp=numeric_compare)
[1, 2, 3, 4, 5]

Or you can reverse the order of comparison:

>>> def reverse_numeric(x, y):
...     return y - x
...
>>> sorted([5, 2, 4, 1, 3], cmp=reverse_numeric)
[5, 4, 3, 2, 1]

When porting existing 2.x code to 3.x, the cmp function has to be converted into a key function. The following wrapper makes that easy:

def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K

To convert a cmp function to a key function, just wrap it:

>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]

Starting with Python 2.7, the cmp_to_key() function is available in the functools module.
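
With the functools version there is no need to define the wrapper by hand. The sketch below is written for Python 3; compare_length_then_alpha is a hypothetical comparison function made up for this illustration, and it follows the negative/zero/positive convention described above.

```python
from functools import cmp_to_key

def compare_length_then_alpha(a, b):
    # Old-style comparison: negative if a < b, zero if equal, positive if a > b.
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)

words = ['pear', 'fig', 'banana', 'kiwi', 'apple']
result = sorted(words, key=cmp_to_key(compare_length_then_alpha))

print(result)  # ['fig', 'kiwi', 'pear', 'apple', 'banana']
```

The expression `(a > b) - (a < b)` is a common replacement for the cmp() built-in, which no longer exists in Python 3.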

8) Other notes

* For locale-aware sorting, use locale.strxfrm() as the key function or locale.strcoll() as the cmp function.
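
Both forms can be sketched as follows. This example is written for Python 3 and assumes the environment's default locale is acceptable; the try/except fallback and the sample words are choices made for this illustration. Because cmp no longer exists in Python 3, locale.strcoll is adapted with functools.cmp_to_key.

```python
import locale
from functools import cmp_to_key

# Try to adopt the user's default collation rules; keep the current
# locale if the environment names one that is not installed.
try:
    locale.setlocale(locale.LC_COLLATE, "")
except locale.Error:
    pass

words = ['cherry', 'apple', 'banana']

# Key-function form
by_key = sorted(words, key=locale.strxfrm)
# Comparison-function form, adapted for Python 3
by_cmp = sorted(words, key=cmp_to_key(locale.strcoll))

print(by_key)  # ['apple', 'banana', 'cherry']
```

The locale only changes the result for text where collation rules differ, such as accented letters or mixed case.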

* The reverse parameter still maintains sort stability, so records with equal keys keep their original order. Interestingly, the same effect can be simulated without the parameter by using the built-in reversed() function twice:

>>> data = [('Red', 1), ('Blue', 1), ('Red', 2), ('Blue', 2)]
>>> assert sorted(data, reverse=True) == list(reversed(sorted(reversed(data))))
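
The stability of reverse=True is easier to see when a key function makes some records compare equal. In this sketch, which reuses the same sample data, sorting with reverse=True keeps equal keys in their original order, while reversing an ascending sort flips them:

```python
from operator import itemgetter

data = [('Red', 1), ('Blue', 1), ('Red', 2), ('Blue', 2)]

# reverse=True: keys descend, but equal keys keep their original order.
with_reverse = sorted(data, key=itemgetter(0), reverse=True)
# Reversing an ascending sort also flips the order of equal keys.
flipped = sorted(data, key=itemgetter(0))[::-1]

print(with_reverse)  # [('Red', 1), ('Red', 2), ('Blue', 1), ('Blue', 2)]
print(flipped)       # [('Red', 2), ('Red', 1), ('Blue', 2), ('Blue', 1)]
```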

* The sort routines use the element's __lt__() method when comparing two objects, so it is easy to give a class a standard sort order by defining __lt__(). For example:

>>> Student.__lt__ = lambda self, other: self.age < other.age
>>> sorted(student_objects)
[('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
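
Instead of attaching the method after the fact, __lt__() can be defined directly in the class body. The sketch below (Python 3) repeats the Student class used earlier with that one addition:

```python
class Student:
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    def __repr__(self):
        return repr((self.name, self.grade, self.age))

    def __lt__(self, other):
        # The sort routines only need "less than" to order instances.
        return self.age < other.age

students = [Student('John', 'A', 15), Student('Jane', 'B', 12), Student('Dave', 'B', 10)]

print(sorted(students))  # [('Dave', 'B', 10), ('Jane', 'B', 12), ('John', 'A', 15)]
print(min(students))     # ('Dave', 'B', 10)
```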

* A key function does not have to depend only on the objects being sorted; it can also use external resources. For example, if the students' grades are stored in a dictionary, that dictionary can be used to sort a separate list of student names:

>>> students = ['Dave', 'John', 'Jane']
>>> newgrades = {'John': 'F', 'Jane': 'A', 'Dave': 'C'}
>>> sorted(students, key=newgrades.__getitem__)
['Jane', 'Dave', 'John']

* When you need to keep data ordered while it is still being added or processed, sort(), sorted(), and bisect.insort() are not the best tools. In that case, consider a heap, a red-black tree, or a treap.
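
Of the structures named above, only the heap ships with the standard library, as the heapq module. The following is a minimal sketch with made-up values, showing items being added one at a time while the smallest stays cheaply available:

```python
import heapq

# A heap keeps the smallest item at index 0 and supports O(log n) insertion,
# so there is no need to re-sort after every new value.
heap = []
for value in [5, 2, 4, 1, 3]:
    heapq.heappush(heap, value)

smallest = heapq.heappop(heap)            # removes and returns the smallest item
two_smallest = heapq.nsmallest(2, heap)   # peeks without modifying the heap
drained = [heapq.heappop(heap) for _ in range(len(heap))]

print(smallest)      # 1
print(two_smallest)  # [2, 3]
print(drained)       # [2, 3, 4, 5]
```

Red-black trees and treaps are not in the standard library, so they would need a third-party package or a custom implementation.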
