Multiple ways to sort data with Python

Source: Internet
Author: User
Tags locale python list

Multiple ways to sort data with Python Catalogue

"Python HowTos Series" Sorting

The Python list has a built-in in-place sort method, List.sort (), and a built-in sorted () function to sort an iterative object (iterable) into a new ordered list.

In this article we will explore a variety of ways to sort data in Python.

Sort Basics

Simple ascending sorting is easy: just call the sorted () function and get an ordered new list:

You can also use the List.sort () method, which is an in-place sort (and returns None to avoid confusion). This is usually less convenient than sorted ()-but it's a little more efficient when you don't need to keep the original list.

Another difference is that the List.sort () method can only be used by the list, whereas the sorted () function can accept any iterator object (iterable).

Key function

Both List.sort () and sorted () have a key parameter that specifies what function is called to process the list element before being compared. For the example, here's a case-insensitive string comparison: For example, case-insensitive String comparisons:

The value of the key parameter should be a function that takes a parameter and returns a key that is used when sorting. This method is fast because each entry calls the key function only once.

A common pattern is to use the subscript of an object as key to sort complex objects. For example:

The same technique can be used on an object with a named attribute (named attributes). For example:

The key function pattern described above is very common, so Python provides some more simple and fast access to the properties of the function. The operator module has Itemgetter (), Attrgetter (), and Methodcaller () functions. Using those functions, the above examples become simpler and faster: Use these functions to make the above example more concise and efficient:

The operator module method allows for multilevel sorting. For example, you can sort by grade first, and then by age:

Both List.sort () and sorted () have Boolean reverse parameters that specify whether to descending. For example, sort student data by the descending order of age:

Ordering is guaranteed to be stable, meaning that when multiple records have the same key, the original order is preserved.

Notice that two blue records remain in their original order, so (' Blue ', 1) must be before (' Blue ', 2).

This awesome property allows you to sort through a series of sorts. For example, the student data is first sorted by grade ascending, and then by age descending, prioritizing age, and then sorting by grade:

The timsort algorithm used by Python can effectively take advantage of the order that is already in the data set, so it can be efficiently ordered in multiple levels.

Old ways to use Decorate-sort-undecorate

The name of the decorate-sort-undecorate is derived from the three steps of this method:

    • In the first step, the initial list is converted to get the new value for sorting.
    • The second step is to sort the list that is converted to the new value.
    • Finally, restore the data and get a list that contains only the original values after sorting.

For example, using the DSU (shorthand) method of decorate-sort-undecorate, sort student data by grade:

>>> decorated = [(Student.grade, I, student) for I, student in Enumerate (student_objects)]

>>> Decorated.sort ()

>>> [student for grade, I, student in decorated] # undecorate

[(' John ', ' A ', '), (' Jane ', ' B ', '), (' Dave ', ' B ', 10)]

This method takes advantage of the characteristics of tuples compared by dictionary order (lexicographically); compares the first item; If the first item is the same, the second item is compared, and so on.

In many cases it is not necessary to include the original subscript i in the processed list (decorated list), but there are two benefits of including the original subscript:

The sort is stable-if two items have the same key, the sorted list retains their order.

The original item does not need to be comparable because the processed tuple can determine the sort by up to the previous two items. For example, the original list contains complex numbers that cannot be directly compared.

There is another name for this method, which is named after the name Randal L. Schwartz, because he makes the transformation popular among Perl programmers.

After the Python sort provides the key function, this technique is no longer used.

Old methods for using CMP parameters

The methods given in this guide assume Python 2.4 or later. Prior to 2.4, sorted () and List.sort () had no key parameters. However, CMP parameters are supported in all py2.x versions to handle user-defined sorting functions.

In Py3.0, the CMP parameters have been completely removed (as part of a simplified and unified language, removing the conflict between sorting and CMP () magic methods).

In py2.x, sort allows an optional function to be passed in, which is called when the comparison is made. The function must accept two parameters for comparison, and then return a negative number that is less than, return 0 for equality, and return a positive number that is greater than. For example, we can do this:

Or you can reverse the comparison order:

When porting code from Python 2.x to 3.x, there may be situations where you need to convert a user-supplied sort function to a key function. The following wrappers are easy to do:

To the key function, you only need to wrap the old comparison function:

In Python 3.2, the Functools.cmp_to_key () function has been added to the Functools module of the standard library.

Other points

For time zone related sorting, use LOCALE.STRXFRM () as the key function, or use Locale.strcoll () as the comparison function.

The reverse parameter still maintains sort stability (so that items of the same key remain in the original order). Interestingly, you can simulate the same effect by calling the built-in reversed () function two times without passing in parameters:

When comparing two objects, sort uses the LT () method. Therefore, you can add a sort order to a class simply by adding the LT () method to the class:

The key function does not need to be directly dependent on the sorted object. The key function can access external resources. For example, if a student's score is kept in a dictionary, the data in the dictionary can be sorted by a single student name:

English Source: Andrew Dalke,raymond Hettinger

Article 36 Big Data, www.36dsj.com, DASHUJU36, 36 Big Data is a website dedicated to big Data entrepreneurship, big data technology and analytics, big data business and applications. Share Big Data dry and big data application cases, provide big data analysis tools and data download, solve the big data industry chain of entrepreneurship, technology, analysis, business, applications and other issues, for the big data industry chain of companies and data industry practitioners to provide support and services.

Multiple ways to sort data with Python

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.