A detailed description of seven classic sorting algorithms based on Python

Source: Internet
Author: User
Tags: sorts

I. Basic concepts and classifications for sorting

So-called sorting is the operation of arranging a series of records in increasing or decreasing order according to the size of one or more of their keywords. A sorting algorithm is the method by which the records are arranged as required.

Sort Stability:

After a sort, if two records with equal keywords keep the same relative order they had in the original unsorted sequence, the sorting method used is said to be stable; otherwise it is unstable.
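Python's built-in sorted() is itself guaranteed to be stable, so stability is easy to demonstrate (an illustrative snippet, not part of the article's seven implementations):

```python
# Records are (keyword, label) pairs; two records share the keyword 5.
records = [(5, 'a'), (3, 'b'), (5, 'c'), (1, 'd')]
# sorted() is a stable sort: records with equal keywords
# keep their original relative order.
result = sorted(records, key=lambda r: r[0])
print(result)  # [(1, 'd'), (3, 'b'), (5, 'a'), (5, 'c')]
```

The two records with keyword 5 come out with 'a' still before 'c', which is exactly what stability means.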

Internal and external sorting

Internal sort: during sorting, all the records to be sorted are kept in memory.

External sort: during sorting, external storage is used as well.

This article discusses internal sorting.

Three factors affect the performance of an internal sorting algorithm:

Time complexity: i.e. time performance; an efficient sorting algorithm should perform as few keyword comparisons and record moves as possible.

Space complexity: mainly the auxiliary space the implementation needs; the less the better.

Algorithm complexity: mainly the complexity of the code itself.

Depending on the main actions that are used during the sorting process, you can divide the inner sort into:

Insert sort

Interchange sort

Select sort

Merge sort

They can also be divided into two categories according to algorithmic complexity:

Simple algorithms: bubble sort, simple selection sort, and direct insertion sort

Improved algorithms: Hill sort, heap sort, merge sort, and quick sort

The following seven sorting algorithms are only the most classic ones, not all sorting algorithms.

II. Bubble sort

Bubble sort: time complexity O(n^2)

An exchange sort. The core idea is to compare the keywords of adjacent records pairwise and swap them whenever they are in reverse order, until no reverse-order pairs remain.

The implementation details can differ; here are three variants:

1. The simplest exchange sort: bubble_sort_simple

2. Bubble sort: bubble_sort

3. Improved bubble sort: bubble_sort_advance

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Bubble sort algorithm


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def bubble_sort_simple(self):
        """The simplest exchange sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            for j in range(i + 1, length):
                if lis[i] > lis[j]:
                    self.swap(i, j)

    def bubble_sort(self):
        """Bubble sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                j -= 1

    def bubble_sort_advance(self):
        """Improved bubble sort, time complexity O(n^2).

        Uses a flag: when a pass performs no swap, the remaining
        elements are already in order. For mostly ordered data
        this improves efficiency somewhat.
        """
        lis = self.r
        length = len(self.r)
        flag = True
        i = 0
        while i < length and flag:
            flag = False
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                    flag = True
                j -= 1
            i += 1

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6])
    # sqlist.bubble_sort_simple()
    # sqlist.bubble_sort()
    sqlist.bubble_sort_advance()
    print(sqlist)



III. Simple selection sort

Simple selection sort: time complexity O(n^2)

Through n - i keyword comparisons, select the record with the smallest keyword from among the n - i + 1 records and swap it with the i-th record (1 <= i <= n).

Put plainly: traverse all the not-yet-sorted elements from beginning to end, keeping track of the subscript of the smallest element, i.e. its position; then swap that element to the front of the current traversal. Its efficiency lies in the fact that each round makes many comparisons but at most one swap. So although its time complexity is also O(n^2), it performs slightly better than the bubble algorithm.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Simple selection sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def select_sort(self):
        """Simple selection sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            minimum = i
            for j in range(i + 1, length):
                if lis[minimum] > lis[j]:
                    minimum = j
            if i != minimum:
                self.swap(i, minimum)

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.select_sort()
    print(sqlist)



IV. Direct insertion sort

Direct insertion sort (straight insertion sort): time complexity O(n^2)

The basic operation is to insert a record into an already sorted ordered table, producing a new ordered table whose record count is increased by 1.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Direct insertion sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def insert_sort(self):
        lis = self.r
        length = len(self.r)
        # subscript starts from 1
        for i in range(1, length):
            if lis[i] < lis[i - 1]:
                temp = lis[i]
                j = i - 1
                while j >= 0 and lis[j] > temp:
                    lis[j + 1] = lis[j]
                    j -= 1
                lis[j + 1] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.insert_sort()
    print(sqlist)



The algorithm requires one record's worth of auxiliary space. In the best case, when the original data is already ordered, only one round of comparisons is needed and no records have to be moved; the complexity is then O(n). However, this is basically a fantasy.



V. Hill sort

Hill sort (Shell sort) is an improved version of insertion sort. Its core idea is to split the original data set into several subsequences, then directly insertion-sort those subsequences so that the whole sequence becomes basically ordered, and finally perform one direct insertion sort over all the records.

The most critical part here is the jumping split strategy: how we divide the data and how large the interval is. It is common to group records separated by a certain "increment" into a subsequence, so that after the direct insertion sorts within the subsequences the overall result is basically ordered rather than merely locally ordered. In the example below, the value of the increment is determined by: increment = int(increment / 3) + 1.

Hill sort time complexity: O(n^(3/2))
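As a quick illustration of that increment formula, the following sketch (the helper name increments is ours, not from the article's code) prints the sequence of gaps the sort will actually use for a given length:

```python
def increments(n):
    """Gap sequence produced by increment = int(increment / 3) + 1."""
    increment = n
    seq = []
    while increment > 1:
        increment = int(increment / 3) + 1
        seq.append(increment)
    return seq

print(increments(100))  # [34, 12, 5, 2, 1]
```

The sequence always ends in 1, so the final pass is a plain direct insertion sort over an already basically ordered list.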

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Hill sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def shell_sort(self):
        """Hill sort."""
        lis = self.r
        length = len(lis)
        increment = len(lis)
        while increment > 1:
            increment = int(increment / 3) + 1
            # direct insertion sort within each increment-spaced subsequence
            for i in range(increment, length):
                if lis[i] < lis[i - increment]:
                    temp = lis[i]
                    j = i - increment
                    while j >= 0 and temp < lis[j]:
                        lis[j + increment] = lis[j]
                        j -= increment
                    lis[j + increment] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.shell_sort()
    print(sqlist)



VI. Heap sort

A heap is a complete binary tree with one of the following properties:

if the value of each branch node is greater than or equal to the values of its left and right children, it is called a big-top heap;

if the value of each branch node is less than or equal to the values of its left and right children, it is called a small-top heap.

Therefore, its root node must be the maximum (or minimum) value among all nodes.



If the nodes are numbered from 1 in level order (breadth-first), the nodes satisfy the following relations: the children of node i are nodes 2i and 2i + 1, and the parent of node i is node ⌊i/2⌋.
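In 0-based Python lists the same numbering relation reads: the children of index i are 2i + 1 and 2i + 2. A small sketch (the helper name is_max_heap is illustrative, not from the article) checks the big-top property directly on a list:

```python
def is_max_heap(lis):
    """Check the big-top heap property on a 0-indexed list."""
    n = len(lis)
    for i in range(n // 2):               # only branch nodes have children
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and lis[i] < lis[left]:
            return False
        if right < n and lis[i] < lis[right]:
            return False
    return True

print(is_max_heap([9, 8, 7, 3, 4, 5, 6]))  # True
print(is_max_heap([1, 8, 7]))              # False: root smaller than a child
```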



Heap sort is a method of sorting that exploits the properties of a big-top or small-top heap. The overall time complexity of heap sort is O(nlogn). (The implementation below uses a big-top heap.)

Its core idea: construct the sequence to be sorted into a big-top heap. At this point the maximum of the entire sequence is the heap's root node. Swap it with the last element of the heap array, then rebuild the remaining n - 1 elements into a big-top heap. Repeat this operation to finally obtain an ordered sequence.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Heap sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def heap_sort(self):
        length = len(self.r)
        # Build the original sequence into a big-top heap.
        # Traverse from the last branch node down to 0.
        i = int(length / 2) - 1
        while i >= 0:
            self.heap_adjust(i, length - 1)
            i -= 1
        # Traverse the sequence in reverse, repeatedly taking the root
        # node's value to complete the actual sort.
        j = length - 1
        while j > 0:
            # Swap the current root (the start of the list, subscript 0)
            # with the last position j
            self.swap(0, j)
            # Rebuild the changed sequence into a big-top heap
            self.heap_adjust(0, j - 1)
            j -= 1

    def heap_adjust(self, s, m):
        """The core of big-top heap construction: sift lis[s] down
        within lis[0..m] to maintain the heap structure."""
        lis = self.r
        temp = lis[s]
        i = 2 * s + 1                  # left child of s (0-indexed)
        while i <= m:
            if i < m and lis[i] < lis[i + 1]:
                i += 1                 # point at the larger child
            if temp >= lis[i]:
                break
            lis[s] = lis[i]
            s = i
            i = 2 * i + 1
        lis[s] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123])
    sqlist.heap_sort()
    print(sqlist)

The running time of heap sort is mainly spent on the repeated sift-down ("filtering") during the initial heap construction and during heap rebuilding.

The time complexity of the initial heap construction is O(n).

During the formal sorting, the time complexity of rebuilding the heap is O(nlogn).

So the overall time complexity of heap sort is O(nlogn).

Heap sort is insensitive to the initial order of the records, so its best, worst, and average time complexity are all O(nlogn). Its performance is better than the bubble, simple selection, and direct insertion algorithms.

In terms of space complexity, only one staging unit is needed for exchanges. However, because the comparisons and exchanges of records jump around, heap sort is also an unstable sorting method.

In addition, because the initial heap construction involves many comparisons, heap sort is not well suited to sequences with few elements.

VII. Merge sort

Merge sort (merging sort): an efficient sorting algorithm based on the merge operation and a very typical application of divide and conquer. Already-ordered subsequences are merged to obtain a completely ordered sequence; that is, first make each subsequence ordered, then make the sequence of subsequences ordered. Merging two ordered tables into one ordered table is called a two-way merge.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Merge sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def merge_sort(self):
        self.msort(self.r, self.r, 0, len(self.r) - 1)

    def msort(self, list_sr, list_tr, s, t):
        temp = [None for i in range(0, len(list_sr))]
        if s == t:
            list_tr[s] = list_sr[s]
        else:
            m = int((s + t) / 2)
            # sort each half into temp, then merge temp into list_tr
            self.msort(list_sr, temp, s, m)
            self.msort(list_sr, temp, m + 1, t)
            self.merge(temp, list_tr, s, m, t)

    def merge(self, list_sr, list_tr, i, m, n):
        """Merge the ordered runs list_sr[i..m] and list_sr[m+1..n] into list_tr."""
        j = m + 1
        k = i
        while i <= m and j <= n:
            if list_sr[i] < list_sr[j]:
                list_tr[k] = list_sr[i]
                i += 1
            else:
                list_tr[k] = list_sr[j]
                j += 1
            k += 1
        # copy whichever run still has elements left
        if i <= m:
            for l in range(0, m - i + 1):
                list_tr[k + l] = list_sr[i + l]
        if j <= n:
            for l in range(0, n - j + 1):
                list_tr[k + l] = list_sr[j + l]

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.merge_sort()
    print(sqlist)

Merge sort is not sensitive to the distribution of the original sequence's elements; its time complexity is O(nlogn).

Merge sort requires a certain amount of auxiliary space for recursion and for storing intermediate results, so its space complexity is O(n + logn).

There are no jumps in merge sort, only pairwise comparisons, so it is a stable sort.

In summary, merge sort is a comparatively memory-intensive, but efficient and stable, algorithm.

VIII. Quick sort

Quick sort, invented by Turing Award winner Tony Hoare, is listed as one of the top ten algorithms of the 20th century. It is an upgraded version of bubble sort and belongs to the exchange sorts. The time complexity of quick sort is O(nlogn).

The core idea of the quick sort algorithm is to split the records to be sorted into two independent parts in one pass, such that the keywords of one part are all smaller than those of the other, and then sort each of the two parts separately, thereby achieving an ordered whole.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Quick sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def quick_sort(self):
        """Call entry point."""
        self.qsort(0, len(self.r) - 1)

    def qsort(self, low, high):
        """Recursive call."""
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)

    def partition(self, low, high):
        """The core code of quick sort.

        Keeps exchanging around the chosen pivot_key: values smaller
        than it go to its left, values larger than it go to its right.
        The pivot's own position also changes with each exchange, until
        every exchange is done; within one call, however, the value of
        pivot_key never changes.

        :param low: left boundary subscript
        :param high: right boundary subscript
        :return: subscript of pivot_key's final position
        """
        lis = self.r
        pivot_key = lis[low]
        while low < high:
            while low < high and lis[high] >= pivot_key:
                high -= 1
            self.swap(low, high)
            while low < high and lis[low] <= pivot_key:
                low += 1
            self.swap(low, high)
        return low

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123])
    sqlist.quick_sort()
    print(sqlist)


The time performance of a quick sort depends on the depth of recursion.

When pivot_key happens to be the median of the record keywords, the two partitions are balanced and the recursion tree is close to a balanced binary tree; the complexity is then O(nlogn).

When the original records are in forward or reverse order, the partition result is a skewed tree of depth n - 1; each partition pass costs n - i comparisons, for a final time complexity of O(n^2).

In general, it can be proved by mathematical induction that the average time complexity of quick sort is O(nlogn).

However, because the comparisons and exchanges of keywords jump around, quick sort is an unstable sort.

At the same time, because of the recursion, the algorithm needs some auxiliary space; its space complexity is O(logn).

Basic quick sort still has places that can be optimized:

1. Optimize the selection of pivot_key

Each time, the pivot_key we select is the first element of the subsequence, lis[low], which is essentially a gamble. When we are lucky and the value is close to the median of the whole sequence, the constructed recursion tree is fairly balanced; when luck is poor and the value is near the maximum or minimum, the tree is close to a skewed tree.

To make the pivot_key selection as moderate as possible, take the values at three special positions (left end, middle, right end) of the selected range and use the median of the three as pivot_key. This is usually better than directly using lis[low]. In the code, add the following lines before the original pivot_key = lis[low] line:

    m = low + int((high - low) / 2)    # subscript of the middle element
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)              # the median of the three now sits at lis[low]



If this is still not good enough, you can also divide the whole sequence into three parts, find a pivot_key for each, and then compare the three pivot_keys as above to obtain the final pivot_key. That pivot_key is then, with very high probability, a fairly reliable value.
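A sketch of that "median of three medians of three" idea, sometimes called the ninther; the helper names median3 and ninther are ours, not from the article's code, and the sampling positions are one reasonable choice among many:

```python
def median3(a, i, j, k):
    """Return whichever of i, j, k indexes the median of a[i], a[j], a[k]."""
    if a[i] < a[j]:
        if a[j] < a[k]:
            return j          # a[i] < a[j] < a[k]
        elif a[i] < a[k]:
            return k          # a[i] < a[k] <= a[j]
        else:
            return i          # a[k] <= a[i] < a[j]
    else:
        if a[i] < a[k]:
            return i          # a[j] <= a[i] < a[k]
        elif a[j] < a[k]:
            return k          # a[j] < a[k] <= a[i]
        else:
            return j          # a[k] <= a[j] <= a[i]


def ninther(a, low, high):
    """Pivot index chosen as the median of three medians of three."""
    n = high - low
    if n < 8:                 # too few elements for nine samples
        return low + n // 2
    step = n // 4
    mid = low + n // 2
    m1 = median3(a, low, low + step, low + 2 * step)
    m2 = median3(a, mid - step, mid, mid + step)
    m3 = median3(a, high - 2 * step, high - step, high)
    return median3(a, m1, m2, m3)
```

Swapping a[ninther(a, low, high)] to a[low] before partitioning would then play the same role as the median-of-three code above, just with nine samples instead of three.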

2. Reduce unnecessary exchanges

In the original code, the record holding pivot_key is constantly being swapped around. This is actually unnecessary: the value can simply be kept in a temporary variable, as follows:

def partition(self, low, high):
    lis = self.r
    # median-of-three selection from optimization 1
    m = low + int((high - low) / 2)
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)
    pivot_key = lis[low]
    # temp stages the pivot_key value
    temp = pivot_key
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        # direct assignment instead of swapping
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    lis[low] = temp
    return low



3. Optimize the sorting of small arrays

The recursive operations of quick sort are acceptable and pay off when sorting a large amount of data, but for small arrays it is slower than direct insertion sort: overkill, and not necessarily faster than the simpler tool.

Therefore, a very simple approach is to choose which algorithm to use based on the amount of data. The qsort method is rewritten as follows:

def qsort(self, low, high):
    """Depending on the length of the sequence, choose quick sort or simple insertion sort."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low < max_length:
        # insert_sort is the simple insertion sort algorithm we wrote earlier
        self.insert_sort()
    else:
        pivot = self.partition(low, high)
        self.qsort(low, pivot - 1)
        self.qsort(pivot + 1, high)


4. Optimize the recursive operations

The recursion of the whole algorithm can be reduced by a tail-recursion-style loop. The qsort method is rewritten as follows:

def qsort(self, low, high):
    """Depending on the length of the sequence, choose quick sort or simple insertion sort."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low < max_length:
        # insert_sort is the simple insertion sort algorithm we wrote earlier
        self.insert_sort()
    else:
        # use a while loop instead of the second recursive call
        while low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            # tail-recursion style: handle the right part in the next loop pass
            low = pivot + 1



IX. Sorting algorithm summary

Classification of sorting algorithms:




There is no perfect algorithm; every one has its shortcomings. Even the quick sort algorithm, superior in overall performance, has drawbacks: it is unstable, needs a fair amount of auxiliary space, and is not well suited to sorting small amounts of data.

Performance comparison of seven sorting algorithms



If the sequence to be sorted is basically ordered, use a simple algorithm directly rather than a sophisticated improved algorithm.

Merge sort and quick sort perform well, but require more auxiliary space; they trade space for time.

The fewer the elements in the sequence, the more suitable a simple sorting method; the more elements, the more suitable an improved sorting algorithm.

Simple selection sort does not excel in time performance, but it scores high on space utilization. It is particularly suitable for sorting when the amount of data is not large but each record carries a large amount of information.
