A detailed description of seven classic sorting algorithms based on Python

Source: Internet
Author: User
Tags: sorts

I. Basic concepts and classifications for sorting

So-called sorting is the operation of arranging a series of records in increasing or decreasing order according to the size of one or more of their keywords. A sorting algorithm is the method by which the records are arranged as required.

Sort Stability:

After a sort, if two records with equal keywords keep the same relative order they had in the original unsorted sequence, the sorting method used is said to be stable; otherwise it is unstable.
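Python's built-in sorted() is itself guaranteed to be stable, so stability is easy to demonstrate (an illustrative snippet, not part of the article's seven implementations):

```python
# Records are (keyword, label) pairs; two records share the keyword 5.
records = [(5, 'a'), (3, 'b'), (5, 'c'), (1, 'd')]
# sorted() is a stable sort: records with equal keywords
# keep their original relative order.
result = sorted(records, key=lambda r: r[0])
print(result)  # [(1, 'd'), (3, 'b'), (5, 'a'), (5, 'c')]
```

The two records with keyword 5 come out with 'a' still before 'c', which is exactly what stability means.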

Internal and external sorting

Internal sort: during sorting, all the records to be sorted are kept in memory.

External sort: during sorting, external storage is used as well.

This article discusses internal sorting.

Three factors affect the performance of an internal sorting algorithm:

Time complexity: i.e. time performance; an efficient sorting algorithm should perform as few keyword comparisons and record moves as possible.

Space complexity: mainly the auxiliary space the implementation needs; the less the better.

Algorithm complexity: mainly the complexity of the code itself.

Depending on the main actions that are used during the sorting process, you can divide the inner sort into:

Insert sort

Interchange sort

Select sort

Merge sort

They can also be divided into two categories according to algorithmic complexity:

Simple algorithms: bubble sort, simple selection sort, and direct insertion sort

Improved algorithms: Hill sort, heap sort, merge sort, and quick sort

The following seven sorting algorithms are only the most classic ones, not all sorting algorithms.

II. Bubble sort

Bubble sort: time complexity O(n^2)

An exchange sort. The core idea is to compare the keywords of adjacent records pairwise and swap them whenever they are in reverse order, until no reverse-order pairs remain.

The implementation details can differ; here are three variants:

1. The simplest exchange sort: bubble_sort_simple

2. Bubble sort: bubble_sort

3. Improved bubble sort: bubble_sort_advance

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Bubble sort algorithm


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def bubble_sort_simple(self):
        """The simplest exchange sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            for j in range(i + 1, length):
                if lis[i] > lis[j]:
                    self.swap(i, j)

    def bubble_sort(self):
        """Bubble sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                j -= 1

    def bubble_sort_advance(self):
        """Improved bubble sort, time complexity O(n^2).

        Uses a flag: when a pass performs no swap, the remaining
        elements are already in order. For mostly ordered data
        this improves efficiency somewhat.
        """
        lis = self.r
        length = len(self.r)
        flag = True
        i = 0
        while i < length and flag:
            flag = False
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                    flag = True
                j -= 1
            i += 1

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6])
    # sqlist.bubble_sort_simple()
    # sqlist.bubble_sort()
    sqlist.bubble_sort_advance()
    print(sqlist)



III. Simple selection sort

Simple selection sort: time complexity O(n^2)

Through n - i keyword comparisons, select the record with the smallest keyword from among the n - i + 1 records and swap it with the i-th record (1 <= i <= n).

Put plainly: traverse all the not-yet-sorted elements from beginning to end, keeping track of the subscript of the smallest element, i.e. its position; then swap that element to the front of the current traversal. Its efficiency lies in the fact that each round makes many comparisons but at most one swap. So although its time complexity is also O(n^2), it performs slightly better than the bubble algorithm.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Simple selection sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def select_sort(self):
        """Simple selection sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            minimum = i
            for j in range(i + 1, length):
                if lis[minimum] > lis[j]:
                    minimum = j
            if i != minimum:
                self.swap(i, minimum)

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.select_sort()
    print(sqlist)



IV. Direct insertion sort

Direct insertion sort (straight insertion sort): time complexity O(n^2)

The basic operation is to insert a record into an already sorted ordered table, producing a new ordered table whose record count is increased by 1.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Direct insertion sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def insert_sort(self):
        lis = self.r
        length = len(self.r)
        # subscript starts from 1
        for i in range(1, length):
            if lis[i] < lis[i - 1]:
                temp = lis[i]
                j = i - 1
                while j >= 0 and lis[j] > temp:
                    lis[j + 1] = lis[j]
                    j -= 1
                lis[j + 1] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.insert_sort()
    print(sqlist)



The algorithm requires one record's worth of auxiliary space. In the best case, when the original data is already ordered, only one round of comparisons is needed and no records have to be moved; the complexity is then O(n). However, this is basically a fantasy.



V. Hill sort

Hill sort (Shell sort) is an improved version of insertion sort. Its core idea is to split the original data set into several subsequences, then directly insertion-sort those subsequences so that the whole sequence becomes basically ordered, and finally perform one direct insertion sort over all the records.

The most critical part here is the jumping split strategy: how we divide the data and how large the interval is. It is common to group records separated by a certain "increment" into a subsequence, so that after the direct insertion sorts within the subsequences the overall result is basically ordered rather than merely locally ordered. In the example below, the value of the increment is determined by: increment = int(increment / 3) + 1.

Hill sort time complexity: O(n^(3/2))
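As a quick illustration of that increment formula, the following sketch (the helper name increments is ours, not from the article's code) prints the sequence of gaps the sort will actually use for a given length:

```python
def increments(n):
    """Gap sequence produced by increment = int(increment / 3) + 1."""
    increment = n
    seq = []
    while increment > 1:
        increment = int(increment / 3) + 1
        seq.append(increment)
    return seq

print(increments(100))  # [34, 12, 5, 2, 1]
```

The sequence always ends in 1, so the final pass is a plain direct insertion sort over an already basically ordered list.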

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Hill sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def shell_sort(self):
        """Hill sort."""
        lis = self.r
        length = len(lis)
        increment = len(lis)
        while increment > 1:
            increment = int(increment / 3) + 1
            # direct insertion sort within each increment-spaced subsequence
            for i in range(increment, length):
                if lis[i] < lis[i - increment]:
                    temp = lis[i]
                    j = i - increment
                    while j >= 0 and temp < lis[j]:
                        lis[j + increment] = lis[j]
                        j -= increment
                    lis[j + increment] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.shell_sort()
    print(sqlist)



VI. Heap sort

A heap is a complete binary tree with one of the following properties:

if the value of each branch node is greater than or equal to the values of its left and right children, it is called a big-top heap;

if the value of each branch node is less than or equal to the values of its left and right children, it is called a small-top heap.

Therefore, its root node must be the maximum (or minimum) value among all nodes.



If the nodes are numbered from 1 in level order (breadth-first), the nodes satisfy the following relations: the children of node i are nodes 2i and 2i + 1, and the parent of node i is node ⌊i/2⌋.
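In 0-based Python lists the same numbering relation reads: the children of index i are 2i + 1 and 2i + 2. A small sketch (the helper name is_max_heap is illustrative, not from the article) checks the big-top property directly on a list:

```python
def is_max_heap(lis):
    """Check the big-top heap property on a 0-indexed list."""
    n = len(lis)
    for i in range(n // 2):               # only branch nodes have children
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and lis[i] < lis[left]:
            return False
        if right < n and lis[i] < lis[right]:
            return False
    return True

print(is_max_heap([9, 8, 7, 3, 4, 5, 6]))  # True
print(is_max_heap([1, 8, 7]))              # False: root smaller than a child
```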



Heap sort is a method of sorting that exploits the properties of a big-top or small-top heap. The overall time complexity of heap sort is O(nlogn). (The implementation below uses a big-top heap.)

Its core idea: construct the sequence to be sorted into a big-top heap. At this point the maximum of the entire sequence is the heap's root node. Swap it with the last element of the heap array, then rebuild the remaining n - 1 elements into a big-top heap. Repeat this operation to finally obtain an ordered sequence.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Heap sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def heap_sort(self):
        length = len(self.r)
        # Build the original sequence into a big-top heap.
        # Traverse from the last branch node down to 0.
        i = int(length / 2) - 1
        while i >= 0:
            self.heap_adjust(i, length - 1)
            i -= 1
        # Traverse the sequence in reverse, repeatedly taking the root
        # node's value to complete the actual sort.
        j = length - 1
        while j > 0:
            # Swap the current root (the start of the list, subscript 0)
            # with the last position j
            self.swap(0, j)
            # Rebuild the changed sequence into a big-top heap
            self.heap_adjust(0, j - 1)
            j -= 1

    def heap_adjust(self, s, m):
        """The core of big-top heap construction: sift lis[s] down
        within lis[0..m] to maintain the heap structure."""
        lis = self.r
        temp = lis[s]
        i = 2 * s + 1                  # left child of s (0-indexed)
        while i <= m:
            if i < m and lis[i] < lis[i + 1]:
                i += 1                 # point at the larger child
            if temp >= lis[i]:
                break
            lis[s] = lis[i]
            s = i
            i = 2 * i + 1
        lis[s] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123])
    sqlist.heap_sort()
    print(sqlist)

The running time of heap sort is mainly spent on the repeated sift-down ("filtering") during the initial heap construction and during heap rebuilding.

The time complexity of the initial heap construction is O(n).

During the formal sorting, the time complexity of rebuilding the heap is O(nlogn).

So the overall time complexity of heap sort is O(nlogn).

Heap sort is insensitive to the initial order of the records, so its best, worst, and average time complexity are all O(nlogn). Its performance is better than the bubble, simple selection, and direct insertion algorithms.

In terms of space complexity, only one staging unit is needed for exchanges. However, because the comparisons and exchanges of records jump around, heap sort is also an unstable sorting method.

In addition, because the initial heap construction involves many comparisons, heap sort is not well suited to sequences with few elements.

VII. Merge sort

Merge sort (merging sort): an efficient sorting algorithm based on the merge operation and a very typical application of divide and conquer. Already-ordered subsequences are merged to obtain a completely ordered sequence; that is, first make each subsequence ordered, then make the sequence of subsequences ordered. Merging two ordered tables into one ordered table is called a two-way merge.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Merge sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def merge_sort(self):
        self.msort(self.r, self.r, 0, len(self.r) - 1)

    def msort(self, list_sr, list_tr, s, t):
        temp = [None for i in range(0, len(list_sr))]
        if s == t:
            list_tr[s] = list_sr[s]
        else:
            m = int((s + t) / 2)
            # sort each half into temp, then merge temp into list_tr
            self.msort(list_sr, temp, s, m)
            self.msort(list_sr, temp, m + 1, t)
            self.merge(temp, list_tr, s, m, t)

    def merge(self, list_sr, list_tr, i, m, n):
        """Merge the ordered runs list_sr[i..m] and list_sr[m+1..n] into list_tr."""
        j = m + 1
        k = i
        while i <= m and j <= n:
            if list_sr[i] < list_sr[j]:
                list_tr[k] = list_sr[i]
                i += 1
            else:
                list_tr[k] = list_sr[j]
                j += 1
            k += 1
        # copy whichever run still has elements left
        if i <= m:
            for l in range(0, m - i + 1):
                list_tr[k + l] = list_sr[i + l]
        if j <= n:
            for l in range(0, n - j + 1):
                list_tr[k + l] = list_sr[j + l]

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.merge_sort()
    print(sqlist)

Merge sort is not sensitive to the distribution of the original sequence's elements; its time complexity is O(nlogn).

Merge sort requires a certain amount of auxiliary space for recursion and for storing intermediate results, so its space complexity is O(n + logn).

There are no jumps in merge sort, only pairwise comparisons, so it is a stable sort.

In summary, merge sort is a comparatively memory-intensive, but efficient and stable, algorithm.

VIII. Quick sort

Quick sort, invented by Turing Award winner Tony Hoare, is listed as one of the top ten algorithms of the 20th century. It is an upgraded version of bubble sort and belongs to the exchange sorts. The time complexity of quick sort is O(nlogn).

The core idea of the quick sort algorithm is to split the records to be sorted into two independent parts in one pass, such that the keywords of one part are all smaller than those of the other, and then sort each of the two parts separately, thereby achieving an ordered whole.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Quick sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements, convenient for later calls."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def quick_sort(self):
        """Call entry point."""
        self.qsort(0, len(self.r) - 1)

    def qsort(self, low, high):
        """Recursive call."""
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)

    def partition(self, low, high):
        """The core code of quick sort.

        Keeps exchanging around the chosen pivot_key: values smaller
        than it go to its left, values larger than it go to its right.
        The pivot's own position also changes with each exchange, until
        every exchange is done; within one call, however, the value of
        pivot_key never changes.

        :param low: left boundary subscript
        :param high: right boundary subscript
        :return: subscript of pivot_key's final position
        """
        lis = self.r
        pivot_key = lis[low]
        while low < high:
            while low < high and lis[high] >= pivot_key:
                high -= 1
            self.swap(low, high)
            while low < high and lis[low] <= pivot_key:
                low += 1
            self.swap(low, high)
        return low

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123])
    sqlist.quick_sort()
    print(sqlist)


The time performance of a quick sort depends on the depth of recursion.

When pivot_key happens to be the median of the record keywords, the two partitions are balanced and the recursion tree is close to a balanced binary tree; the complexity is then O(nlogn).

When the original records are in forward or reverse order, the partition result is a skewed tree of depth n - 1; each partition pass costs n - i comparisons, for a final time complexity of O(n^2).

In general, it can be proved by mathematical induction that the average time complexity of quick sort is O(nlogn).

However, because the comparisons and exchanges of keywords jump around, quick sort is an unstable sort.

At the same time, because of the recursion, the algorithm needs some auxiliary space; its space complexity is O(logn).

Basic quick sort still has places that can be optimized:

1. Optimize the selection of pivot_key

Each time, the pivot_key we select is the first element of the subsequence, lis[low], which is essentially a gamble. When we are lucky and the value is close to the median of the whole sequence, the constructed recursion tree is fairly balanced; when luck is poor and the value is near the maximum or minimum, the tree is close to a skewed tree.

To make the pivot_key selection as moderate as possible, take the values at three special positions (left end, middle, right end) of the selected range and use the median of the three as pivot_key. This is usually better than directly using lis[low]. In the code, add the following lines before the original pivot_key = lis[low] line:

    m = low + int((high - low) / 2)    # subscript of the middle element
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)              # the median of the three now sits at lis[low]



If this is still not good enough, you can also divide the whole sequence into three parts, find a pivot_key for each, and then compare the three pivot_keys as above to obtain the final pivot_key. That pivot_key is then, with very high probability, a fairly reliable value.
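A sketch of that "median of three medians of three" idea, sometimes called the ninther; the helper names median3 and ninther are ours, not from the article's code, and the sampling positions are one reasonable choice among many:

```python
def median3(a, i, j, k):
    """Return whichever of i, j, k indexes the median of a[i], a[j], a[k]."""
    if a[i] < a[j]:
        if a[j] < a[k]:
            return j          # a[i] < a[j] < a[k]
        elif a[i] < a[k]:
            return k          # a[i] < a[k] <= a[j]
        else:
            return i          # a[k] <= a[i] < a[j]
    else:
        if a[i] < a[k]:
            return i          # a[j] <= a[i] < a[k]
        elif a[j] < a[k]:
            return k          # a[j] < a[k] <= a[i]
        else:
            return j          # a[k] <= a[j] <= a[i]


def ninther(a, low, high):
    """Pivot index chosen as the median of three medians of three."""
    n = high - low
    if n < 8:                 # too few elements for nine samples
        return low + n // 2
    step = n // 4
    mid = low + n // 2
    m1 = median3(a, low, low + step, low + 2 * step)
    m2 = median3(a, mid - step, mid, mid + step)
    m3 = median3(a, high - 2 * step, high - step, high)
    return median3(a, m1, m2, m3)
```

Swapping a[ninther(a, low, high)] to a[low] before partitioning would then play the same role as the median-of-three code above, just with nine samples instead of three.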

2. Reduce unnecessary exchanges

In the original code, the record holding pivot_key is constantly being swapped around. This is actually unnecessary: the value can simply be kept in a temporary variable, as follows:

def partition(self, low, high):
    lis = self.r
    # median-of-three selection from optimization 1
    m = low + int((high - low) / 2)
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)
    pivot_key = lis[low]
    # temp stages the pivot_key value
    temp = pivot_key
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        # direct assignment instead of swapping
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    lis[low] = temp
    return low



3. Optimize the sorting of small arrays

The recursive operations of quick sort are acceptable and pay off when sorting a large amount of data, but for small arrays it is slower than direct insertion sort: overkill, and not necessarily faster than the simpler tool.

Therefore, a very simple approach is to choose which algorithm to use based on the amount of data. The qsort method is rewritten as follows:

def qsort(self, low, high):
    """Depending on the length of the sequence, choose quick sort or simple insertion sort."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low < max_length:
        # insert_sort is the simple insertion sort algorithm we wrote earlier
        self.insert_sort()
    else:
        pivot = self.partition(low, high)
        self.qsort(low, pivot - 1)
        self.qsort(pivot + 1, high)


4. Optimize the recursive operations

The recursion of the whole algorithm can be reduced by a tail-recursion-style loop. The qsort method is rewritten as follows:

def qsort(self, low, high):
    """Depending on the length of the sequence, choose quick sort or simple insertion sort."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low < max_length:
        # insert_sort is the simple insertion sort algorithm we wrote earlier
        self.insert_sort()
    else:
        # use a while loop instead of the second recursive call
        while low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            # tail-recursion style: handle the right part in the next loop pass
            low = pivot + 1



IX. Sorting algorithm summary

Classification of sorting algorithms:




There is no perfect algorithm; every one has its shortcomings. Even the quick sort algorithm, superior in overall performance, has drawbacks: it is unstable, needs a fair amount of auxiliary space, and is not well suited to sorting small amounts of data.

Performance comparison of seven sorting algorithms



If the sequence to be sorted is basically ordered, use a simple algorithm directly rather than a sophisticated improved algorithm.

Merge sort and quick sort perform well, but require more auxiliary space; they trade space for time.

The fewer the elements in the sequence, the more suitable a simple sorting method; the more elements, the more suitable an improved sorting algorithm.

Simple selection sort does not excel in time performance, but it scores high on space utilization. It is particularly suitable for sorting when the amount of data is not large but each record carries a large amount of information.
