Seven Classical Sorting Algorithms in Python

Source: Internet
Author: User

I. Basic Concepts and Classification of Sorting

Sorting is the operation of arranging a series of records in ascending or descending order according to the size of one or more of their keys. A sorting algorithm is the method by which the records are put into that order.

Stability of a sort:

If, after sorting, any two records with equal keys keep the same relative order they had in the original unordered input, the sorting method is stable; otherwise it is unstable.
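Stability can be observed directly in Python, whose built-in `sorted` is guaranteed to be stable (a small illustration, not part of the original text; the records and labels are made up):

```python
# Two records share the sort key 1; the labels let us observe their order.
records = [(1, "a"), (2, "b"), (1, "c")]

# Python's built-in sorted() is stable: records with equal keys
# keep the relative order they had in the input.
result = sorted(records, key=lambda rec: rec[0])
print(result)  # [(1, 'a'), (1, 'c'), (2, 'b')] -- 'a' still precedes 'c'
```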

Internal and external sorting

Internal sort: all records to be sorted are kept in memory throughout the sorting process.

External sort: external storage is used during the sorting process.

What is usually discussed is internal sorting.

Three factors affect the performance of an internal sorting algorithm:

    • Time complexity: an efficient sorting algorithm should perform as few key comparisons and record moves as possible.
    • Space complexity: the auxiliary space required by the algorithm; the less the better.
    • Algorithmic complexity: mainly the complexity of the code itself.

According to the main operation of the sorting process, the inner sort can be divided into:

    • Insert Sort
    • Exchange sort
    • Select sort
    • Merge sort

According to algorithmic complexity, they can be divided into two categories:

    • Simple algorithms: bubble sort, simple selection sort, and direct insertion sort
    • Improved algorithms: Shell sort, heap sort, merge sort, and quick sort

The following seven sorting algorithms are just the most classic of all sorting algorithms and do not represent all of them.

II. Bubble Sort

Bubble sort: time complexity O(n^2)

Bubble sort is an exchange sort. Its core idea: compare the keys of adjacent records pairwise and swap them whenever they are in reverse order, until no reversed pairs remain.

The implementation details can vary, as in the following three versions:

1. The simplest sort implementation: bubble_sort_simple

2. Bubble Sort: bubble_sort

3. Improved bubble sort: bubble_sort_advance

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Bubble sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements to facilitate subsequent invocation."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def bubble_sort_simple(self):
        """The simplest exchange sort, time complexity O(n^2)"""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            for j in range(i + 1, length):
                if lis[i] > lis[j]:
                    self.swap(i, j)

    def bubble_sort(self):
        """Bubble sort, time complexity O(n^2)"""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                j -= 1

    def bubble_sort_advance(self):
        """Improved bubble sort, time complexity O(n^2).
        Uses a flag: when a full pass performs no swap, the remaining
        elements are already ordered. For mostly ordered input this
        improves efficiency."""
        lis = self.r
        length = len(self.r)
        flag = True
        i = 0
        while i < length and flag:
            flag = False
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                    flag = True
                j -= 1
            i += 1

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6])
    # sqlist.bubble_sort_simple()
    # sqlist.bubble_sort()
    sqlist.bubble_sort_advance()
    print(sqlist)

III. Simple Selection Sort

Simple selection sort: time complexity O(n^2)

In pass i (1 <= i <= n), the record with the smallest key is selected from the n-i+1 remaining records through n-i key comparisons and exchanged with the i-th record.

In plain terms: traverse all elements that have not yet been sorted, remember the index of the smallest one, and then swap that element to the front of the current traversal. Its strength is that each pass makes many comparisons but performs at most one swap. So although its time complexity is also O(n^2), it performs slightly better than bubble sort.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Simple selection sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements to facilitate subsequent invocation."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def select_sort(self):
        """Simple selection sort, time complexity O(n^2)"""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            minimum = i
            for j in range(i + 1, length):
                if lis[minimum] > lis[j]:
                    minimum = j
            if i != minimum:
                self.swap(i, minimum)

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.select_sort()
    print(sqlist)

IV. Direct Insertion Sort

Direct insertion sort: time complexity O(n^2)

The basic operation is to insert a record into an already sorted list, producing a new ordered list whose record count is larger by one.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Direct insertion sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def insert_sort(self):
        lis = self.r
        length = len(self.r)
        # subscripts start at 1
        for i in range(1, length):
            if lis[i] < lis[i - 1]:
                temp = lis[i]
                j = i - 1
                while j >= 0 and lis[j] > temp:
                    lis[j + 1] = lis[j]
                    j -= 1
                lis[j + 1] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.insert_sort()
    print(sqlist)

This algorithm requires auxiliary space for one record. In the best case, when the raw data is already ordered, only one pass of comparisons is needed and no records have to be moved, so the time complexity is O(n). In practice, however, that rarely happens.

V. Shell Sort

Shell sort is an improved version of insertion sort. Its core idea: split the original data into several subsequences, run direct insertion sort within each subsequence so that the whole list becomes basically ordered, and then finish with one direct insertion sort over the entire list.

The key is the gap (increment) strategy: how to split the data and what interval to use. It is common to group records that are a fixed "increment" apart into one subsequence, so that the insertion sorting inside the subsequences makes the whole list basically ordered rather than merely locally ordered. The example below uses increment = int(increment / 3) + 1 to determine the gap.

The time complexity of Shell sort is about O(n^(3/2)).

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Shell sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def shell_sort(self):
        """Shell sort"""
        lis = self.r
        length = len(lis)
        increment = len(lis)
        while increment > 1:
            increment = int(increment / 3) + 1
            for i in range(increment, length):
                if lis[i] < lis[i - increment]:
                    temp = lis[i]
                    j = i - increment
                    while j >= 0 and temp < lis[j]:
                        lis[j + increment] = lis[j]
                        j -= increment
                    lis[j + increment] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.shell_sort()
    print(sqlist)

VI. Heap Sort

A heap is a complete binary tree with one of the following properties:

The value of each branch node is greater than or equal to the values of its left and right children, called a max-heap (big-top heap);

The value of each branch node is less than or equal to the values of its left and right children, called a min-heap (small-top heap).

Therefore, the root node is always the largest (or smallest) value among all nodes.

If the nodes are numbered from 1 in level-order (breadth-first) fashion, they satisfy the following relationship: node i has its left child at position 2i, its right child at position 2i + 1, and its parent at position ⌊i / 2⌋.
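The numbering relationship can be checked with a few lines of Python (a small illustrative sketch; the sample heap values are made up):

```python
# 1-based, level-order numbering: node i has children 2i and 2i+1, parent i // 2.
# The heap is stored in a 0-based Python list with a placeholder at index 0
# so that the arithmetic stays 1-based.
heap = [None, 9, 8, 5, 6, 7, 3]

n = len(heap) - 1
for i in range(1, n + 1):
    left, right = 2 * i, 2 * i + 1
    if left <= n:
        assert heap[i] >= heap[left]   # max-heap: parent >= left child
    if right <= n:
        assert heap[i] >= heap[right]  # max-heap: parent >= right child
print("max-heap property holds for all", n, "nodes")
```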

Heap sort uses the properties of a max-heap (or min-heap) to sort. Its overall time complexity is O(nlogn). (The discussion below uses a max-heap.)

The core idea: build the sequence to be sorted into a max-heap; the maximum of the whole sequence is then the heap's root. Swap the root with the last element of the heap array, then rebuild the remaining n-1 elements into a max-heap. Repeating this operation finally yields an ordered sequence.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Heap sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements to facilitate subsequent invocation."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def heap_sort(self):
        length = len(self.r)
        i = int(length / 2)
        # Build the original sequence into a max-heap.
        # Traverse from the middle down to 0: these are the heap's branch nodes.
        while i >= 0:
            self.heap_adjust(i, length - 1)
            i -= 1
        # Traverse the sequence in reverse, repeatedly removing the root value
        # to complete the actual sort.
        j = length - 1
        while j > 0:
            # Swap the current root (subscript 0) with position j,
            self.swap(0, j)
            # then rebuild the changed sequence into a max-heap.
            self.heap_adjust(0, j - 1)
            j -= 1

    def heap_adjust(self, s, m):
        """The core heap-building method: maintain the heap structure of the sequence.
        (The i = 2 * s arithmetic mirrors the 1-based textbook indexing;
        for s = 0 the loop effectively treats index 1 as the root's child,
        which still yields a valid heap shape over the whole list.)"""
        lis = self.r
        temp = lis[s]
        i = 2 * s
        while i <= m:
            if i < m and lis[i] < lis[i + 1]:
                i += 1
            if temp >= lis[i]:
                break
            lis[s] = lis[i]
            s = i
            i *= 2
        lis[s] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123])
    sqlist.heap_sort()
    print(sqlist)

The running time of heap sort is mainly spent on the initial heap construction and the repeated sift-downs when rebuilding the heap.

Building the initial heap has a time complexity of O(n).

During the formal sorting, rebuilding the heap has a time complexity of O(nlogn).

So the overall time complexity of heap sort is O(nlogn).

Heap sort is insensitive to the initial order of the records, so its best, worst, and average time complexity are all O(nlogn). Its performance is better than bubble sort, simple selection sort, and direct insertion sort.

As for space complexity, only one temporary unit is needed for swapping. However, because the comparisons and swaps jump across the array, heap sort is an unstable sorting method.

In addition, because building the initial heap requires many comparisons, heap sort is not well suited to sequences with few elements.

VII. Merge Sort

Merge sort: an efficient sorting algorithm based on the merge operation, and a very typical application of divide and conquer. It merges already ordered subsequences to obtain a fully ordered sequence: first make each subsequence ordered, then make the sequence ordered across subsequence segments. Merging two ordered lists into one ordered list is called a two-way merge.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Merge sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements to facilitate subsequent invocation."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def merge_sort(self):
        self.msort(self.r, self.r, 0, len(self.r) - 1)

    def msort(self, list_sr, list_tr, s, t):
        temp = [None for i in range(0, len(list_sr))]
        if s == t:
            list_tr[s] = list_sr[s]
        else:
            m = int((s + t) / 2)
            self.msort(list_sr, temp, s, m)
            self.msort(list_sr, temp, m + 1, t)
            self.merge(temp, list_tr, s, m, t)

    def merge(self, list_sr, list_tr, i, m, n):
        j = m + 1
        k = i
        while i <= m and j <= n:
            if list_sr[i] < list_sr[j]:
                list_tr[k] = list_sr[i]
                i += 1
            else:
                list_tr[k] = list_sr[j]
                j += 1
            k += 1
        if i <= m:
            for l in range(0, m - i + 1):
                list_tr[k + l] = list_sr[i + l]
        if j <= n:
            for l in range(0, n - j + 1):
                list_tr[k + l] = list_sr[j + l]

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 12])
    sqlist.merge_sort()
    print(sqlist)
    • Merge sort is insensitive to the distribution of the original elements; its time complexity is O(nlogn).
    • Merge sort needs auxiliary space for recursion and for storing intermediate results, so its space complexity is O(n + logn).
    • There are no jumping moves in merge sort, only pairwise comparisons, so it is a stable sort.

In short, merge sort is an efficient and stable algorithm that pays for its performance with extra memory.

VIII. Quick Sort

Quick sort, invented by Turing Award winner Tony Hoare, is listed among the top ten algorithms of the 20th century. It is an exchange sort, essentially an upgraded version of bubble sort. The time complexity of quick sort is O(nlogn).

The core idea: in one pass, split the records to be sorted into two independent parts such that every key in one part is smaller than every key in the other, then sort the two parts separately; the entire collection of records ends up sorted.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# author:liu Jiang
# python 3.5
# Quick sort


class SqList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Defines a method for exchanging elements to facilitate subsequent invocation."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def quick_sort(self):
        """Entry point for quick sort"""
        self.qsort(0, len(self.r) - 1)

    def qsort(self, low, high):
        """Recursive call"""
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)

    def partition(self, low, high):
        """The core code of quick sort.
        The chosen pivot_key is repeatedly swapped so that everything smaller
        moves to its left and everything larger to its right; pivot_key itself
        keeps changing position during the swaps until the partition is done.
        The value of pivot_key never changes during the call.
        :param low: left boundary subscript
        :param high: right boundary subscript
        :return: the subscript of pivot_key's final position
        """
        lis = self.r
        pivot_key = lis[low]
        while low < high:
            while low < high and lis[high] >= pivot_key:
                high -= 1
            self.swap(low, high)
            while low < high and lis[low] <= pivot_key:
                low += 1
            self.swap(low, high)
        return low

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret


if __name__ == '__main__':
    sqlist = SqList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.quick_sort()
    print(sqlist)
    • The time performance of quick sort depends on the depth of the recursion.
    • When pivot_key lands near the middle of the key range, the two partitions are balanced and the recursion tree is close to a balanced binary tree; the time complexity is then O(nlogn).
    • When the original records are in ascending or descending order, each partition produces a skewed tree of depth n-1, pass i needs n-i comparisons, and the time complexity degrades to O(n^2).
    • In general, it can be shown by mathematical induction that the average time complexity of quick sort is O(nlogn).
    • But because key comparisons and swaps jump across the array, quick sort is an unstable sort.
    • At the same time, the recursion requires some auxiliary space; the space complexity is O(logn).

The basic quick sort still has room for optimization:

1. Optimize the selection of pivot_key

Each time we select pivot_key we take the first element of the subsequence, lis[low], which amounts to trusting luck. With good luck the value lies near the middle of the whole sequence and the recursion tree is fairly balanced; with bad luck it lies near the maximum or minimum and the tree degenerates toward a skewed tree.

To make pivot_key as moderate as possible, take the values at three positions -- the left end, the middle, and the right end of the sequence -- and use the median of the three as pivot_key. This is usually better than using lis[low] directly. In the code, add the following before the original pivot_key = lis[low] line:

m = low + int((high - low) / 2)
if lis[low] > lis[high]:
    self.swap(low, high)
if lis[m] > lis[high]:
    self.swap(high, m)
if lis[m] > lis[low]:
    self.swap(m, low)

If that still seems insufficient, you can also split the whole sequence into three parts, pick a candidate pivot_key from each, and then run the same three-way comparison on those three candidates to get the final pivot_key. The result is then very likely a reliable middle value (a "median of nine").
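As a standalone sketch of that "median of nine" idea (the helper names `median3` and `ninther` and the nine sampled positions are illustrative, not from the original code; it assumes high - low >= 8 so the positions are distinct):

```python
def median3(a, b, c):
    # Return the median of three values.
    return sorted([a, b, c])[1]

def ninther(lis, low, high):
    # Sample nine roughly evenly spaced positions in lis[low..high],
    # take the median of each group of three, then the median of the medians.
    n = high - low
    m1 = median3(lis[low], lis[low + n // 8], lis[low + n // 4])
    m2 = median3(lis[low + 3 * n // 8], lis[low + n // 2], lis[low + 5 * n // 8])
    m3 = median3(lis[low + 3 * n // 4], lis[low + 7 * n // 8], lis[high])
    return median3(m1, m2, m3)

# On an already ordered list the ninther lands near the middle.
print(ninther(list(range(16)), 0, 15))  # 7
```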

2. Reduce unnecessary swaps

In the original code the pivot_key record keeps getting swapped around. This is unnecessary: it can be stashed in a temporary variable instead, as follows:

def partition(self, low, high):

    lis = self.r

    m = low + int((high - low) / 2)
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)

    pivot_key = lis[low]
    # temp temporarily holds the pivot_key value
    temp = pivot_key
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        # replace directly instead of swapping
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    lis[low] = temp
    return low

3. Optimize the sorting of small arrays

The recursion of quick sort is acceptable and fast when sorting large amounts of data. But when sorting small arrays it is no quicker than direct insertion sort; using it there is overkill, like slaughtering a chicken with an ox cleaver, and not necessarily faster than a kitchen knife.

So a very simple approach is to choose which algorithm to use based on the amount of data. Rewrite the qsort method as follows:

def qsort(self, low, high):
    """Choose quick sort or direct insertion sort depending on the length of the range."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low >= max_length:
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)
    else:
        # insert_sort is the simple insertion sort algorithm we wrote earlier.
        self.insert_sort()

4. Optimize the recursion

The recursion of the whole algorithm can be optimized with tail recursion; rewrite the qsort method as follows:

def qsort(self, low, high):
    """Choose quick sort or direct insertion sort depending on the length of the range."""
    # 7 is an empirical value that can be adjusted to the actual situation.
    max_length = 7
    if high - low >= max_length:
        # use a while loop instead of recursing on both halves
        while low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            # tail-recursion style: continue the loop on the right half
            low = pivot + 1
    else:
        # insert_sort is the simple insertion sort algorithm we wrote earlier.
        self.insert_sort()

Summary of sorting algorithm

Classification of sorting algorithms:

There is no perfect algorithm; every one has its drawbacks. Even quick sort, for all its excellent overall performance, is unstable, needs a fair amount of auxiliary space, and is not well suited to sorting small amounts of data.

Performance comparison of seven sorting algorithms

    • If the sequence is basically ordered, use a simple algorithm rather than a complex improved one.
    • Merge sort and quick sort perform well but require extra auxiliary space; in effect they trade space for time.
    • The fewer the elements to be sorted, the more appropriate a simple sorting method; the more elements, the more appropriate an improved algorithm.
    • Simple selection sort, though not outstanding in time performance, stands out in data movement: it is especially suitable when the number of records is not large but each record carries a lot of information.
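As a rough, standalone illustration of the simple-vs-improved advice (the input size and the standalone `bubble_sort` helper here are mine, not from the article), one can time a simple O(n^2) algorithm against Python's built-in Timsort, an O(nlogn) improved algorithm:

```python
import random
import time

def bubble_sort(lis):
    # A simple O(n^2) algorithm: the same pass structure as the
    # class-based bubble_sort above, written as a standalone function.
    n = len(lis)
    for i in range(n):
        for j in range(n - 2, i - 1, -1):
            if lis[j] > lis[j + 1]:
                lis[j], lis[j + 1] = lis[j + 1], lis[j]

data = [random.randint(0, 10000) for _ in range(2000)]

t0 = time.perf_counter()
bubble_sort(data[:])
t_bubble = time.perf_counter() - t0

t0 = time.perf_counter()
sorted(data)  # Timsort: an O(n log n) improved algorithm
t_builtin = time.perf_counter() - t0

print("bubble: %.4fs, built-in: %.4fs" % (t_bubble, t_builtin))
```

On a few thousand elements the built-in sort should win by orders of magnitude, while for very small lists the gap largely disappears.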

That is the entire content of this article. I hope it helps your learning, and I hope you will support the Yunqi Community.
