Seven classic sorting algorithms in Python

Source: Internet
Author: User
I. Basic concepts and classification of sorting

Sorting is the operation of arranging a sequence of records in ascending or descending order according to the value of one or more keys. A sorting algorithm is a method for arranging records in this way.

Stability of sorting:

If two records have equal keys and their relative order after sorting is the same as it was in the original unsorted sequence, the sorting method is stable; otherwise it is unstable.
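
To make the definition concrete, here is a small sketch that uses Python's built-in sorted(), which is documented to be stable. The (key, label) records are made up for illustration; the labels only serve to tell apart records that share a key.

```python
# Hypothetical records: (key, label). Two records share key 1, two share key 2.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]

# sorted() is guaranteed to be stable, so records with equal keys keep
# the relative order they had in the input.
by_key = sorted(records, key=lambda rec: rec[0])

print(by_key)  # [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

In the output, "b" still precedes "d" and "a" still precedes "c". An unstable method would be allowed to reverse either pair.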

Internal and external sorting

Internal sorting: all records to be sorted are held in memory for the whole sorting process.

External sorting: external storage is used during sorting.

We usually discuss internal sorting.

Three factors that affect the performance of an internal sorting algorithm:

Time complexity: the time performance. An efficient sorting algorithm should make as few key comparisons and record moves as possible.

Space complexity: mainly the auxiliary space required while the algorithm runs. The smaller, the better.

Algorithm complexity: mainly the complexity of the code itself.

Internal sorting can be divided into:

Insertion sorts

Exchange sorts

Selection sorts

Merge sorts

By algorithm complexity, the algorithms can be divided into two types:

Simple algorithms: bubble sort, simple selection sort, and straight insertion sort.

Improved algorithms: Shell sort, heap sort, merge sort, and quick sort.

These seven are only the most classic sorting algorithms, not all of them.

II. Bubble sort

Bubble sort: time complexity O(n^2)

Bubble sort is an exchange sort. The core idea is to compare the keys of adjacent records in pairs and swap them if they are out of order, until no out-of-order pair remains.

The implementation details can vary. The following are three variants:

1. Simplest exchange sort: bubble_sort_simple

2. Bubble sort: bubble_sort

3. Improved bubble sort: bubble_sort_advance

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Bubble sort algorithms

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def swap(self, i, j):
            """Swap two elements; called by the sorting methods below."""
            temp = self.r[i]
            self.r[i] = self.r[j]
            self.r[j] = temp

        def bubble_sort_simple(self):
            """Simplest exchange sort, time complexity O(n^2)."""
            lis = self.r
            length = len(self.r)
            for i in range(length):
                for j in range(i + 1, length):
                    if lis[i] > lis[j]:
                        self.swap(i, j)

        def bubble_sort(self):
            """Bubble sort, time complexity O(n^2)."""
            lis = self.r
            length = len(self.r)
            for i in range(length):
                j = length - 2
                while j >= i:
                    if lis[j] > lis[j + 1]:
                        self.swap(j, j + 1)
                    j -= 1

        def bubble_sort_advance(self):
            """
            Improved bubble sort, time complexity O(n^2).
            A flag is set: if a whole round makes no exchange, the remaining
            elements are already in order. For input that is already fairly
            ordered, this improves efficiency.
            """
            lis = self.r
            length = len(self.r)
            flag = True
            i = 0
            while i < length and flag:
                flag = False
                j = length - 2
                while j >= i:
                    if lis[j] > lis[j + 1]:
                        self.swap(j, j + 1)
                        flag = True
                    j -= 1
                i += 1

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6])
        # sqlist.bubble_sort_simple()
        # sqlist.bubble_sort()
        sqlist.bubble_sort_advance()
        print(sqlist)
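
The effect of the flag can be made visible by counting rounds. The following is a minimal standalone sketch, not the listing above: the function name bubble_passes and its early_exit parameter are assumptions introduced only for this illustration.

```python
def bubble_passes(data, early_exit):
    """Bubble sort a copy of data; return (sorted_list, number_of_rounds)."""
    lis = list(data)
    passes = 0
    for i in range(len(lis)):
        passes += 1
        swapped = False
        # Walk from the end towards position i, swapping out-of-order neighbours.
        for j in range(len(lis) - 2, i - 1, -1):
            if lis[j] > lis[j + 1]:
                lis[j], lis[j + 1] = lis[j + 1], lis[j]
                swapped = True
        # With the flag enabled, a round without swaps ends the sort.
        if early_exit and not swapped:
            break
    return lis, passes


nearly_sorted = [1, 2, 3, 4, 6, 5, 7, 8, 9]
print(bubble_passes(nearly_sorted, early_exit=False)[1])  # 9
print(bubble_passes(nearly_sorted, early_exit=True)[1])   # 2
```

On this nearly sorted input the flagged version stops after two rounds: one that fixes the single out-of-order pair and one that confirms nothing is left to swap.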



III. Simple selection sort

Simple selection sort: time complexity O(n^2)

In round i (1 <= i <= n), n-i key comparisons select the record with the smallest key from the n-i+1 records not yet sorted, and that record is exchanged with the i-th record.

In plain terms: among all elements that have not yet been sorted, record the index of the smallest one, then swap that element to the front of the current pass. Its advantage is that each round makes many comparisons but at most one swap. So although its time complexity is also O(n^2), it generally performs better than bubble sort.

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Simple selection sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def swap(self, i, j):
            """Swap two elements; called by the sorting method below."""
            temp = self.r[i]
            self.r[i] = self.r[j]
            self.r[j] = temp

        def select_sort(self):
            """Simple selection sort, time complexity O(n^2)."""
            lis = self.r
            length = len(self.r)
            for i in range(length):
                # Index of the smallest element found so far in lis[i:].
                minimum = i
                for j in range(i + 1, length):
                    if lis[minimum] > lis[j]:
                        minimum = j
                # At most one swap per round.
                if i != minimum:
                    self.swap(i, minimum)

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
        sqlist.select_sort()
        print(sqlist)
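
The claim that selection sort swaps far less often than bubble sort can be checked by counting swaps. This is a rough standalone sketch; the two counting functions are hypothetical helpers written for this comparison only.

```python
def selection_sort_swaps(data):
    """Selection sort a copy of data; return (sorted_list, swap_count)."""
    lis = list(data)
    swaps = 0
    for i in range(len(lis)):
        minimum = i
        for j in range(i + 1, len(lis)):
            if lis[j] < lis[minimum]:
                minimum = j
        if minimum != i:
            lis[i], lis[minimum] = lis[minimum], lis[i]
            swaps += 1
    return lis, swaps


def bubble_sort_swaps(data):
    """Bubble sort a copy of data; return (sorted_list, swap_count)."""
    lis = list(data)
    swaps = 0
    for i in range(len(lis)):
        for j in range(len(lis) - 2, i - 1, -1):
            if lis[j] > lis[j + 1]:
                lis[j], lis[j + 1] = lis[j + 1], lis[j]
                swaps += 1
    return lis, swaps


data = [5, 4, 3, 2, 1]
print(selection_sort_swaps(data)[1])  # 2
print(bubble_sort_swaps(data)[1])     # 10
```

Both functions make a number of comparisons that grows quadratically, but selection sort never needs more than n-1 swaps, while bubble sort needs one swap per inverted pair.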



IV. Straight insertion sort

Straight insertion sort: time complexity O(n^2)

The basic operation is to insert a record into an already sorted list, producing a new sorted list with one more record.

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Straight insertion sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def insert_sort(self):
            lis = self.r
            length = len(self.r)
            # Start from index 1; lis[0] alone is already a sorted list.
            for i in range(1, length):
                if lis[i] < lis[i - 1]:
                    temp = lis[i]
                    j = i - 1
                    # Check j >= 0 first so that lis[-1] is never read.
                    while j >= 0 and lis[j] > temp:
                        lis[j + 1] = lis[j]
                        j -= 1
                    lis[j + 1] = temp

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
        sqlist.insert_sort()
        print(sqlist)



This algorithm needs auxiliary space for one record. In the best case, when the input is already sorted, only one pass of comparisons is needed and no records are moved, so the time complexity is O(n). In practice, however, input is rarely already sorted.
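
The gap between the best and worst case can be seen by counting key comparisons. The function below is a hedged standalone sketch (insertion_sort_comparisons is a name chosen for this example), not a replacement for the class above.

```python
def insertion_sort_comparisons(data):
    """Insertion sort a copy of data; return (sorted_list, key_comparisons)."""
    lis = list(data)
    comparisons = 0
    for i in range(1, len(lis)):
        temp = lis[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if lis[j] <= temp:
                break                 # insertion point found
            lis[j + 1] = lis[j]       # shift the larger record one step right
            j -= 1
        lis[j + 1] = temp
    return lis, comparisons


print(insertion_sort_comparisons(range(10))[1])         # 9
print(insertion_sort_comparisons(range(9, -1, -1))[1])  # 45
```

For 10 already sorted records, each new record needs a single comparison (n-1 in total). For 10 records in reverse order, the i-th insertion needs i comparisons, which sums to n(n-1)/2.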

V. Shell sort

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Shell sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def shell_sort(self):
            """Shell sort."""
            lis = self.r
            length = len(lis)
            increment = len(lis)
            while increment > 1:
                increment = increment // 3 + 1
                # With 0-based indexing, the first element that has a partner
                # "increment" positions earlier is at index increment.
                for i in range(increment, length):
                    if lis[i] < lis[i - increment]:
                        temp = lis[i]
                        j = i - increment
                        while j >= 0 and temp < lis[j]:
                            lis[j + increment] = lis[j]
                            j -= increment
                        lis[j + increment] = temp

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
        sqlist.shell_sort()
        print(sqlist)
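
The listing shrinks the gap by dividing it by 3 and adding 1 until it reaches 1. To see which gaps this rule produces, here is a small sketch; shell_gaps is a hypothetical helper that only reproduces the gap update, not the sort itself.

```python
def shell_gaps(length):
    """Return the gaps produced by repeating increment = increment // 3 + 1."""
    gaps = []
    increment = length
    while increment > 1:
        increment = increment // 3 + 1
        gaps.append(increment)
    return gaps


print(shell_gaps(12))   # [5, 2, 1]
print(shell_gaps(100))  # [34, 12, 5, 2, 1]
```

The sequence always ends with a gap of 1, so the final pass is an ordinary insertion sort over data that the earlier passes have already brought close to sorted order.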



VI. Heap sort

A heap is a complete binary tree with one of the following properties:

The value of every branch node is greater than or equal to the values of its left and right children (a max heap);

or the value of every branch node is less than or equal to the values of its left and right children (a min heap).

Therefore, the root node is always the maximum (or minimum) of all nodes.
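
When a complete binary tree is stored in a 0-based Python list, the children of the node at index p sit at indexes 2p+1 and 2p+2. Under that assumption, the max heap property can be checked with a short sketch like this (is_max_heap is a name introduced here for illustration):

```python
def is_max_heap(lis):
    """Return True if the 0-based list lis satisfies the max heap property."""
    n = len(lis)
    # Only indexes below n // 2 can have children (the branch nodes).
    for parent in range(n // 2):
        left = 2 * parent + 1
        right = 2 * parent + 2
        if left < n and lis[parent] < lis[left]:
            return False
        if right < n and lis[parent] < lis[right]:
            return False
    return True


print(is_max_heap([9, 8, 7, 3, 6, 5, 4]))  # True
print(is_max_heap([9, 3, 7, 8, 6, 5, 4]))  # False
```

In the second list the node with value 3 has a child with value 8, which violates the property.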

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Heap sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def swap(self, i, j):
            """Swap two elements; called by the sorting methods below."""
            temp = self.r[i]
            self.r[i] = self.r[j]
            self.r[j] = temp

        def heap_sort(self):
            length = len(self.r)
            # Build a max heap from the original sequence. Start at the last
            # branch node (index length // 2 - 1 in a 0-based list) and
            # work back to the root at index 0.
            i = length // 2 - 1
            while i >= 0:
                self.heap_adjust(i, length - 1)
                i -= 1
            # Walk the sequence backwards, repeatedly taking out the root
            # value; this is the actual sorting phase.
            j = length - 1
            while j > 0:
                # Move the current root (index 0) to position j at the end.
                self.swap(0, j)
                # Restore the max heap on the remaining lis[0..j-1].
                self.heap_adjust(0, j - 1)
                j -= 1

        def heap_adjust(self, s, m):
            """Sift lis[s] down so that lis[s..m] is a max heap again."""
            lis = self.r
            temp = lis[s]
            # In a 0-based list the children of s are 2*s + 1 and 2*s + 2.
            i = 2 * s + 1
            while i <= m:
                if i < m and lis[i] < lis[i + 1]:
                    i += 1
                if temp >= lis[i]:
                    break
                lis[s] = lis[i]
                s = i
                i = 2 * i + 1
            lis[s] = temp

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
        sqlist.heap_sort()
        print(sqlist)


The running time of heap sort is spent mainly on building the initial heap and on the repeated sifting needed to rebuild the heap.

Building the initial heap takes O(n) time.

During the sorting phase, rebuilding the heap takes O(n log n) time in total.

Therefore, the overall time complexity of heap sort is O(n log n).

Heap sort is not sensitive to the initial order of the records, so its best, worst, and average time complexity are all O(n log n). Its performance is better than that of bubble sort, simple selection sort, and straight insertion sort.

In terms of space, only one temporary storage unit is needed for swapping. However, because records are compared and exchanged across non-adjacent positions, heap sort is an unstable sorting method.

In addition, because building the initial heap requires many comparisons, heap sort is not well suited to sorting a small number of records.
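
For comparison, the same two phases (build a heap, then repeatedly remove the root) can be sketched with the standard library's heapq module. Note that heapq implements a min heap and returns a new list, so this is an independent cross-check rather than the in-place max heap method described above; heapq_sort is a name chosen for this example.

```python
import heapq


def heapq_sort(data):
    """Return a sorted copy of data using the standard library min heap."""
    heap = list(data)
    heapq.heapify(heap)  # phase 1: build the heap in O(n)
    # Phase 2: pop the smallest remaining element n times, O(log n) each.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heapq_sort([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22]))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 22, 123]
```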

VII. Merge sort

Merge sort: an efficient sorting algorithm built on the merge operation, and a typical application of divide and conquer. It merges already sorted subsequences to obtain a fully sorted sequence: first each subsequence is sorted, then the sorted subsequences are merged with each other. Merging two sorted lists into one sorted list is called a two-way merge.

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Merge sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def swap(self, i, j):
            """Swap two elements."""
            temp = self.r[i]
            self.r[i] = self.r[j]
            self.r[j] = temp

        def merge_sort(self):
            self.msort(self.r, self.r, 0, len(self.r) - 1)

        def msort(self, list_sr, list_tr, s, t):
            """Sort list_sr[s..t] into list_tr[s..t]."""
            # A fresh full-length buffer per call keeps the code simple,
            # at the cost of extra allocation.
            temp = [None for i in range(0, len(list_sr))]
            if s == t:
                list_tr[s] = list_sr[s]
            else:
                m = (s + t) // 2
                self.msort(list_sr, temp, s, m)
                self.msort(list_sr, temp, m + 1, t)
                self.merge(temp, list_tr, s, m, t)

        def merge(self, list_sr, list_tr, i, m, n):
            """Merge sorted runs list_sr[i..m] and list_sr[m+1..n] into list_tr[i..n]."""
            j = m + 1
            k = i
            while i <= m and j <= n:
                # "<=" takes the left element on ties, which keeps the sort stable.
                if list_sr[i] <= list_sr[j]:
                    list_tr[k] = list_sr[i]
                    i += 1
                else:
                    list_tr[k] = list_sr[j]
                    j += 1
                k += 1
            # Copy whatever is left of either run.
            if i <= m:
                for l in range(0, m - i + 1):
                    list_tr[k + l] = list_sr[i + l]
            if j <= n:
                for l in range(0, n - j + 1):
                    list_tr[k + l] = list_sr[j + l]

        def __str__(self):
            ret = ""
            for i in self.r:
                ret += " %s" % i
            return ret

    if __name__ == '__main__':
        sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 12, 77, 34, 23])
        sqlist.merge_sort()
        print(sqlist)



Merge sort is not sensitive to the distribution of elements in the original sequence; its time complexity is O(n log n).

Merge sort needs auxiliary space for recursion and for storing intermediate results, so its space complexity is O(n + log n).

Merge sort never exchanges records across long distances; it only compares elements pairwise while merging, so it is a stable sort (provided the merge step takes the element from the left run when two keys are equal).

In short, merge sort is an efficient and stable algorithm that uses comparatively more memory.
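
Stability depends on how the merge step breaks ties. The following is a minimal functional sketch, written separately from the class above and assuming (key, label) records invented for the example; merge_sorted and merge_sort are names chosen here.

```python
def merge_sorted(left, right, key):
    """Two-way merge of two sorted lists; ties are taken from the left list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" (not "<") is what keeps equal keys in their original order.
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(records, key=lambda rec: rec):
    """Return a new, stably sorted list."""
    if len(records) <= 1:
        return list(records)
    mid = len(records) // 2
    return merge_sorted(merge_sort(records[:mid], key),
                        merge_sort(records[mid:], key), key)


records = [(2, "a"), (1, "b"), (2, "c"), (1, "d"), (2, "e")]
print(merge_sort(records, key=lambda rec: rec[0]))
# [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'e')]
```

Records with the same key come out in their input order. Replacing "<=" with "<" would still sort by key but could reorder equal records.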

VIII. Quick sort

Quick sort was invented by Tony Hoare, a Turing Award winner, and has been listed as one of the top 10 algorithms of the 20th century. It can be seen as an upgraded version of bubble sort. The average time complexity of quick sort is O(n log n).

The core idea of quick sort: one partitioning pass splits the records to be sorted into two independent parts, such that the keys of all records in one part are no larger than the keys of all records in the other part. The two parts are then sorted separately in the same way, which sorts the entire sequence.

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Author: Liu Jiang
    # Python 3.5
    # Quick sort

    class SQList:
        def __init__(self, lis=None):
            self.r = lis

        def swap(self, i, j):
            """Swap two elements; called by the sorting methods below."""
            temp = self.r[i]
            self.r[i] = self.r[j]
            self.r[j] = temp

        def quick_sort(self):
            """Entry point."""
            self.qsort(0, len(self.r) - 1)

        def qsort(self, low, high):
            """Recursive call."""
            # The source listing is cut off here, right after "if low".
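
Because the listing above is incomplete in the source, the following is only a sketch of a basic quick sort that matches the description in this section: the first element of the range is used as pivot_key, and records are swapped inward from both ends. The standalone functions partition and quick_sort are assumptions for illustration, not the original author's code.

```python
def partition(lis, low, high):
    """Partition lis[low..high] around lis[low]; return the pivot's final index."""
    pivot_key = lis[low]
    while low < high:
        # Scan from the right for a record smaller than the pivot.
        while low < high and lis[high] >= pivot_key:
            high -= 1
        lis[low], lis[high] = lis[high], lis[low]
        # Scan from the left for a record larger than the pivot.
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[low], lis[high] = lis[high], lis[low]
    return low


def quick_sort(lis, low=0, high=None):
    """Sort lis in place and return it."""
    if high is None:
        high = len(lis) - 1
    if low < high:
        pivot = partition(lis, low, high)
        quick_sort(lis, low, pivot - 1)
        quick_sort(lis, pivot + 1, high)
    return lis


print(quick_sort([4, 1, 7, 3, 8, 5, 9, 2, 6]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```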


The time performance of quick sort depends on the depth of the recursion.

When pivot_key happens to be near the median of the keys, the two partitions are balanced, the recursion tree is close to a balanced binary tree, and the time complexity is O(n log n).

When the original records are already in ascending or descending order, the recursion tree degenerates into a skewed tree of depth n-1. The i-th partitioning pass needs n-i comparisons, so the total time complexity is O(n^2).
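
The worst case can be reproduced with a simplified counting model. The sketch below assumes the first element of each part is the pivot and charges one comparison per remaining element per partitioning pass; it builds new lists instead of sorting in place, and count_comparisons is a hypothetical helper used only for counting.

```python
import random


def count_comparisons(data):
    """Count pivot comparisons of a first-element-pivot quick sort (model only)."""
    comparisons = 0
    pending = [list(data)]  # explicit stack, so deep skewed trees are no problem
    while pending:
        part = pending.pop()
        if len(part) < 2:
            continue
        pivot_key, rest = part[0], part[1:]
        comparisons += len(rest)  # each remaining key meets the pivot once
        pending.append([x for x in rest if x < pivot_key])
        pending.append([x for x in rest if x >= pivot_key])
    return comparisons


n = 100
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)

print(count_comparisons(range(n)))         # 4950, i.e. n * (n - 1) / 2
print(count_comparisons(range(n, 0, -1)))  # 4950
print(count_comparisons(shuffled))         # far fewer on shuffled input
```

Sorted and reverse-sorted input both hit the quadratic count, because every pass splits off only the pivot.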

In the average case, it can be proved by mathematical induction that the time complexity of quick sort is O(n log n).

However, because keys are compared and exchanged across non-adjacent positions, quick sort is an unstable sort.

Also, because it uses recursion, the algorithm needs auxiliary stack space; its space complexity is O(log n) on average and O(n) in the worst case.

The basic quick sort can be optimized further:

1. Optimize the choice of pivot_key

The basic version always takes the first element of the subsequence, lis[low], as pivot_key, which is a matter of luck. With good luck, the value is close to the median of the whole sequence and the resulting tree is balanced; with bad luck, it is close to the maximum or minimum and the resulting tree is close to a skewed tree.

To keep pivot_key as close to the median as possible, take the median of the values at three special positions of the sequence: the left end, the middle, and the right end (median of three). This is usually faster than using lis[low] directly. In the code, add the following before the original pivot_key = lis[low] line:

    m = low + int((high - low) / 2)
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)
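
To see what these three comparisons do, here is the same logic as a standalone function on a plain list. The function name median_of_three_to_low is an assumption for this worked example.

```python
def median_of_three_to_low(lis, low, high):
    """Move the median of lis[low], lis[middle], lis[high] to lis[low]; return it."""
    m = low + (high - low) // 2
    if lis[low] > lis[high]:
        lis[low], lis[high] = lis[high], lis[low]   # now lis[low] <= lis[high]
    if lis[m] > lis[high]:
        lis[m], lis[high] = lis[high], lis[m]       # now lis[high] is the largest
    if lis[m] > lis[low]:
        lis[m], lis[low] = lis[low], lis[m]         # now lis[low] is the median
    return lis[low]


data = [9, 1, 5, 7, 2]  # left end 9, middle 5, right end 2
print(median_of_three_to_low(data, 0, len(data) - 1))  # 5
print(data)                                            # [5, 1, 2, 7, 9]
```

After the call, lis[low] holds the median of the three sampled values, so a partition that uses lis[low] as pivot_key no longer starts from the smallest or largest of them.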



If that is still not good enough, you can divide the whole sequence into three parts, pick a pivot_key candidate from each part, and then compare the three candidates to obtain the final pivot_key. A pivot_key chosen this way is very likely to be a reasonably reliable value.

2. Reduce unnecessary exchanges

In the original code, the pivot_key record is swapped over and over. This is not necessary: its value can be saved in a temporary variable instead, as shown below:

    def partition(self, low, high):
        lis = self.r
        m = low + int((high - low) / 2)
        if lis[low] > lis[high]:
            self.swap(low, high)
        if lis[m] > lis[high]:
            self.swap(high, m)
        if lis[m] > lis[low]:
            self.swap(m, low)
        pivot_key = lis[low]
        # temp temporarily stores the value of pivot_key
        temp = pivot_key
        # The source listing is cut off here, right after "while low".
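
Since the rest of that listing is missing in the source, the following is only a sketch of one common way to avoid the repeated swaps: the pivot value is held aside, records are moved directly into the free slot, and the pivot is written back once at the end. It omits the median-of-three step for brevity, and partition_no_swap is a name introduced for this example.

```python
def partition_no_swap(lis, low, high):
    """Partition lis[low..high] around lis[low] using moves instead of swaps."""
    pivot_key = lis[low]      # saved copy; the slot at low is now free
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        lis[low] = lis[high]  # move the smaller record into the free slot
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]  # move the larger record into the slot freed above
    lis[low] = pivot_key      # write the pivot back exactly once
    return low


data = [4, 1, 7, 3, 8, 5, 9, 2, 6]
print(partition_no_swap(data, 0, len(data) - 1))  # 3
print(data)  # [2, 1, 3, 4, 8, 5, 9, 7, 6]
```

Each step is now a single assignment instead of a three-assignment swap, while the resulting split is the same as with the swapping version.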



3. Optimize the sorting of small arrays

When sorting a large amount of data, the overhead of quick sort's recursion is acceptable and the algorithm is fast. For small arrays, however, straight insertion sort is faster. In other words, a butcher's cleaver is not necessarily quicker than a kitchen knife for killing a chicken.

Therefore, a simple approach is to choose the algorithm based on the amount of data. Rewrite the qsort method as follows:

    def qsort(self, low, high):
        """Choose quick sort or simple insertion sort based on the sequence length."""
        # 7 is an empirical value; adjust it to your actual situation.
        MAX_LENGTH = 7
        if high - low < MAX_LENGTH:
            # The source listing is cut off here, right after "if low".
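
The source listing is truncated, so here is a hedged sketch of the idea as the text describes it: ranges shorter than the threshold are handed to insertion sort, longer ranges are partitioned as usual. The helper names (insertion_sort_range, partition, hybrid_qsort) and the choice to sort only the sub-range lis[low..high] are assumptions of this sketch.

```python
MAX_LENGTH = 7  # empirical threshold taken from the text; tune as needed


def insertion_sort_range(lis, low, high):
    """Straight insertion sort restricted to lis[low..high]."""
    for i in range(low + 1, high + 1):
        temp = lis[i]
        j = i - 1
        while j >= low and lis[j] > temp:
            lis[j + 1] = lis[j]
            j -= 1
        lis[j + 1] = temp


def partition(lis, low, high):
    """Partition lis[low..high] around lis[low]; return the pivot's final index."""
    pivot_key = lis[low]
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    lis[low] = pivot_key
    return low


def hybrid_qsort(lis, low, high):
    """Quick sort for long ranges, insertion sort for short ones."""
    if high - low < MAX_LENGTH:
        insertion_sort_range(lis, low, high)
    else:
        pivot = partition(lis, low, high)
        hybrid_qsort(lis, low, pivot - 1)
        hybrid_qsort(lis, pivot + 1, high)


data = [4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22, 15, 11, 19, 13]
hybrid_qsort(data, 0, len(data) - 1)
print(data)
```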


4. Optimize the recursion

Tail-recursion elimination can be used to reduce the recursive calls of the whole algorithm. Rewrite the qsort method as follows:

    def qsort(self, low, high):
        """Choose quick sort or simple insertion sort based on the sequence length."""
        # 7 is an empirical value; adjust it to your actual situation.
        MAX_LENGTH = 7
        if high - low < MAX_LENGTH:
            # Use a while loop instead.
            # The source listing is cut off here, right after "while low".
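
As the listing is cut off in the source, the following sketch shows only the loop idea on its own, without the small-array threshold: the second recursive call is replaced by updating low and looping. The function names are assumptions for illustration.

```python
def partition(lis, low, high):
    """Partition lis[low..high] around lis[low]; return the pivot's final index."""
    pivot_key = lis[low]
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    lis[low] = pivot_key
    return low


def qsort_loop(lis, low, high):
    """Quick sort with the second recursive call turned into a loop."""
    while low < high:
        pivot = partition(lis, low, high)
        qsort_loop(lis, low, pivot - 1)  # recurse into the left part only
        low = pivot + 1                  # the loop continues with the right part


data = [4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22]
qsort_loop(data, 0, len(data) - 1)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 22, 123]
```

Python does not optimize tail calls by itself, so writing the loop explicitly is what removes one of the two recursive calls per level.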



IX. Summary of sorting algorithms

Classification of sorting algorithms:

Although simple sorting algorithms have poor time performance, they use space very efficiently. They are particularly well suited to sorting a small number of records where each record carries a large amount of data.

