Seven Classical Sorting Algorithms in Python

Source: Internet
Author: User


I. Basic concepts and classifications of sorting

Sorting is an operation that arranges a sequence of records in ascending or descending order according to the size of one or more keys. Sorting algorithms are the methods for arranging records according to such requirements.

Stability of sorting:

If two records with equal keys keep their original relative order after sorting, the sorting method used is stable; otherwise, it is unstable.
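Stability can be seen directly in Python: the built-in sorted() is a stable sort. A minimal sketch (the records and tags below are made up for illustration):

```python
# Records are (key, tag) pairs; the two records with key 2 must keep
# their original order ('b' before 'd') under a stable sort.
records = [(3, 'a'), (2, 'b'), (1, 'c'), (2, 'd')]
result = sorted(records, key=lambda rec: rec[0])
print(result)  # -> [(1, 'c'), (2, 'b'), (2, 'd'), (3, 'a')]
```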

Internal and external sorting

Internal sorting: all records to be sorted are held in main memory.

External sorting: external storage has to be used during sorting because the records do not all fit in memory.

This article discusses internal sorting only.

Three factors affect the performance of an internal sorting algorithm:

  • Time complexity: an efficient sorting algorithm should perform as few key comparisons and record moves as possible.
  • Space complexity: mainly the auxiliary space the algorithm needs while running; the less, the better.
  • Algorithm complexity: mainly the complexity of the code itself.

By method, internal sorting can be divided into:

  • Insertion sort
  • Exchange sort
  • Selection sort
  • Merge sort

By algorithm complexity, it can be divided into two kinds:

  • Simple algorithms: bubble sort, simple selection sort, and straight insertion sort.
  • Improved algorithms: Shell sort, heap sort, merge sort, and quick sort.

The seven algorithms below are the most classic sorting algorithms, not an exhaustive list.

II. Bubble sort

Bubble Sort: time complexity O(n^2)

Bubble sort is an exchange sort. Its core idea is to compare the keys of adjacent records in pairs and swap them whenever they are in reverse order, until no reversed pair remains.

The implementation details can differ; here are three versions:

1. Simplest exchange sort: bubble_sort_simple

2. Bubble sort proper: bubble_sort

3. Improved bubble sort: bubble_sort_advance

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Bubble sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Exchange two elements; defined once so it can be reused later."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def bubble_sort_simple(self):
        """Simplest exchange sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            for j in range(i + 1, length):
                if lis[i] > lis[j]:
                    self.swap(i, j)

    def bubble_sort(self):
        """Bubble sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                j -= 1

    def bubble_sort_advance(self):
        """Improved bubble sort, time complexity O(n^2).
        A flag is set; when no swap happens in a pass, the remaining
        elements are already in order and sorting can stop early.
        This speeds up sorting of fairly regular data sets."""
        lis = self.r
        length = len(self.r)
        flag = True
        i = 0
        while i < length and flag:
            flag = False
            j = length - 2
            while j >= i:
                if lis[j] > lis[j + 1]:
                    self.swap(j, j + 1)
                    flag = True
                j -= 1
            i += 1

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6])
    # sqlist.bubble_sort_simple()
    # sqlist.bubble_sort()
    sqlist.bubble_sort_advance()
    print(sqlist)

III. Simple selection sort

Simple Selection Sort: time complexity O(n^2)

On the i-th pass (1 <= i <= n), n-i key comparisons pick the record with the smallest key from the n-i+1 remaining records, and that record is swapped with the i-th record.

In plain terms: among all elements not yet sorted, record the index of the smallest element, then swap it to the front of the current pass. Each pass makes many comparisons but at most one swap, so although the time complexity is still O(n^2), it generally performs better than bubble sort.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Simple selection sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Exchange two elements; defined once so it can be reused later."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def select_sort(self):
        """Simple selection sort, time complexity O(n^2)."""
        lis = self.r
        length = len(self.r)
        for i in range(length):
            minimum = i
            for j in range(i + 1, length):
                if lis[minimum] > lis[j]:
                    minimum = j
            if i != minimum:
                self.swap(i, minimum)

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.select_sort()
    print(sqlist)

IV. Straight insertion sort

Straight Insertion Sort: time complexity O(n^2)

The basic operation is to insert one record into an already sorted list, yielding a new sorted list whose record count has grown by 1.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Straight insertion sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def insert_sort(self):
        lis = self.r
        length = len(self.r)
        # indices start from 1
        for i in range(1, length):
            if lis[i] < lis[i - 1]:
                temp = lis[i]
                j = i - 1
                # test j >= 0 first so lis[j] is never read at j == -1
                while j >= 0 and lis[j] > temp:
                    lis[j + 1] = lis[j]
                    j -= 1
                lis[j + 1] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0])
    sqlist.insert_sort()
    print(sqlist)

This algorithm needs one record's worth of auxiliary space. In the best case, when the input is already ordered, each element needs only one comparison and no record has to be moved, so the time complexity is O(n). In practice, however, that rarely happens.
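The gap between the best and worst case can be made concrete by counting comparisons. A small sketch (insert_sort_count is a hypothetical instrumented copy of the algorithm above, not part of the original code):

```python
def insert_sort_count(lis):
    """Straight insertion sort that returns its number of key comparisons."""
    comparisons = 0
    for i in range(1, len(lis)):
        comparisons += 1                  # the lis[i] < lis[i-1] test
        if lis[i] < lis[i - 1]:
            temp = lis[i]
            j = i - 1
            while j >= 0 and lis[j] > temp:
                comparisons += 1          # one comparison per shift
                lis[j + 1] = lis[j]
                j -= 1
            lis[j + 1] = temp
    return comparisons

print(insert_sort_count(list(range(10))))         # ordered input: 9
print(insert_sort_count(list(range(9, -1, -1))))  # reversed input: 54
```

Ordered input costs n-1 comparisons (the O(n) best case); reversed input costs (n-1) + n(n-1)/2, which grows as O(n^2).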

V. Shell sort

Shell Sort is an improved insertion sort. Its core idea is to split the original data set into several subsequences and straight-insertion-sort each subsequence, so that the whole sequence becomes basically ordered; then a final straight insertion sort is run over all records.

The crucial part is the splitting strategy: how the data is split and how large the interval is. Usually, records one increment apart form a subsequence, which ensures that the result of the subsequence sorts is globally "basically ordered" rather than merely locally ordered. The example below uses increment = int(increment/3) + 1 to compute the shrinking "increment" value.

The time complexity of Shell sort is about O(n^(3/2)).
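The increment formula is easy to inspect on its own. A small sketch, assuming a hypothetical starting length of 100:

```python
# Gap ("increment") sequence produced by increment = int(increment / 3) + 1.
increment = 100
gaps = []
while increment > 1:
    increment = int(increment / 3) + 1
    gaps.append(increment)
print(gaps)  # -> [34, 12, 5, 2, 1]
```

The final pass with gap 1 is a plain straight insertion sort over data that is already basically ordered, which is exactly where insertion sort is fast.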

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Shell sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def shell_sort(self):
        """Shell sort."""
        lis = self.r
        length = len(lis)
        increment = len(lis)
        while increment > 1:
            increment = int(increment / 3) + 1
            for i in range(increment, length):
                if lis[i] < lis[i - increment]:
                    temp = lis[i]
                    j = i - increment
                    while j >= 0 and temp < lis[j]:
                        lis[j + increment] = lis[j]
                        j -= increment
                    lis[j + increment] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.shell_sort()
    print(sqlist)

VI. Heap sort

A heap is a complete binary tree with one of the following properties:

the value of every branch node is greater than or equal to the values of its left and right children (a max-heap, or "big top heap");

or the value of every branch node is less than or equal to the values of its left and right children (a min-heap, or "small top heap").

Therefore, the root node is always the maximum (or minimum) of all nodes.

If the nodes are numbered from 1 in level order (breadth-first), then node i's children are nodes 2i and 2i+1, and its parent is node i // 2.
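These index relations can be checked with a few lines of Python (a sketch; the 10-node tree size is arbitrary):

```python
# With 1-based level-order numbering, node i has children 2*i and 2*i + 1
# (when they exist) and parent i // 2.
n = 10
for i in range(2, n + 1):
    assert (i // 2) * 2 in (i, i - 1)   # i really is a child of i // 2
children = lambda i: [c for c in (2 * i, 2 * i + 1) if c <= n]
print(children(3))  # -> [6, 7]
print(children(5))  # -> [10] (node 5 has only a left child in a 10-node tree)
```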

Heap Sort is a method of sorting that exploits the max-heap (or min-heap) property. Its overall time complexity is O(nlogn). (The code below uses a max-heap.)

The core idea is to build the sequence to be sorted into a max-heap, so that the maximum of the whole sequence is at the root. Swap the root with the last element of the heap array, rebuild the remaining n-1 elements into a max-heap, and repeat until an ordered sequence is obtained.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Heap sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Exchange two elements; defined once so it can be reused later."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def heap_sort(self):
        length = len(self.r)
        # Build the original sequence into a max-heap.
        # Traverse the branch nodes from the last one back to the root.
        i = int(length / 2) - 1
        while i >= 0:
            self.heap_adjust(i, length - 1)
            i -= 1
        # Traverse the sequence in reverse, repeatedly taking the root value
        # out of the heap; this performs the actual sorting.
        j = length - 1
        while j > 0:
            # Swap the current root (index 0 of the list) with position j.
            self.swap(0, j)
            # Rebuild the shrunken sequence into a max-heap.
            self.heap_adjust(0, j - 1)
            j -= 1

    def heap_adjust(self, s, m):
        """Core of the max-heap construction: sift lis[s] down so that the
        subtree rooted at s (within lis[0..m]) is a max-heap again.
        With 0-based indexing, node s has children 2s+1 and 2s+2."""
        lis = self.r
        temp = lis[s]
        i = 2 * s + 1
        while i <= m:
            if i < m and lis[i] < lis[i + 1]:
                i += 1
            if temp >= lis[i]:
                break
            lis[s] = lis[i]
            s = i
            i = 2 * i + 1
        lis[s] = temp

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 123, 22])
    sqlist.heap_sort()
    print(sqlist)

The running time of heap sort is mainly spent on the repeated sifting during the initial heap construction and the subsequent heap rebuilds.

The time complexity of building the initial heap is O(n).

During the sorting phase, the heap rebuilds take O(nlogn) in total.

Therefore, the overall time complexity of heap sort is O(nlogn).

Heap sort is insensitive to the initial ordering of the records, so its best, worst, and average time complexity are all O(nlogn). That is better than bubble sort, simple selection sort, and straight insertion sort.

As for space, only one temporary unit is needed for swapping. However, because comparisons and swaps jump across the array, heap sort is an unstable sorting method.

In addition, because building the initial heap requires many comparisons, heap sort is not well suited to sequences with only a few records.

VII. Merge sort

Merging Sort: an efficient sorting algorithm built on the merge operation and a typical application of divide and conquer. It merges already-ordered subsequences into a fully ordered sequence: first make each subsequence ordered, then merge the ordered segments. Merging two ordered lists into one ordered list is called a two-way merge.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Merge sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Exchange two elements; defined once so it can be reused later."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def merge_sort(self):
        self.msort(self.r, self.r, 0, len(self.r) - 1)

    def msort(self, list_sr, list_tr, s, t):
        """Recursively sort list_sr[s..t] into list_tr[s..t]."""
        temp = [None for i in range(0, len(list_sr))]
        if s == t:
            list_tr[s] = list_sr[s]
        else:
            m = int((s + t) / 2)
            self.msort(list_sr, temp, s, m)
            self.msort(list_sr, temp, m + 1, t)
            self.merge(temp, list_tr, s, m, t)

    def merge(self, list_sr, list_tr, i, m, n):
        """Merge the ordered runs list_sr[i..m] and list_sr[m+1..n]
        into the ordered run list_tr[i..n]."""
        j = m + 1
        k = i
        while i <= m and j <= n:
            if list_sr[i] < list_sr[j]:
                list_tr[k] = list_sr[i]
                i += 1
            else:
                list_tr[k] = list_sr[j]
                j += 1
            k += 1
        if i <= m:
            for l in range(0, m - i + 1):
                list_tr[k + l] = list_sr[i + l]
        if j <= n:
            for l in range(0, n - j + 1):
                list_tr[k + l] = list_sr[j + l]

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 12, 77, 34, 23])
    sqlist.merge_sort()
    print(sqlist)
  • Merge sort is insensitive to the distribution of elements in the original sequence; its time complexity is O(nlogn).
  • Merge sort needs auxiliary space for recursion and for storing intermediate results, so its space complexity is O(n + logn).
  • There are no leaping swaps in merge sort, only pairwise comparisons, so it is a stable sort.

In short, merge sort is a relatively memory-hungry yet efficient and stable algorithm.

VIII. Quick sort

Quick Sort was invented by Turing Award winner Tony Hoare and is listed among the top 10 algorithms of the 20th century. It can be viewed as an upgraded bubble (exchange) sort. The time complexity of quick sort is O(nlogn).

The core idea of quick sort: one pass of partitioning splits the records into two independent parts such that every key in one part is smaller than every key in the other; then each part is sorted in the same way, until the whole set of records is ordered.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Liu Jiang
# Python 3.5
# Quick sort

class SQList:
    def __init__(self, lis=None):
        self.r = lis

    def swap(self, i, j):
        """Exchange two elements; defined once so it can be reused later."""
        temp = self.r[i]
        self.r[i] = self.r[j]
        self.r[j] = temp

    def quick_sort(self):
        """Entry point."""
        self.qsort(0, len(self.r) - 1)

    def qsort(self, low, high):
        """Recursive call."""
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)

    def partition(self, low, high):
        """Partition lis[low..high] around pivot_key = lis[low] and
        return the pivot's final position."""
        lis = self.r
        pivot_key = lis[low]
        while low < high:
            while low < high and lis[high] >= pivot_key:
                high -= 1
            self.swap(low, high)
            while low < high and lis[low] <= pivot_key:
                low += 1
            self.swap(low, high)
        return low

    def __str__(self):
        ret = ""
        for i in self.r:
            ret += " %s" % i
        return ret

if __name__ == '__main__':
    sqlist = SQList([4, 1, 7, 3, 8, 5, 9, 2, 6, 0, 12, 77, 34, 23])
    sqlist.quick_sort()
    print(sqlist)

 
  • The time performance of quick sort depends on the depth of the recursion.
  • When pivot_key happens to land in the middle of the key range, the two partitions are balanced, the recursion tree is close to a balanced binary tree, and the time complexity is O(nlogn).
  • When the original records are already in forward or reverse order, partitioning yields a skewed tree of depth n-1, each partition costs n-i comparisons, and the worst-case time complexity is O(n^2).
  • In general, it can be shown by mathematical induction that the average time complexity of quick sort is O(nlogn).
  • However, because keys are compared and exchanged across long distances, quick sort is an unstable sort.
  • Also, because it uses recursion, the algorithm needs some auxiliary space; its space complexity is O(logn).

The basic quick sort can be optimized further:

1. Optimize the pivot_key selection

So far, every pivot_key has simply been the first element lis[low] of the subsequence, which amounts to luck. With good luck the value lies near the middle of the range and the recursion tree is balanced; with bad luck it lies near the maximum or minimum and the tree degenerates towards a skewed tree.

To keep pivot_key as close to the median as possible, the usual trick is median-of-three: take the median of the values at three special positions (left end, middle, right end), which generally works better than plain lis[low]. Add the following code before the original pivot_key = lis[low] line:

m = low + int((high - low) / 2)
if lis[low] > lis[high]:
    self.swap(low, high)
if lis[m] > lis[high]:
    self.swap(high, m)
if lis[m] > lis[low]:
    self.swap(m, low)

If that is still not good enough, divide the whole sequence into three parts, obtain a median-of-three candidate pivot_key from each part, and then take the median of those three candidates as the final pivot_key. At that point the pivot_key has a high probability of being a fairly reliable middle value.
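One way to realize this "median of three medians" idea (often called the ninther) is sketched below; median3 and ninther are hypothetical helpers for illustration, not part of the original code:

```python
def median3(lis, a, b, c):
    """Return the index (among a, b, c) holding the median of the three values."""
    x, y, z = lis[a], lis[b], lis[c]
    if x < y:
        return b if y < z else (c if x < z else a)
    else:
        return a if x < z else (c if y < z else b)

def ninther(lis, low, high):
    """Median of three medians-of-three, sampled across lis[low..high]."""
    n = high - low
    mid = low + n // 2
    step = n // 8 if n // 8 > 0 else 1
    m1 = median3(lis, low, low + step, low + 2 * step)
    m2 = median3(lis, mid - step, mid, mid + step)
    m3 = median3(lis, high - 2 * step, high - step, high)
    return median3(lis, m1, m2, m3)

data = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0, 11, 10]
print(data[ninther(data, 0, len(data) - 1)])  # -> 7
```

The returned index can then be swapped to lis[low] before the normal partition step.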

2. Reduce unnecessary exchanges

In the original code the pivot_key record is swapped back and forth again and again. That is unnecessary: the pivot can be saved in a temporary variable and written back once at the end, as shown below:

def partition(self, low, high):
    lis = self.r
    # median-of-three pivot selection
    m = low + int((high - low) / 2)
    if lis[low] > lis[high]:
        self.swap(low, high)
    if lis[m] > lis[high]:
        self.swap(high, m)
    if lis[m] > lis[low]:
        self.swap(m, low)
    pivot_key = lis[low]
    # temp temporarily stores the pivot_key value
    temp = pivot_key
    while low < high:
        while low < high and lis[high] >= pivot_key:
            high -= 1
        # a single assignment replaces a full swap
        lis[low] = lis[high]
        while low < high and lis[low] <= pivot_key:
            low += 1
        lis[high] = lis[low]
    # write the pivot into its final position
    lis[low] = temp
    return low

3. Optimize the sorting of small arrays

The recursion overhead of quick sort pays off when sorting large amounts of data, but for small arrays straight insertion sort is actually faster. In other words, an ox cleaver is not necessarily quicker than a kitchen knife for killing a chicken.

A simple approach, therefore, is to pick the algorithm according to the amount of data. Rewrite the qsort method as follows:

def qsort(self, low, high):
    """Choose quick sort or straight insertion sort by sequence length."""
    # 7 is an empirical value; adjust it to the actual situation.
    MAX_LENGTH = 7
    if high - low > MAX_LENGTH:
        if low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            self.qsort(pivot + 1, high)
    else:
        # short sequences are handled by straight insertion sort
        self.insert_sort()

4. Optimize the recursion

Tail-recursion elimination can be applied to reduce the recursion depth of the whole algorithm. Rewrite the qsort method as follows:

def qsort(self, low, high):
    """Choose quick sort or straight insertion sort by sequence length."""
    # 7 is an empirical value; adjust it to the actual situation.
    MAX_LENGTH = 7
    if high - low > MAX_LENGTH:
        # a while loop replaces the second recursive call
        while low < high:
            pivot = self.partition(low, high)
            self.qsort(low, pivot - 1)
            low = pivot + 1
    else:
        self.insert_sort()

IX. Sorting algorithm summary

Classification of sorting algorithms:

  • Insertion sorts: straight insertion sort, Shell sort
  • Exchange sorts: bubble sort, quick sort
  • Selection sorts: simple selection sort, heap sort
  • Merge sorts: merge sort

No algorithm is perfect; every one has drawbacks. Even quick sort, with the best overall performance, is unstable, needs a fair amount of auxiliary space, and is not well suited to sorting small amounts of data.

Performance Comparison of seven sorting algorithms

  • If the data to be sorted is basically ordered, use a simple algorithm rather than a complex improved one.
  • Merge sort and quick sort perform well but need more auxiliary space; in effect they trade space for time.
  • The fewer elements to sort, the better a simple method fits; the more elements, the better an improved algorithm fits.
  • Simple sorts have poor time performance but use space very economically; they are especially suitable for data sets with few records where each record carries a large amount of information.
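As a rough empirical check of the space-for-time and data-size points above, the sketch below times a simple O(n^2) selection sort against Python's built-in Timsort on random data (the list size 2000 is arbitrary; absolute timings are machine-dependent):

```python
import random
import timeit

def select_sort(lis):
    """Simple selection sort, O(n^2)."""
    for i in range(len(lis)):
        minimum = i
        for j in range(i + 1, len(lis)):
            if lis[j] < lis[minimum]:
                minimum = j
        if i != minimum:
            lis[i], lis[minimum] = lis[minimum], lis[i]
    return lis

data = [random.random() for _ in range(2000)]
assert select_sort(data[:]) == sorted(data)        # same result...
t_simple = timeit.timeit(lambda: select_sort(data[:]), number=3)
t_builtin = timeit.timeit(lambda: sorted(data), number=3)
print(t_simple > t_builtin)  # -> True: the built-in sort is far faster
```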

That is all the content of this article. I hope it is helpful for your learning.
