Summary of nine sorting algorithms


This article is a companion to http://blog.csdn.net/xiazdong/article/details/7304239. I first summarized sorting after reading "Big Talk Data Structure", but now that I am reading "Introduction to Algorithms" I find my earlier understanding of sorting was shallow, so I am going over all the sorting ideas again. This article first covers the comparison-based sorting algorithms, whose worst-case running time is Ω(n lg n), and then some linear-time sorting algorithms; the latter achieve linear time only under certain constraints on the input array. The material is summarized from chapters 2, 3, 4, 6, 7 and 8 of Introduction to Algorithms.

Origin of the word "algorithm": it derives from al-Khowarizmi, a 9th-century Persian mathematician.

The sorting problem:
Input: n numbers a1, a2, a3, ..., an.
Output: a permutation a1', a2', a3', ..., an' of the input such that a1' <= a2' <= a3' <= ... <= an'.

In-place sorts (no extra memory, or only constant memory): insertion sort, selection sort, bubble sort, heap sort, quick sort.
Out-of-place sorts: merge sort, counting sort, radix sort, bucket sort.

When a large amount of data needs to be sorted, in-place sorts show their advantage, because only constant extra memory is required. Imagine sorting a huge data set: if an out-of-place sort needed, say, 200 GB of auxiliary space, an old computer simply could not supply it, whereas an in-place sort spends no extra memory at all.

Stable sorts: insertion sort, bubble sort, merge sort, counting sort, radix sort, bucket sort.
Unstable sorts: selection sort (consider 5 8 5 2 9: swapping the minimum 2 into the front moves the first 5 past the second 5), quick sort, heap sort.

Why does stability matter? A beginner may wonder whether the relative order of two equal elements is really so important. It is, and it is especially prominent in radix sort, which depends on a stable per-digit sort. Introduction to Algorithms exercise 8.3-2 asks: how can an unstable sorting algorithm be made stable? The fix is simple: attach to each input element its original array index, and let the index break ties between equal elements (a sketch of this trick follows this introduction).

Comparison-based sorting follows the decision-tree model, in which one can prove that any comparison sort takes Ω(n lg n) time in the worst case: a decision tree that sorts n elements has at least n! leaves, so its height is at least lg(n!) = Ω(n lg n).

Linear-time sorts can, under the right conditions, sort in linear time, but each of them makes some assumption about the input array; for example, counting sort requires the input numbers to lie in the range [0, k].

The correctness proofs of the sorting algorithms use the "loop invariant", which is similar to mathematical induction: "initialization" corresponds to the base case n = 1, and "maintenance" corresponds to "assuming the claim holds for n = k, show it holds for n = k + 1".
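As an illustration of the exercise 8.3-2 trick, here is a minimal sketch of my own (the helper names are made up, not code from the original post). Decorating each element with its original index turns any comparison sort into a stable one:

```python
def stabilize(unstable_sort, a):
    """Sort a stably using any in-place comparison sort on tuples:
    tuple comparison breaks ties on value by original index."""
    decorated = [(value, index) for index, value in enumerate(a)]
    unstable_sort(decorated)
    return [value for value, _ in decorated]

# Demo with an unstable selection sort:
def selection_sort(a):
    for i in range(len(a)):
        m = min(range(i, len(a)), key=lambda j: a[j])
        a[i], a[m] = a[m], a[i]

print(stabilize(selection_sort, [5, 8, 5, 2, 9]))
# [2, 5, 5, 8, 9]; the two 5s keep their original relative order
```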
1. Insertion sort

Features: stable sort, in-place sort.
Best case: when the input array is already sorted, the running time is O(n); quick sort degrades to O(n^2) in this same case.
Worst case: when the input array is in reverse order, the running time is O(n^2).
Insertion sort is well suited to arrays with a small number of elements.

In fact, the running time of insertion sort is governed by the number of inversions in the input: when the array is reversed, the number of inversions is n(n-1)/2, so insertion sort costs O(n^2). Inversions are introduced in Introduction to Algorithms problem 2-4. Pseudo-code: see the sketch below.

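Here is one way to render the textbook's insertion sort as a minimal Python sketch (the prose keeps the book's 1-based A[1..n] convention, while the code uses Python's 0-based indexing):

```python
def insertion_sort(a):
    """CLRS-style insertion sort: in place, stable.
    O(n) when a is already sorted, O(n^2) when reversed."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift elements of the sorted prefix a[0..i-1] that exceed key
        # one slot to the right, then drop key into the gap.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
```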
Proof of correctness:
Loop invariant: before each iteration, A[1..i-1] contains the elements originally in A[1..i-1], in sorted order.
Initialization: i = 2, and A[1..1] is trivially sorted, so the invariant holds.
Maintenance: before an iteration begins, A[1..i-1] is sorted; the loop body inserts A[i] into A[1..i-1] so that A[1..i] is sorted; then i is incremented, so at the start of the next iteration A[1..i-1] is again sorted, and the invariant is maintained.
Termination: at the end, i = n+1 and A[1..n] is sorted; A[1..n] is the entire array, which completes the proof.

Introduction to Algorithms exercise 2.3-6 asks whether the inner while loop can be implemented with binary search. It cannot improve the asymptotic running time: the loop is not a pure linear search, it must also shift elements to open a slot for A[i], so even if binary search finds the insertion position in O(lg n) time, we still spend O(n) time moving elements.

Q: Is quick sort (without randomization) necessarily faster than insertion sort?
A: Not necessarily. When the input array is already sorted, insertion sort needs O(n) time while quick sort needs O(n^2) time.

There is also a recursive version of insertion sort.

2. Bubble sort

Features: stable sort, in-place sort.
Idea: through repeated pairwise swaps, like bubbles in water, the small elements surface first and the big ones come out later.
Worst running time: O(n^2). Best running time: O(n^2) (though the best case can be improved to O(n); see the improved version below).
Introduction to Algorithms problem 2-2 describes bubble sort. Pseudo-code: see the sketch after this section.

Proof of correctness: we use two loop invariants, first proving the inner loop correct and then the outer loop.
Inner loop invariant: before each iteration, A[j] is the smallest element of A[j..n].
Initialization: j = n, so A[n] is the smallest element of A[n..n].
Maintenance: at the start of an iteration, A[j] is the smallest element of A[j..n]; we compare A[j] with A[j-1] and place the smaller one at position j-1, so A[j-1] becomes the smallest element of A[j-1..n], and the invariant is maintained.
Termination: j = i, and A[i] is the smallest element of A[i..n], which is what the outer loop needs.
Outer loop invariant: before each iteration, A[1..i-1] contains the i-1 smallest elements of A, sorted: A[1] <= A[2] <= ... <= A[i-1].
Initialization: i = 1, so A[1..0] is empty, and the invariant holds.
Maintenance: at the start of an iteration, A[1..i-1] contains the i-1 smallest elements of A in sorted order; by the inner loop invariant, when the inner loop terminates A[i] is the smallest element of A[i..n], so A[1..i] contains the i smallest elements of A and A[1] <= A[2] <= ... <= A[i-1] <= A[i].
Termination: i = n+1, so A[1..n] contains the n smallest elements of A in sorted order, i.e. A[1] <= A[2] <= ... <= A[n], which completes the proof.

Problem 2-2 also asks: which is faster, bubble sort or insertion sort? Most people answer "about the same, since both are O(n^2) asymptotically". In fact, the running time of insertion sort is proportional to the number of inversions in the input, while the number of swaps performed by bubble sort equals the number of inversions, so bubble sort's running time is at least the number of inversions on top of all its comparisons. Insertion sort is therefore at least as fast as bubble sort.

There is also a recursive version of bubble sort.

Improved bubble sort: stop as soon as a full pass performs no swap. Best running time: O(n); worst running time: O(n^2).

3. Selection sort

Features: in-place sort, unstable sort.
Idea: each pass finds a minimum of the remaining elements.
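A minimal sketch of the variant the proof above describes (the minimum is bubbled toward the front), including the no-swap flag that gives the improved O(n) best case:

```python
def bubble_sort(a):
    """Bubble sort with the early-exit improvement: best case O(n),
    worst case O(n^2). Sorts a in place, stably."""
    n = len(a)
    for i in range(n - 1):
        swapped = False
        # Bubble the smallest element of a[i..n-1] down to position i.
        for j in range(n - 1, i, -1):
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
                swapped = True
        if not swapped:      # no inversion left: the array is sorted
            break
```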
Best case time: O(n^2). Worst case time: O(n^2). Pseudo-code: see the sketch below.

Proof of correctness:
Loop invariant: before each iteration, A[1..i-1] contains the i-1 smallest elements of A, sorted.
Initialization: i = 1, A[1..0] is empty, so the invariant holds.
Maintenance: before an iteration begins the invariant holds, that is, A[1..i-1] contains the i-1 smallest elements of A, sorted; the loop body finds the minimum of A[i..n] and places it at A[i], so A[1..i] now contains the i smallest elements of A, sorted; then i is incremented, so before the next iteration the invariant again holds: A[1..i-1] contains the i-1 smallest elements of A, sorted.
Termination: i = n, and A[1..n-1] contains the n-1 smallest elements of A, sorted; hence the element left in A[n] is the largest, so A[1..n] is sorted, completing the proof.

Introduction to Algorithms exercise 2.2-2 asks: why does the outer loop run only n-1 times instead of n? As the termination step of the proof shows, once A[1..n-1] is sorted and contains the n-1 smallest elements of A, A[n] must be the largest element and is therefore already in its final place.

There is also a recursive version of selection sort. Recurrence: T(n) = T(n-1) + O(n), so T(n) = O(n^2).
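A minimal sketch of selection sort as described above:

```python
def selection_sort(a):
    """Selection sort: repeatedly move the minimum of a[i..n-1] to slot i.
    In place, O(n^2) in all cases; unstable because of the long-range swap."""
    n = len(a)
    for i in range(n - 1):          # n-1 passes suffice (exercise 2.2-2)
        m = i
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]
```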
4. Merge sort

Features: stable sort, out-of-place sort.
Idea: solve the sorting problem by divide and conquer.
Worst running time: O(n lg n). Best running time: O(n lg n).

Divide and conquer decomposes the original problem into several independent subproblems that have the same form as the original but smaller size; once the subproblems are solved, their solutions are combined into a solution of the original problem. It has three steps: Divide (split into subproblems), Conquer (solve the subproblems recursively), Combine (merge the subproblem solutions into a solution of the original problem). Suppose Divide takes f(n) time, Conquer decomposes the problem into b subproblems each of size n/a, and Combine takes g(n) time; then the recurrence is T(n) = b·T(n/a) + f(n) + g(n). Introduction to Algorithms problem 4-3 (parameter passing) tests one's understanding of divide and conquer very well.

For merge sort, the Divide step computes the midpoint m = (p+q)/2, which is O(1); the Combine step is the merge() function, which is O(n); the Conquer step decomposes into 2 subproblems of size n/2. So the recurrence of merge sort is T(n) = 2T(n/2) + O(n), which solves to O(n lg n).

The three methods for solving recurrences are: (1) substitution, mainly used to verify a guessed bound; (2) the recursion tree, which gives a rough estimate that can then be verified by substitution; (3) the master theorem, which solves recurrences of certain common forms.

Pseudo-code: see the sketch at the end of this section.

Proof of correctness: in fact we only need to prove the merge() function correct. Its main work is a single loop.
Loop invariant: before each iteration, A[p..k-1] is sorted, and L[i] and R[j] are the smallest remaining elements of L and R respectively.
Initialization: k = p, so A[p..p-1] is empty and trivially sorted; the invariant holds.
Maintenance: before the k-th iteration, A[p..k-1] is sorted, and since L[i] and R[j] are the smallest remaining elements of L and R, placing the smaller of the two at A[k] keeps A[p..k] sorted; after k is incremented, A[p..k-1] is again sorted and L[i] and R[j] are again the smallest remaining elements, so the invariant is maintained.
Termination: k = q+1, and A[p..q] is sorted, which is exactly what we wanted, so the proof is complete.

Q: What is the disadvantage of merge sort?
A: It is an out-of-place sort, so compared with quick sort it needs a lot of extra space.
Q: Why is merge sort slower than quick sort in practice?
A: Although their asymptotic complexity is the same, merge sort's constant factors are larger.
Q: How can merge sort be improved?
A: Switch to insertion sort once subarrays shrink to length k, since insertion sort suits small arrays; this is introduced in Introduction to Algorithms problem 2-1. The complexity becomes O(nk + n·lg(n/k)); when k = O(lg n), this is O(n lg n).
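A minimal sketch of merge sort on the inclusive range a[p..q] used in the prose:

```python
def merge_sort(a, p=0, q=None):
    """Merge sort on a[p..q] (inclusive): stable, O(n lg n),
    but needs O(n) auxiliary space."""
    if q is None:
        q = len(a) - 1
    if p >= q:
        return
    m = (p + q) // 2
    merge_sort(a, p, m)        # conquer the left half
    merge_sort(a, m + 1, q)    # conquer the right half
    merge(a, p, m, q)          # combine

def merge(a, p, m, q):
    """Merge the sorted runs a[p..m] and a[m+1..q]."""
    left, right = a[p:m + 1], a[m + 1:q + 1]
    i = j = 0
    k = p
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # <= keeps the sort stable
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
        k += 1
    a[k:q + 1] = left[i:] + right[j:]   # one of these runs is empty
```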
5. Quick sort

Invented by Tony Hoare in 1962 and known as one of the ten classic algorithms of the 20th century. The partition scheme explained in Introduction to Algorithms is due to Lomuto, who made some changes to Hoare's algorithm; problem 7-1 of the book introduces Hoare's original partition.
Features: unstable sort, in-place sort.
Worst running time: when the input array is already sorted, the time is O(n^2); this can of course be mitigated by randomization (shuffling the array, or selecting the pivot at random), making the expected running time O(n lg n).
Best running time: O(n lg n).
The idea of quick sort is also divide and conquer.
When all elements of the input array are equal, quick sort, randomized or not, runs in O(n^2); problem 7-2 in the third edition of Introduction to Algorithms improves this case to O(n) by changing the partition function to return the whole range of elements equal to the pivot.
Note: as long as partition splits the array in some constant proportion, quick sort runs in O(n lg n); for example, even if the split ratio is 10000:1 (unbalanced enough), the running time is still O(n lg n).
"A Killer Adversary for Quicksort" is an interesting article on how to design an input array that forces quicksort's running time to O(n^2).
Pseudo-code and the randomized partition: see the sketch below.

Proof of correctness: it suffices to prove the partition function correct.
Loop invariant: all elements of A[p..i] are less than or equal to the pivot, and all elements of A[i+1..j-1] are greater than the pivot.
Initialization: i = p-1 and j = p, so both A[p..i] and A[i+1..j-1] are empty, and the invariant holds.
Maintenance: in the loop body, if A[j] > pivot, nothing moves and j is incremented; A[p..i] still holds only elements <= pivot and A[i+1..j-1] only elements > pivot. If A[j] <= pivot, then i is incremented; the element now at A[i] was the first of the "greater" region, so exchanging A[i] with A[j] restores the invariant: A[p..i] holds only elements <= pivot and A[i+1..j-1] only elements > pivot.
Termination: j = r, so all elements of A[p..i] are <= pivot and all elements of A[i+1..r-1] are > pivot; swapping the pivot into position i+1 completes the partition.
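A minimal sketch with Lomuto's partition plus randomized pivot selection (the function names are my own):

```python
import random

def quicksort(a, p=0, r=None):
    """Randomized quicksort on a[p..r] (inclusive). In place, unstable;
    expected O(n lg n), worst case O(n^2) (e.g. all-equal keys)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)

def randomized_partition(a, p, r):
    """Swap a random pivot to the end (defeating sorted and adversarial
    inputs in expectation), then run Lomuto's partition."""
    k = random.randint(p, r)
    a[k], a[r] = a[r], a[k]
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # pivot lands at its final slot
    return i + 1
```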
6. Heap sort

Proposed by Williams in 1964.
Features: unstable sort, in-place sort.
Best time: O(n lg n). Worst time: O(n lg n). Proofs of the best-case and worst-case running times of heap sort are given in this companion post: http://blog.csdn.net/xiazdong/article/details/8193625
Idea: use the min-heap or max-heap data structure. Heaps can also be used to build priority queues; priority queues are applied to inter-process scheduling, task scheduling, and so on, and the heap data structure also appears in Dijkstra's and Prim's algorithms.

Proof of correctness:
(1) Correctness of build_max_heap.
Loop invariant: before each iteration, each of A[i+1], A[i+2], ..., A[n] is the root of a max-heap.
Initialization: i = floor(n/2); then A[i+1], ..., A[n] are all leaves and hence trivially roots of max-heaps, so the invariant holds.
Maintenance: before an iteration begins, each of A[i+1], A[i+2], ..., A[n] is the root of a max-heap; in the loop body, since the subtrees rooted at A[i]'s children are max-heaps, after executing max_heapify(A, i) the node A[i] is also the root of a max-heap, so the invariant is maintained as i decreases.
Termination: i = 0, and each of A[1], ..., A[n] is the root of a max-heap; in particular A[1] is the root of a max-heap, which completes the proof.
(2) Correctness of heapsort.
Loop invariant: before each iteration, A[i+1], ..., A[n] contains the n-i largest elements of A, sorted (A[i+1] <= A[i+2] <= ... <= A[n]), and A[1] is the largest element of the heap A[1..i].
Initialization: i = n; A[n+1..n] is empty, so the invariant holds.
Maintenance: before an iteration begins, A[i+1], ..., A[n] contains the n-i largest elements of A with A[i+1] <= A[i+2] <= ... <= A[n]; the loop body exchanges A[1] and A[i], and since A[1] is the largest element of the heap, A[i], ..., A[n] now contains the n-i+1 largest elements of A with A[i] <= A[i+1] <= A[i+2] <= ... <= A[n]; max_heapify then restores the heap on A[1..i-1], so the invariant is maintained.
Termination: i = 1; A[2], ..., A[n] contains the n-1 largest elements of A with A[2] <= A[3] <= ... <= A[n], hence A[1] <= A[2] <= A[3] <= ... <= A[n], which completes the proof.
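A minimal 0-based Python sketch of max_heapify, build_max_heap (folded into heapsort's first loop) and heapsort:

```python
def max_heapify(a, i, n):
    """Sink a[i] until the subtree rooted at i is a max-heap.
    0-based array holding a heap of size n."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < n and a[l] > a[largest]:
            largest = l
        if r < n and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def heapsort(a):
    """Build a max-heap, then repeatedly move the maximum to the end.
    In place, unstable, O(n lg n) in both the best and worst cases."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # nodes n//2..n-1 are leaves
        max_heapify(a, i, n)
    for i in range(n - 1, 0, -1):
        a[0], a[i] = a[i], a[0]           # largest element goes to slot i
        max_heapify(a, 0, i)              # restore the heap on a[0..i-1]
```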
7. Counting sort

Features: stable sort, out-of-place sort. It assumes the input values lie in the range [0, k].
Worst case running time: O(n+k). Best running time: O(n+k).
When k = O(n), counting sort takes O(n) time.
Pseudo-code: see the sketch below, which also serves as the per-digit pass of radix sort.

8. Radix sort

Here we assume the sort used on each digit is counting sort.
Features: stable sort, out-of-place sort.
Worst case running time: O((n+k)·d). Best running time: O((n+k)·d), for d digits each drawn from k possible values.
When d is a constant and k = O(n), the efficiency is O(n).
We do not necessarily have to sort one digit at a time; we can sort several bits at once. For example, with 10 bits in total, we can first sort by the low 5 bits and then by the high 5 bits.
Lemma: given n b-bit numbers, divide the b bits into groups of r bits each; then radix sort runs in O((b/r)(n + 2^r)). When b = O(lg n) and we pick r = lg n, radix sort runs in O(n).
Example, Introduction to Algorithms exercise 8.3-4: show how to sort n integers between 0 and n^2-1 in O(n) time. Answer: write the numbers in binary; the number of bits is lg(n^2) = 2·lg n = O(lg n), so by the lemma, with b = O(lg n) and r = lg n, radix sort finishes in O(n) time. (Equivalently, treat each number as two base-n digits.)
Proof of correctness: by a loop invariant on the number of low-order digits already in sorted order; the details are omitted here.
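A minimal sketch of a stable counting sort and an LSD radix sort built on top of it (the names and the key-function parameter are my own choices):

```python
def counting_sort_by_key(a, k, key=lambda x: x):
    """Stable counting sort: key(x) must lie in 0..k-1. O(n + k), out-of-place."""
    count = [0] * k
    for x in a:
        count[key(x)] += 1
    # Prefix sums: count[v] becomes the number of elements with key <= v.
    for v in range(1, k):
        count[v] += count[v - 1]
    out = [None] * len(a)
    for x in reversed(a):          # reversed scan keeps the sort stable
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out

def radix_sort(a, base, digits):
    """LSD radix sort of non-negative ints with 'digits' base-'base' digits.
    Each pass is the stable counting sort above: O((n + base) * digits)."""
    for d in range(digits):
        a = counting_sort_by_key(a, base, key=lambda x: (x // base ** d) % base)
    return a
```

For exercise 8.3-4, radix_sort(a, base=n, digits=2) sorts n integers in the range 0..n^2-1 in O(n) time.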
9. Bucket sort

Bucket sort assumes the elements of the input array are uniformly distributed over [0, 1).
Features: out-of-place sort, stable sort (provided the per-bucket sort is stable, such as insertion sort).
Worst case running time: when the distribution is uneven and all elements fall into one bucket, O(n^2); of course, per Introduction to Algorithms exercise 8.4-2, the insertion sort inside each bucket can be replaced with heap sort or quick sort, bringing the worst case down to O(n lg n).
Best case running time: O(n). Pseudo-code: see the sketch below.

Proof of correctness: for any A[i] <= A[j], if A[i] falls into bucket a and A[j] into bucket b, then a <= b; since the buckets are concatenated in order and each bucket is sorted, the output is sorted, which completes the proof.
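A minimal sketch under the [0, 1) uniformity assumption (Python's built-in list sort stands in for the per-bucket insertion sort):

```python
def bucket_sort(a):
    """Bucket sort for inputs assumed uniform over [0, 1).
    Average case O(n); the worst case is set by the per-bucket sort."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)   # bucket i covers [i/n, (i+1)/n)
    out = []
    for b in buckets:
        b.sort()    # stands in for insertion sort; heapsort or quicksort
                    # here bounds the worst case at O(n lg n) (ex. 8.4-2)
        out.extend(b)
    return out
```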