Features, time complexity, and stability of various sorting algorithms

Selection sort, quicksort, Shell sort, and heap sort are not stable sorting algorithms; bubble sort, insertion sort, merge sort, and radix sort are stable sorting algorithms.


Bubble sort:
This is the most primitive, best-known, and slowest algorithm. It gets its name because its operation looks like bubbles rising through water. Its complexity is O(n^2). When the data is already in order, no exchanges occur; if the algorithm stops after a pass with no swaps, the best case is O(n).
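
As an illustration, here is a minimal bubble sort sketch in Python (not code from the original article) with the swap-free early exit just described:

```python
# A minimal bubble sort sketch with the early-exit optimization:
# a full pass with no swaps means the data is already sorted.
def bubble_sort(a):
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:          # strict >: equal elements never swap
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                  # already ordered: best case O(n)
            break
    return a

print(bubble_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```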

Direct insertion sort: O(n^2)

Selection sort: O(n^2)

Quicksort: the average time complexity is O(n*log2(n)), the most efficient among the internal sorting methods; in most cases it is the best choice.

Merge sort: O(n*log2(n))

Heap sort: O(n*log2(n))

Shell sort: roughly O(n^1.2) (the exact bound depends on the gap sequence).


I will not walk through the algorithm's behavior here, because it is simple; let's analyze its complexity directly:

First, consider the ideal situation.
1. The size of the array is a power of 2, so every split divides it exactly in half. Say n = 2^k, that is, k = log2(n).
2. The pivot selected each time is exactly the median, so each partition splits the array into two equal halves.
The first level of recursion scans n elements; the second level scans 2*(n/2); and so on.
So the total work is n + 2*(n/2) + 4*(n/4) + ... + n*(n/n) = n + n + ... + n = k*n = n*log2(n).
Therefore, the algorithm's complexity is O(n*log2(n)).
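
Equivalently, the same bound falls out of the standard divide-and-conquer recurrence (a sketch assuming the ideal halving above, with c the per-element partition cost):

```latex
T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = \cdots
     = 2^k T(1) + kcn
     = O(n \log_2 n), \qquad \text{where } k = \log_2 n.
```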
Any other case is only worse than this one. The worst case is that the pivot selected each time is the minimum or maximum value, in which case quicksort degenerates to O(n^2), like exchange (bubble) sort (and because recursion is used, the situation is even worse). But how likely is that? You usually do not need to worry: practice has shown that quicksort is the best choice in most cases.
If you are worried about this problem, you can use heap sort, whose O(n*log2(n)) bound holds even in the worst case; but in general it is slower than quicksort, because the heap has to be rebuilt repeatedly. (Note that this worst-case guarantee is unrelated to sorting stability; as discussed below, heap sort is not a stable sort.)

 

I have taken several written tests over the past few days and repeatedly ran into questions about the stability of common sorting algorithms, often as multiple-choice questions. They are not easy to answer for people like me who do not remember the results precisely. Of course, if you memorized from a data-structures textbook which algorithms are stable and which are not before the test, they should be easy.

This article is intended for those who cannot remember these facts, or who want to understand why each algorithm is or is not stable.

First, we should all know what stability of a sorting algorithm means: in general, it guarantees that two equal elements keep the same relative order before and after sorting. Put simply, if a[i] = a[j] and a[i] appears before a[j] in the input, then after sorting a[i] is still before a[j].

Next, the benefits of stability. If the sorting algorithm is stable, you can sort by one key and then by another, and the result of the first sort is preserved within groups that are equal under the second key. Radix sort works exactly this way: sort by the low digit first, then by the high digit; elements with equal high digits keep their low-digit order. In addition, if a comparison-based sorting algorithm is stable, the number of element exchanges may be smaller (a personal impression, not verified).
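
As a small illustration with hypothetical data (relying on the fact that Python's sorted() is stable), sorting by a secondary key and then by a primary key yields records ordered by both:

```python
# Chaining stable sorts: the second sort preserves the first within ties.
records = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]

# Pass 1: sort by the secondary key (the number).
by_num = sorted(records, key=lambda r: r[1])
# Pass 2: sort by the primary key (the letter); equal letters keep numeric order.
by_letter_then_num = sorted(by_num, key=lambda r: r[0])

print(by_letter_then_num)  # [('a', 1), ('a', 2), ('b', 1), ('b', 2)]
```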

Now back to the topic: let's analyze the stability of the common sorting algorithms, giving a brief reason for each.

(1) Bubble sort

Bubble sort moves small elements forward (or large elements backward) by comparing adjacent pairs, and exchanges happen only between those two adjacent elements. So if two elements are equal, nobody will bother to swap them; and if two equal elements are not adjacent, then even when earlier swaps bring them next to each other, their relative order still does not change. Therefore bubble sort is a stable sorting algorithm.

(2) Selection sort

Selection sort picks the smallest remaining element for each position: the smallest for the first position, the smallest of the remaining elements for the second, and so on; the n-th position needs no selection, because only the largest element is left. If the minimum found lies behind an element equal to the element currently in place, the swap destroys stability. For example, in the sequence 5 8 5 2 9, the first pass swaps the first 5 with the 2, so the relative order of the two 5s in the original sequence is broken. Therefore selection sort is not a stable sorting algorithm.
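
A minimal selection sort sketch (the tags on the two 5s are added here only to make the reordering visible):

```python
# Selection sort; the long-distance swap in each pass can break stability.
def selection_sort(a):
    a = list(a)
    for i in range(len(a) - 1):
        m = i
        for j in range(i + 1, len(a)):
            if a[j][0] < a[m][0]:
                m = j
        a[i], a[m] = a[m], a[i]  # this swap can jump over an equal element
    return a

data = [(5, "first"), (8, ""), (5, "second"), (2, ""), (9, "")]
print(selection_sort(data))
# First pass swaps (5, 'first') with (2, ''), putting it behind (5, 'second'):
# [(2, ''), (5, 'second'), (5, 'first'), (8, ''), (9, '')]
```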

(3) Insertion sort
Insertion sort inserts one element at a time into an already ordered small sequence. At the beginning, this ordered sequence contains only one element, the first one. Comparison starts from the end of the ordered part: the element to be inserted is compared with the largest ordered element; if it is larger, it is placed directly after it, otherwise the search continues toward the front until the insertion position is found. If an element equal to the one being inserted is encountered, the new element is placed behind it. The relative order of equal elements is therefore unchanged, and the order they had in the original unordered sequence is the order after sorting, so insertion sort is stable.
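
A minimal insertion sort sketch; the strict > comparison is what stops the scan at the first element less than or equal to the new one, keeping equal elements in order:

```python
# Insertion sort; equal elements are never shifted past each other.
def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:  # stop at the first element <= x
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x                # x lands after any element equal to it
    return a

print(insertion_sort([5, 2, 4, 2, 1]))  # [1, 2, 2, 4, 5]
```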

(4) Quicksort
Quicksort scans from two directions. The left index i moves right while a[i] <= a[center_index], where center_index is the index of the pivot element, usually taken as the 0th element of the array; the right index j moves left while a[j] > a[center_index]. When both i and j have stopped and i <= j, a[i] and a[j] are exchanged, and the process repeats until i > j. Finally a[j] is exchanged with a[center_index], completing one partition. That last exchange of the pivot with a[j] can easily break the stability of the elements in between. For example, take the sequence 5 3 3 4 3 8 9 10 11: exchanging the pivot 5 with the 3 at position 5 (indexing from 1) moves that 3 in front of the other two 3s. So quicksort is an unstable sorting algorithm; the instability arises at the moment the pivot is exchanged with a[j].
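
A minimal quicksort sketch in the partition style described above (pivot = first element; one possible scheme among several):

```python
# Quicksort with a two-index partition; the final pivot swap breaks stability.
def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= hi and a[i] <= pivot:  # left index moves right
            i += 1
        while a[j] > pivot:               # right index moves left
            j -= 1
        if i > j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]             # the swap that can break stability
    quicksort(a, lo, j - 1)
    quicksort(a, j + 1, hi)

data = [5, 3, 3, 4, 3, 8, 9, 10, 11]
quicksort(data)
print(data)  # [3, 3, 3, 4, 5, 8, 9, 10, 11]
```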

(5) Merge sort
Merge sort recursively splits the sequence into short sequences; the recursion bottoms out when a short sequence has only one element (which is trivially ordered) or two elements (one comparison and possibly one exchange). The ordered segments are then merged into longer ordered sequences until the whole original sequence is sorted. With one or two elements, a single element is never exchanged, and nobody would deliberately swap two equal elements, so stability is not damaged there. Does merging the short ordered sequences damage stability? No: during the merge, whenever the two current elements are equal, we can take the element from the earlier sequence first, which preserves stability. Therefore merge sort is also a stable sorting algorithm.
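
A minimal sketch of the merge step; the <= comparison takes from the left (earlier) run on ties, which is exactly what preserves stability:

```python
# Merge two ordered runs; on equal keys, prefer the left run.
def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:           # tie -> take the earlier element
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([5, 3, 8, 3, 1]))  # [1, 3, 3, 5, 8]
```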

(6) Radix sort
Radix sort sorts by the lowest digit first and collects the results, then sorts by the next digit and collects again, and so on up to the highest digit. Sometimes the keys have attributes with a priority order: sort by the low-priority attribute first, then by the high-priority one; the final order is by high priority first, and within equal high-priority keys, by low priority. Radix sort is built from separate, stable per-digit sorting and collecting passes, so it is a stable sorting algorithm.
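
A minimal LSD (least-significant-digit-first) sketch for non-negative integers in base 10; appending to buckets in input order is the stable per-digit pass:

```python
# LSD radix sort, base 10; each pass is stable, so the whole sort is stable.
def radix_sort(a):
    if not a:
        return a
    exp = 1
    while max(a) // exp > 0:
        buckets = [[] for _ in range(10)]
        for x in a:                          # equal digits keep their order
            buckets[(x // exp) % 10].append(x)
        a = [x for b in buckets for x in b]  # collect
        exp *= 10
    return a

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```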

(7) Shell sort (Shell)
Shell sort performs insertion sorts on the elements at several different gap sizes. When the elements are completely unordered at the start, the gap is largest, so each insertion-sort subsequence contains very few elements and runs very fast; by the time the elements are nearly ordered, the gap is very small, and insertion sort is very efficient on nearly ordered sequences. So the overall time complexity of Shell sort is better than O(n^2). We know from the multiple insertion passes that a single insertion sort is stable and does not change the relative order of equal elements; however, across the different gap passes, equal elements may each move within their own subsequence, so their relative order can end up disturbed. Therefore Shell sort is unstable.
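
A minimal Shell sort sketch using the simple gap sequence n/2, n/4, ..., 1 (one common choice among many); each pass is a gap-insertion sort, but separate passes can reorder equal keys:

```python
# Shell sort: gapped insertion sorts with a shrinking gap.
def shell_sort(a):
    a = list(a)
    gap = len(a) // 2
    while gap > 0:
        for i in range(gap, len(a)):
            x = a[i]
            j = i
            while j >= gap and a[j - gap] > x:
                a[j] = a[j - gap]   # shift within this gap's subsequence
                j -= gap
            a[j] = x
        gap //= 2                   # the final pass (gap = 1) is insertion sort
    return a

print(shell_sort([9, 8, 3, 7, 5, 6, 4, 1]))  # [1, 3, 4, 5, 6, 7, 8, 9]
```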

(8) Heap sort
We know that in the heap structure, node i has children 2i and 2i+1 (indexing from 1); a max-heap requires the parent to be greater than or equal to its two children, and a min-heap requires the parent to be less than or equal to its two children. For a sequence of length n, heapification starts from node n/2 and, for each parent, selects the largest (max-heap) or smallest (min-heap) among it and its children; a choice among these three elements alone would not break stability. But when elements are selected for the parents n/2 - 1, n/2 - 2, ..., 1, stability can be broken: the n/2-th parent may swap with a later element, while the (n/2 - 1)-th parent does not swap with a later equal element, so the relative order of the two identical elements is destroyed. Therefore heap sort is not a stable sorting algorithm.
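
A minimal heap sort sketch (0-indexed max-heap, so the children of node i are 2i+1 and 2i+2); the long-distance sift-down swaps are what break stability:

```python
# Heap sort: build a max-heap, then repeatedly move the max to the end.
def sift_down(a, i, n):
    while 2 * i + 1 < n:
        c = 2 * i + 1
        if c + 1 < n and a[c + 1] > a[c]:
            c += 1                       # pick the larger child
        if a[i] >= a[c]:
            break
        a[i], a[c] = a[c], a[i]          # long-distance swap -> not stable
        i = c

def heap_sort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # heapify from the last parent down
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]      # move the max to the end
        sift_down(a, 0, end)             # re-heapify the shortened prefix

data = [4, 10, 3, 5, 1]
heap_sort(data)
print(data)  # [1, 3, 4, 5, 10]
```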

1. Quicksort (Quick Sort)

Quicksort is a heavily recursive algorithm that sorts the data within each partition separately; in essence, it can be viewed as an in-place counterpart of merge sort. Quicksort consists of the following four steps.

(1) If the sequence contains no more than one element, return.
(2) Generally, take the leftmost value of the sequence as the pivot.
(3) Partition the sequence into two parts: one with values greater than the pivot, the other with values less than the pivot.
(4) Recursively sort the two parts.

Quicksort is faster than most sorting algorithms. Although faster algorithms can be written for some special cases, in general nothing beats it. Quicksort is recursive, so it is not a good choice on machines with very limited memory.

2. Merge sort (mergesort)

Merge sort first splits the sequence to be sorted: from 1 group into 2, from 2 into 4, and so on, until each group has only one element, at which point each group is sorted. The groups are then merged back, pairwise and in order, into the original sequence, and all the data ends up sorted. Merge sort is a little faster than heap sort, but it needs more memory, because it requires an extra array.

3. Heap sort (heapsort)

Heap sort is suitable for scenarios with very large amounts of data (millions of records).

Heap sort needs neither deep recursion nor large temporary arrays, which makes it a good fit for very large sequences: with millions of records or more, quicksort and merge sort, which are designed recursively, may cause a stack overflow.

Heap sort builds all the data into a heap with the largest element at the top, then exchanges the top element with the last element of the sequence, rebuilds the heap over the remaining elements, exchanges again, and repeats until all the data is sorted.

4. Shell sort (shellsort)

Shell sort divides the data into groups by a gap, sorts each group first, and then performs one final insertion sort over all the elements, which reduces the number of exchanges and moves. Its average efficiency is O(n*log n). The choice of gap sequence has an important effect on the algorithm; the D. E. Knuth gap sequence is commonly used, as sketched below.
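
A small sketch of Knuth's gap sequence, commonly given as h = 3h + 1 (1, 4, 13, 40, ...), starting from the largest gap below n/3; this would replace the simple n/2 gaps in the earlier Shell sort sketch:

```python
# Knuth's gap sequence for Shell sort: 1, 4, 13, 40, ..., used largest-first.
def knuth_gaps(n):
    h = 1
    while h < n // 3:
        h = 3 * h + 1
    gaps = []
    while h >= 1:
        gaps.append(h)
        h //= 3        # 3h + 1 inverted: integer division recovers each step
    return gaps

print(knuth_gaps(100))  # [40, 13, 4, 1]
```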

Shell sort is about five times faster than bubble sort and roughly twice as fast as insertion sort, but it is much slower than quicksort, merge sort, and heap sort. It is, however, relatively simple, and it suits scenarios where the data volume is below about 5000 and speed is not critical. It is very good for repeatedly sorting small amounts of data.

5. Insertion sort (insertsort)

Insertion sort works by inserting the values of the sequence, one by one, into an already sorted subsequence until the end of the sequence is reached. It is an improvement over bubble sort and about twice as fast. In general, do not use insertion sort when the data volume exceeds about 1000, or for repeatedly sorting sequences of more than about 200 items.

6. Bubble sort (bubblesort)

Bubble sort is the slowest sorting algorithm, and in practice it is the least efficient. It compares the elements of the array one adjacent pair at a time, letting large values sink to the end and small values rise to the front. It is an O(n^2) algorithm.

7. Exchange sort and selection sort (exchangesort and selectsort)

Both are exchange-style sorting algorithms with O(n^2) efficiency. In practice they stand in the same position as bubble sort: they represent only the early stages of sorting-algorithm development and are rarely used in practice.

8. Radix sort (radixsort)

Radix sort does not follow the same route as the general comparison-based sorting algorithms. It is a novel approach, but it can only be used directly on integers. To apply the same method to floating-point numbers we would have to understand the floating-point storage format, map each float to an integer in a special way, and then map it back, which is very troublesome; for that reason it is not used much. In addition, and most importantly, such algorithms also require a large amount of extra storage.

9. Summary

The following table roughly summarizes the features of the common sorting algorithms.

Sorting method   Average time     Worst case          Stability   Extra space   Remarks
Bubble           O(n^2)           O(n^2)              Stable      O(1)          Better when n is small
Exchange         O(n^2)           O(n^2)              Unstable    O(1)          Better when n is small
Selection        O(n^2)           O(n^2)              Unstable    O(1)          Better when n is small
Insertion        O(n^2)           O(n^2)              Stable      O(1)          Better when mostly sorted
Radix            O(n*log_R(B))    O(n*log_R(B))       Stable      O(n)          B is the largest key value; R is the radix (ones, tens, hundreds)
Shell            O(n*log n)       O(n^s), 1 < s < 2   Unstable    O(1)          s depends on the chosen gap sequence
Quick            O(n*log n)       O(n^2)              Unstable    O(log n)      Better when n is large
Merge            O(n*log n)       O(n*log n)          Stable      O(n)          Better when n is large
Heap             O(n*log n)       O(n*log n)          Unstable    O(1)          Better when n is large
