Look at the sort from different angles

Last Update:2017-10-13 Source: Internet

Author: User

Tags index sort


1. Classification by sorting characteristics

First, sorting algorithms can be divided according to the basic operation they rely on:

(1) Insertion sort

A. Direct insertion sort (insertion sort) O(n^2) (stable)

B. Binary insertion sort (binary insertion sort) (stable, provided a record is inserted after any existing records with an equal key)

C. Shell sort (Shell sort) (unstable)

(2) Exchange sort

A. Bubble sort (bubble sort) O(n^2) (stable)

B. Quick sort (quick sort) O(n log n) on average (unstable)

(3) Selection sort

A. Direct selection sort (selection sort) O(n^2) (unstable)

B. Tournament sort (tournament sort) O(n log n) (unstable)

C. Heap sort (heap sort) O(n log n) (unstable)

(4) Merge sort (merge sort) O(n log n) (stable). Stability also depends on the algorithm used to sort each segment. For example, rather than merging all the way up from segments of length 2, each longer segment can first be sorted with a stable direct insertion sort and the segments then merged.

(5) Radix sort (radix sort) O(d(n + radix)) (stable, as long as each digit pass preserves the order of equal digits)

2. Features of each sorting algorithm:

(1) Bubble sort: In the best case, when the sequence is already in ascending order, only n-1 comparisons are needed (the results for the 100K ascending sequence confirm this).

But in the worst case, when the sequence is in reverse order (or the smallest value sits at the end), the sinking version of the algorithm requires n(n-1)/2 comparisons.

So in general, and especially on reverse-ordered input, its performance is poor. It is a sorting algorithm that is very sensitive to the initial order of the data.
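The two comparison counts can be checked with a minimal sketch of the sinking bubble sort with early exit. This is not the program the article measured; the function name `bubble_sort` and the comparison counter are assumptions made for illustration.

```python
def bubble_sort(items):
    """Sort a copy of items; return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    # After each pass the largest remaining value has sunk to position `end`.
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            # A pass without exchanges means the data is sorted: stop early.
            break
    return data, comparisons


print(bubble_sort(range(1, 11))[1])      # ascending input, n = 10: 9 comparisons
print(bubble_sort(range(10, 0, -1))[1])  # descending input, n = 10: 45 comparisons
```

With n = 10 the ascending input costs n-1 = 9 comparisons and the descending input costs n(n-1)/2 = 45.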

Bubble sort 2: This is an improved bubble sort, commonly known as cocktail sort, that alternates a sinking pass with a floating pass. Its best and worst cases match those of bubble sort, but it is generally faster, because alternating directions avoids the large number of comparisons that a single out-of-place value can cause. For example, on (2, 3, 4, ..., n-1, n, 1) bubble sort requires n(n-1)/2 comparisons, whereas this sort needs only 3 passes and (n-1) + (n-2) + (n-3) comparisons in total: the first pass moves 1 up by one position, the second pass moves 1 to the front, and the third pass finds no exchanges, so the sequence is sorted and the algorithm ends.

But it is also very sensitive to the order of the data, and is only suitable for data that is already nearly sorted.
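The three-pass example can be reproduced with a short sketch of the bidirectional variant. The name `cocktail_sort` and the counter are illustrative assumptions, not taken from the original program.

```python
def cocktail_sort(items):
    """Bidirectional bubble sort; return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    lo, hi = 0, len(data) - 1
    while lo < hi:
        swapped = False
        for i in range(lo, hi):            # sinking pass: largest value moves to hi
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        hi -= 1
        if not swapped:
            break
        swapped = False
        for i in range(hi, lo, -1):        # floating pass: smallest value moves to lo
            comparisons += 1
            if data[i - 1] > data[i]:
                data[i - 1], data[i] = data[i], data[i - 1]
                swapped = True
        lo += 1
        if not swapped:
            break
    return data, comparisons


n = 10
result, count = cocktail_sort(list(range(2, n + 1)) + [1])
print(count)  # (n-1) + (n-2) + (n-3) = 24 for n = 10
```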

(2) Quick sort: This is also an improvement on bubble sort. A single exchange can eliminate several inversions at once, which reduces the number of scans and data exchanges that inversions would otherwise cost.

In the best case its time complexity is O(n log2 n). That is the case in which every division splits the sequence evenly into two subsequences.

But in the worst case its time complexity is O(n^2).

That is the case in which every division of a subsequence of length m leaves one part empty and the other with m-1 elements. (The 100K ascending and descending inputs in the program hit exactly this case; if the program instead took the middle element of each sequence as the dividing point, ascending and descending input would become the best case.)

In the 100K ascending-order results, quick sort is slower than bubble sort, mainly because bubble sort can end early once a pass makes no exchanges. Some books explain the name "quick sort" by noting that, in theory, if the sequence is divided evenly every time, it is the fastest sorting algorithm. Although an even split is hard to guarantee, quick sort still has the best average performance among internal sorting algorithms based on key comparisons.
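The effect of the dividing point on ordered input can be seen in a simplified sketch. It is an out-of-place quick sort chosen for readability, not the in-place program the article measured; `quick_sort` and the `pick_middle` flag are hypothetical names, and one key comparison is charged per element per partition step, as an in-place partition would perform.

```python
def quick_sort(items, pick_middle=True):
    """Return (sorted_list, comparison_count) for a simple quick sort."""
    comparisons = 0

    def sort(seq):
        nonlocal comparisons
        if len(seq) <= 1:
            return seq
        pivot_index = len(seq) // 2 if pick_middle else 0
        pivot = seq[pivot_index]
        rest = seq[:pivot_index] + seq[pivot_index + 1:]
        comparisons += len(rest)  # charge one comparison with the pivot per key
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x >= pivot]
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(items)), comparisons


ascending = list(range(100))
print(quick_sort(ascending, pick_middle=False)[1])  # 4950 = n(n-1)/2: one part is always empty
print(quick_sort(ascending, pick_middle=True)[1])   # far fewer: every split is nearly even
```

The input is kept small on purpose: with the first element as the dividing point, the recursion depth on ordered input grows linearly with n.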

(3) Direct selection sort: A simple selection sort whose number of comparisons is fixed at n(n-1)/2. So whatever the initial order of the sequence, it never performs particularly well (the 100K ascending and descending results take almost the same time, differing only in the time spent moving data); it is insensitive to the order of the data. Although it makes many comparisons, it exchanges very little data (at most n-1 swaps), so in general it is faster than bubble sort.
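A small sketch makes the fixed comparison count and the low swap count visible. The name `selection_sort` and both counters are assumptions added for illustration.

```python
def selection_sort(items):
    """Return (sorted_list, comparison_count, swap_count)."""
    data = list(items)
    comparisons = swaps = 0
    n = len(data)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):      # always scans the whole unsorted tail
            comparisons += 1
            if data[j] < data[smallest]:
                smallest = j
        if smallest != i:              # at most one swap per position
            data[i], data[smallest] = data[smallest], data[i]
            swaps += 1
    return data, comparisons, swaps


print(selection_sort(range(1, 11))[1:])      # ascending:  (45, 0)
print(selection_sort(range(10, 0, -1))[1:])  # descending: (45, 5)
```

For n = 10 both inputs cost 45 comparisons; only the number of swaps differs.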

(4) Heap sort: It builds on direct selection sort by keeping the results of earlier comparisons in a heap, which greatly improves efficiency. The total number of comparisons needed to complete the sort is O(n log2 n). It is insensitive to the order of the data.

But heap sort has two steps: the first is to build the heap, and the second is to sort (repeatedly adjust the heap). It is therefore not well suited to small sequences, but performs well on larger ones.
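The two steps can be sketched as follows, assuming a max-heap stored in the list itself. The names `heap_sort` and `sift_down` are illustrative.

```python
def heap_sort(items):
    """Return a sorted copy of items using an in-list max-heap."""
    data = list(items)
    n = len(data)

    def sift_down(root, size):
        # Push data[root] down until neither child is larger than it.
        while True:
            child = 2 * root + 1
            if child >= size:
                return
            if child + 1 < size and data[child + 1] > data[child]:
                child += 1
            if data[root] >= data[child]:
                return
            data[root], data[child] = data[child], data[root]
            root = child

    # Step 1: build the heap, starting from the last parent node.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(root, n)
    # Step 2: move the maximum to the end, then re-adjust the shrunken heap.
    for end in range(n - 1, 0, -1):
        data[0], data[end] = data[end], data[0]
        sift_down(0, end)
    return data
```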

(5) Direct insertion sort: A simple insertion sort in which each comparison removes at most one inversion, so its efficiency is of the same order as bubble sort.

But it is still faster in practice, because bubble sort exchanges values while insertion sort only moves them, so direct insertion sort outperforms bubble sort.

Direct insertion sort is also very sensitive to the order of the data. On already-sorted input only n-1 comparisons are required; in the worst case n(n-1)/2 comparisons are required.
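A minimal sketch shows both the "move instead of exchange" step and the two comparison counts. The name `insertion_sort` and the counter are assumptions for illustration.

```python
def insertion_sort(items):
    """Return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break
            data[j + 1] = data[j]   # a single move, not a three-assignment exchange
            j -= 1
        data[j + 1] = key
    return data, comparisons


print(insertion_sort(range(1, 11))[1])      # ascending, n = 10: 9 comparisons
print(insertion_sort(range(10, 0, -1))[1])  # descending, n = 10: 45 comparisons
```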

(6) Shell sort: The choice of increments affects the efficiency of Shell sort.

However, whichever increments are chosen, the final increment must be 1, so that the last pass is a direct insertion sort. Compared with direct insertion sort, each comparison within a sub-table can remove several inversions from the whole table, which improves overall sorting performance. Shell sort is based on insertion sort, so it is also sensitive to the order of the data.
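As a sketch, here is Shell sort with the simple halving sequence n/2, n/4, ..., 1. The increment sequence is an assumption (the article does not say which one its program used); it is chosen only because it always ends with an increment of 1.

```python
def shell_sort(items):
    """Return a sorted copy using gapped insertion sort with halving gaps."""
    data = list(items)
    gap = len(data) // 2
    while gap >= 1:
        # Insertion sort on each sub-table of elements that are `gap` apart.
        for i in range(gap, len(data)):
            key = data[i]
            j = i - gap
            while j >= 0 and data[j] > key:
                data[j + gap] = data[j]   # one move can skip over `gap` positions
                j -= gap
            data[j + gap] = key
        gap //= 2                          # the last pass runs with gap == 1
    return data
```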

(7) Merge sort: Merge sort is not an in-place sort; it needs as much auxiliary space as the sequence being sorted. It is unmatched for merging two already-ordered sequences.

Its time complexity is O(n log2 n) in both the best and worst cases, and it is not sensitive to the order of the data. If each record is large, the extra space makes it unsuitable, but switching to sorting an index of the records works very well.
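The merge step and the auxiliary space it needs can be sketched like this. It is a top-down version written for clarity; the names `merge` and `merge_sort` are illustrative.

```python
def merge(left, right):
    """Merge two ordered lists into a new ordered list."""
    merged = []                      # auxiliary space as large as both inputs
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:                        # on ties take from the left: keeps the sort stable
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of items."""
    data = list(items)
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    return merge(merge_sort(data[:mid]), merge_sort(data[mid:]))


print(merge([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```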

(8) Radix sort: The program decomposes each value into decimal digits and allocates all the space at once, so it needs a lot of auxiliary space (10*n + 10).

(Other decompositions are possible, such as one byte at a time; with linked lists the auxiliary space is only n + 256.)

The running time of radix sort is linear, that is, O(n), for keys of fixed length. This makes radix sort very attractive, but it is not an in-place sort; if each record is large, it is better to sort an index instead.

But radix sort has a precondition: the key must be decomposable into digits, as integers and strings are; floating-point keys cannot be used directly.
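The byte-wise decomposition can be sketched as a least-significant-digit radix sort. This is a sketch under stated assumptions: keys are non-negative integers below 256**key_bytes, Python lists stand in for the linked lists mentioned above, and `radix_sort` and `key_bytes` are hypothetical names.

```python
def radix_sort(values, key_bytes=4):
    """LSD radix sort of non-negative integers smaller than 256**key_bytes."""
    data = list(values)
    for shift in range(0, 8 * key_bytes, 8):      # one pass per byte, lowest first
        buckets = [[] for _ in range(256)]        # one bucket per byte value
        for value in data:
            buckets[(value >> shift) & 0xFF].append(value)
        # Collect buckets in order; appending preserves arrival order (stable pass).
        data = [value for bucket in buckets for value in bucket]
    return data


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass touches every key once, so the total work is proportional to key_bytes * (n + 256).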

Summary: Comparing the methods above, direct insertion is the best of the simple sorts and quick sort is the fastest overall; when the file is already in ascending order, direct insertion and bubble sort are the best.

3. Divide the sorting into four categories by average time

(1) Quadratic order (O(n^2)) sorts, generally called simple sorts, such as direct insertion, direct selection, and bubble sort;

(2) Linearithmic order (O(n lg n)) sorts, such as quick sort, heap sort, and merge sort;

(3) O(n^(1+ε)) order sorts, where ε is a constant between 0 and 1, i.e. 0 < ε < 1, such as Shell sort;

(4) Linear order (O(n)) sorts, such as bucket sort, bin sort, and radix sort.

4. Factors that affect the sorting effect

Because different sorting methods adapt to different application environments and requirements, the following factors should be taken into consideration when choosing the appropriate sorting method:

① the number n of records to be sorted;

② the size of each record;

③ the structure of the key and its initial state;

④ the requirement for stability;

⑤ the capabilities of the language tools;

⑥ the storage structure;

⑦ the time complexity and auxiliary space complexity.

5. Selection of sorting methods under different conditions

(1) If n is small (for example, n ≤ 50), direct insertion or direct selection can be used.

When the records are small, direct insertion sort is better; otherwise direct selection sort is preferable, because it moves fewer records than direct insertion.

(2) If the file is initially almost sorted, direct insertion, bubble sort, or randomized quick sort is appropriate;

(3) If n is large, a method with time complexity O(n lg n) should be used: quick sort, heap sort, or merge sort.

Quick sort is currently considered the best comparison-based internal sorting method; when the keys to be sorted are randomly distributed, its average time is the shortest.

Heap sort needs less auxiliary space than quick sort and does not have the worst case that quick sort can run into. Both of these sorts are unstable.

If stability is required, merge sort can be chosen.

However, the pairwise (two-way) merge sort that starts from single records, as introduced in this chapter, is not recommended; it is usually better combined with direct insertion sort.

First obtain longer ordered sub-files with direct insertion sort, then merge them pairwise.

Since direct insertion sort is stable, the improved merge sort is still stable.
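The combined method can be sketched as follows. The block length `run=16` and the name `hybrid_merge_sort` are assumptions made for illustration; the article does not specify a sub-file length.

```python
def hybrid_merge_sort(items, run=16):
    """Insertion-sort blocks of `run` records, then merge the blocks pairwise."""
    data = list(items)
    n = len(data)
    # Step 1: direct insertion sort turns each block into an ordered sub-file.
    for start in range(0, n, run):
        end = min(start + run, n)
        for i in range(start + 1, end):
            key = data[i]
            j = i - 1
            while j >= start and data[j] > key:
                data[j + 1] = data[j]
                j -= 1
            data[j + 1] = key
    # Step 2: merge neighbouring ordered sub-files, doubling their width each round.
    width = run
    while width < n:
        for start in range(0, n, 2 * width):
            mid = min(start + width, n)
            end = min(start + 2 * width, n)
            left, right = data[start:mid], data[mid:end]
            i = j = 0
            k = start
            while i < len(left) and j < len(right):
                if right[j] < left[i]:
                    data[k] = right[j]
                    j += 1
                else:                 # ties come from the left block: order is preserved
                    data[k] = left[i]
                    i += 1
                k += 1
            data[k:end] = left[i:] + right[j:]
        width *= 2
    return data
```

Both steps keep equal keys in their original order, which is why the combination remains stable.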

6. Stability of sorting algorithms

1) Stable: If several records have the same sort key and their relative order is unchanged after sorting, the sorting algorithm is called stable.

Insertion sort, bubble sort, merge sort, and distribution sorts (bucket sort, radix sort) are all stable sorting algorithms.

2) Unstable: Otherwise the algorithm is called unstable.

Direct selection sort, heap sort, Shell sort, and quick sort are all unstable sorting algorithms.
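The difference shows up on a tiny example in which two records share a key. The sketch below sorts (key, label) pairs by key only; the function names and the sample records are made up for illustration.

```python
def insertion_sort_by_key(records):
    """Stable: a record never moves past another record with an equal key."""
    data = list(records)
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        while j >= 0 and data[j][0] > current[0]:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = current
    return data


def selection_sort_by_key(records):
    """Unstable: the long-distance swap can jump a record over its equals."""
    data = list(records)
    for i in range(len(data) - 1):
        smallest = i
        for j in range(i + 1, len(data)):
            if data[j][0] < data[smallest][0]:
                smallest = j
        data[i], data[smallest] = data[smallest], data[i]
    return data


records = [(2, "a"), (2, "b"), (1, "c")]
print(insertion_sort_by_key(records))  # [(1, 'c'), (2, 'a'), (2, 'b')]
print(selection_sort_by_key(records))  # [(1, 'c'), (2, 'b'), (2, 'a')]
```

Insertion sort keeps "a" ahead of "b", while selection sort swaps (2, "a") to the end and reverses them.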


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
