A Summary of Sorting Algorithms


Original: http://hxraid.javaeye.com/blog/646300

Let's first summarize the performance of the ten sorting methods covered in parts 1-4 of the Sort Structure series (on an n-keyword sequence):


1. The common O(n^2)-class sorting algorithms. We sort sequences of random numbers generated with the C++ rand() function and measure the elapsed time.

Ten groups of sequences were generated, of sizes 10,000, 30,000, 50,000, ..., 190,000 (increments of 20,000). The timing statistics for direct insertion sort, binary insertion sort, bubble sort, and simple selection sort are charted below (plots produced with SPSS).

As the figure shows, bubble sort is by far the most time-consuming, while the other three take roughly the same time. Binary insertion sort pulls ahead of direct insertion sort and simple selection sort once the data size reaches 190,000. Also, for small data sizes, the insertion sorts perform better than selection sort.

Analysis of the common algorithms: when the data size is small (under 90,000), binary insertion, direct insertion, and simple selection take roughly equal time; binary insertion is better when the data volume is large; and bubble sort is the most expensive by a wide margin. In addition, these algorithms mostly compare and move adjacent elements, which is why insertion sort and bubble sort are stable (simple selection sort, which swaps non-adjacent elements, is the exception).

2. The advanced O(n log n)-class sorting algorithms. These are much faster than the common algorithms, so rather than compare them on the same small inputs, we tested 15 groups of data ranging from 100,000 to 1,400,000 elements (increments of 100,000). The timing comparison is shown below (plots produced with SPSS):

The figure shows that the advanced sorts cost far less time than the ordinary algorithms, and that there are real differences among the advanced algorithms themselves. Quicksort is clearly the best; next come merge sort and Shell sort; heap sort is slightly worse; and tree selection sort is the worst.

Analysis of the advanced algorithms:

(1) In terms of time performance, Shell sort, quicksort, tree selection sort, heap sort, and merge sort are all advanced sorting methods, with running times far below the O(n^2) level.


(2) Among the advanced algorithms, quicksort is the most efficient. But its weakness is just as obvious: on an already-sorted sequence it degenerates into something like bubble sort, with time complexity approaching O(n^2).


(3) Shell sort's performance is a pleasant surprise. The efficiency of this diminishing-increment insertion sort demonstrates the point fully: on a basically ordered sequence, direct insertion sort achieves astonishing efficiency. However, there is still no fully satisfactory answer to how the increments should be chosen, even though the choice of increments directly affects the sorting efficiency.


(4) Merge sort is very efficient; on large data sets it outperforms both Shell sort and heap sort.


(5) Heap sort still performs well on small data sets, but as the size grows its running time starts to fall behind the two sorts above.


(6) Tree selection sort is not a good advanced sorting method: the larger the data set, the higher its time cost. It also needs O(n) auxiliary space. Consider that sorting 1,400,000 records would require an extra 1,400,000 slots, which is hard to bear.


(7) Most of the advanced sorts reduce the number of comparisons by comparing non-adjacent elements, but in doing so they sacrifice stability.

In general, there is no single "best" sorting algorithm; a "good" algorithm must be chosen for the characteristics of the sequence itself. Some guidelines:

(1) If the data set is small and the sequence is basically ordered, direct insertion sort is absolutely the best policy. Don't underestimate it just because of the O(n^2) label.

(2) If the data set is not too big, fits entirely in memory, and the sequence is disordered (the more chaotic the better), quicksort is always a good choice; the O(log n) of extra stack space is well worth paying.

(3) For massive data sets that cannot fit in main memory and must live on disk, merge sort is a comparatively good algorithm.

Note: the data in the two charts above were measured on a Pentium 4 3.06 GHz CPU with the machine otherwise idle (0% CPU load). The C source code for my tests of the nine sorting algorithms is available below for download.

★ On the O(n log n) lower bound for sorting

Here's a question: is O(n log n) really the best possible bound for sorting algorithms?

Not in general. But if the sorting algorithm is based on comparing keywords, then the best worst-case running time that can truly be achieved is O(n log n). (The best case needs no discussion: if the sequence is already basically ordered, direct insertion sort makes very few comparisons.)

Here is a sketch of the proof (note: the basic operation of these sorting algorithms is the comparison, so the time is mainly spent on the number of comparisons). Suppose there are three keywords K1, K2, K3. The figure below shows the decision tree covering every possible ordering of the three keywords; each internal node of the tree performs one necessary comparison.

A sequence of three keywords has only the 6 possible orderings described by the leaf nodes above. Every comparison in the decision tree is necessary, so this tree fully describes any sort-by-comparison process. Moreover, the number of comparisons needed to turn a given input into the sorted sequence is exactly the length of the path from the root to the corresponding leaf. Therefore sorting three keywords requires at least 2 comparisons and at most 3.

Generalizing to a sequence of n keywords: there are n! possible orderings, so the decision tree has n! leaf nodes. We know that a binary tree of height h has at most 2^(h-1) leaves, so with n! leaves the height is at least log2(n!) + 1. In other words, the decision tree describing the sorting of n records must contain a path of length log2(n!) + 1. By Stirling's formula (a high-precision approximation of n!), log2(n!) ≈ n·log2(n). Therefore the minimum worst-case number of comparisons is on the order of n·log n.
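The counting argument can be written out explicitly (a standard derivation, spelled out here beyond what the original post shows):

```latex
\[
  n! \;\le\; 2^{\,h-1}
  \quad\Longrightarrow\quad
  h \;\ge\; \log_2(n!) + 1 ,
\]
so some input forces at least $\lceil \log_2 n! \rceil$ comparisons.
By Stirling's approximation,
\[
  \log_2(n!) \;=\; n\log_2 n \;-\; n\log_2 e \;+\; O(\log n)
  \;=\; \Theta(n \log n),
\]
which gives the $\Omega(n \log n)$ worst-case lower bound for any
comparison-based sort.
```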

So the lower bound on the worst-case time complexity of comparison-based sorting algorithms really is O(n log n); within this model the time cost cannot be reduced further. For sorting algorithms that step outside the comparison model, see "bucket sort" and "radix sort".
