Comparison and Selection of Internal Sorting Methods

Source: Internet
Author: User

Sorting methods fall into four categories by average running time:

(1) Quadratic order, O(n^2)
These are generally called simple sorts, such as direct insertion sort, direct selection sort, and bubble sort.

(2) Linearithmic order, O(n lg n)
Examples are quicksort, heapsort, and merge sort.

(3) O(n^(1+ε)) order
Here ε is a constant between 0 and 1, that is, 0 < ε < 1. Shell sort is an example.

(4) Linear order, O(n)
Examples are bucket sort, box sort, and radix sort.
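
As an illustration of the linear-order category, here is a minimal sketch of least-significant-digit radix ("base") sorting. It assumes non-negative integer keys; the function name `radix_sort` and the `base` parameter are illustrative choices, not taken from the original text.

```python
def radix_sort(keys, base=10):
    """LSD radix sort for non-negative integers (illustrative sketch)."""
    if not keys:
        return []
    if min(keys) < 0:
        raise ValueError("this sketch assumes non-negative integer keys")
    result = list(keys)
    largest = max(result)
    place = 1
    while largest // place > 0:
        # Distribute the keys into `base` boxes by the current digit.
        boxes = [[] for _ in range(base)]
        for key in result:
            boxes[(key // place) % base].append(key)
        # Collect the boxes in order; appending keeps every pass stable.
        result = [key for box in boxes for key in box]
        place *= base
    return result
```

Each pass touches every key once, so with d digits the total work is proportional to d * (n + base), which is linear in n when d is a small constant.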

Comparison of sorting methods

Among the simple sorts, direct insertion sort is the best, and quicksort is the fastest method overall. When the file is already in ascending order, both direct insertion sort and bubble sort perform best.
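
The effect of the initial order on direct insertion sort can be checked by counting key comparisons. The sketch below is an assumption-level illustration (the function name and the returned counter are not from the original text): an input already in ascending order needs only n - 1 comparisons, while a reversed input needs n(n - 1)/2.

```python
def insertion_sort(records):
    """Sort a copy of records; return (sorted_list, key_comparisons)."""
    a = list(records)
    comparisons = 0
    for i in range(1, len(a)):
        current = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= current:
                break           # correct position found; equal keys keep their order
            a[j + 1] = a[j]     # shift the larger record one slot to the right
            j -= 1
        a[j + 1] = current
    return a, comparisons


ascending = list(range(8))
descending = list(range(8, 0, -1))
print(insertion_sort(ascending)[1])    # 7 comparisons for 8 records
print(insertion_sort(descending)[1])   # 28 comparisons for 8 records
```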

Factors Affecting sorting performance

Because different sorting methods suit different environments and requirements, the following factors should be considered when choosing one:
① Number of records to be sorted, n;
② Size of each record;
③ Structure of the key and its initial state;
④ Stability requirements;
⑤ Facilities of the programming language;
⑥ Storage structure;
⑦ Time and auxiliary-space complexity.

Selection of sorting methods under Different Conditions

(1) If n is small (say n ≤ 50), direct insertion sort or direct selection sort can be used.
When the records are small, direct insertion sort is the better choice; otherwise choose direct selection sort, because it moves fewer records than direct insertion sort.
(2) If the file is initially almost in order (ascending), choose direct insertion sort, bubble sort, or randomized quicksort.
(3) If n is large, use a method with O(n lg n) time complexity: quicksort, heapsort, or merge sort.
Quicksort is regarded as the best of the comparison-based internal sorts. When the keys to be sorted are randomly distributed, its average time is the shortest.
Heapsort needs less auxiliary space than quicksort and does not exhibit quicksort's worst case. Both of these sorts are unstable.
If stability is required, merge sort can be chosen. However, the merge sort that starts by merging single records pairwise, as presented in this chapter, is not recommended. It is usually combined with direct insertion sort: first use direct insertion sort to obtain longer ordered sub-files, then merge the sub-files pairwise. Because direct insertion sort is stable, the improved merge sort is still stable.
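
The combination of direct insertion sort and merging can be sketched as follows. This is a minimal illustration rather than the chapter's algorithm: the names `merge` and `hybrid_merge_sort`, the `key` parameter, and the default `run_length=16` are assumptions made for the example.

```python
def merge(left, right, key):
    """Stable merge: on equal keys the record from `left` goes first."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if key(right[j]) < key(left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def hybrid_merge_sort(records, key=lambda r: r, run_length=16):
    """Insertion-sort short runs, then merge neighbouring runs bottom-up."""
    a = list(records)
    n = len(a)
    # Phase 1: build ordered sub-files of `run_length` records by insertion.
    for lo in range(0, n, run_length):
        hi = min(lo + run_length, n)
        for i in range(lo + 1, hi):
            current = a[i]
            j = i - 1
            while j >= lo and key(a[j]) > key(current):
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = current
    # Phase 2: merge neighbouring runs, doubling the run width each pass.
    width = run_length
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            a[lo:hi] = merge(a[lo:mid], a[mid:hi], key)
        width *= 2
    return a
```

Both phases only move a record past another record whose key is strictly larger, so records with equal keys keep their original relative order.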

(4) In a comparison-based sorting method, each comparison of two keys has only two possible outcomes, so the comparison-and-decision process can be described by a binary tree.
When the n keys of a file are randomly distributed, any sorting algorithm that relies on comparisons needs time at least on the order of n lg n.
In box sort and radix sort, a single step has m possible outcomes, namely placing a record into one of m boxes, so in general these methods may sort n records in O(n) time. However, box sort and radix sort apply only to keys with a clear structure, such as strings and integers. When the keys range over an infinite set (real-number keys, for example), box sort and radix sort cannot be used, and only comparison-based sorting is possible.
If n is large and the keys have few digits and can be decomposed, radix sort is the better choice. Bucket sort places no requirement on the structure of the keys, but its average time reaches linear order only when the keys are randomly distributed; otherwise it degrades to quadratic order. Note also that the distribution sorts (box, bucket, and radix) assume that numeric keys are non-negative; otherwise mapping the keys to boxes (buckets) takes correspondingly more time.
(5) Some languages (such as Fortran, COBOL, or BASIC) provide neither pointers nor recursion, which complicates the implementation of merge sort and quicksort (both are easiest to write recursively), radix sort (which uses pointers), and similar algorithms. In that case, consider other sorting methods.
(6) The sorting algorithms given in this chapter store the input data in a vector. When the records are large, a linked list can be used as the storage structure to avoid spending a great deal of time moving records. Insertion sort, merge sort, and radix sort, for example, are easy to implement on a linked list and so reduce record movement. Some methods, however, such as quicksort and heapsort, are hard to implement on a linked list. In that case you can extract the keys into an index table and sort the index table. A simpler approach is to introduce an integer vector t as an auxiliary table. Before sorting, set t[i] = i (0 ≤ i < n); whenever the algorithm calls for exchanging R[i] and R[j], exchange t[i] and t[j] instead. When sorting ends, t gives the order of the records:
R[t[0]].key ≤ R[t[1]].key ≤ ... ≤ R[t[n-1]].key
If the final result is required to be:
R[0].key ≤ R[1].key ≤ ... ≤ R[n-1].key
then after sorting, rearrange the records in the order given by the auxiliary table. This rearrangement takes O(n) time.
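
The auxiliary-table idea can be sketched as below. The function names are hypothetical, and insertion sort is used on the table only to keep the example short; any sort could be applied to t instead. The second function performs the final rearrangement by following the cycles of t, so each record is moved a constant number of times.

```python
def sort_index_table(records, key):
    """Return t with key(records[t[0]]) <= key(records[t[1]]) <= ..."""
    t = list(range(len(records)))      # t[i] = i before sorting
    for i in range(1, len(t)):
        current = t[i]
        j = i - 1
        # Only the integers in t move; the records themselves stay put.
        while j >= 0 and key(records[t[j]]) > key(records[current]):
            t[j + 1] = t[j]
            j -= 1
        t[j + 1] = current
    return t


def rearrange_in_place(records, t):
    """Make records[i] the old records[t[i]]; t is reset to the identity."""
    for start in range(len(records)):
        if t[start] == start:
            continue                   # this record is already in place
        saved = records[start]
        i = start
        while t[i] != start:           # walk one cycle of the permutation
            records[i] = records[t[i]]
            nxt = t[i]
            t[i] = i
            i = nxt
        records[i] = saved
        t[i] = i
```

A possible use: call `t = sort_index_table(R, key)` to obtain the ordering without moving any record, and call `rearrange_in_place(R, t)` only if the records must physically end up in sorted order.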
