Overview
Sorting arranges the records of a file in ascending (or descending) order of their keywords. If no two records share a keyword, the sorted result is unique; otherwise it is not.
If the file to be sorted contains several records with equal keywords, and the relative order of those records is unchanged after sorting, the sorting method is stable; if the relative order of records with equal keywords can change, the method is unstable.
Note that the stability of a sorting algorithm is defined over all input instances: if even one possible input instance fails the stability requirement, the algorithm is unstable.
1. Insertion sort
The basic idea of insertion sort is that at each step one record to be sorted is inserted, according to its key value, into the proper position among the already-sorted records in front of it, until all records have been inserted. The main insertion sorts are direct insertion sort and Shell sort.
① Direct insertion sort (stable)
When the i-th record is inserted, R1, R2, ..., R(i-1) are already sorted; the key ki of the i-th record is compared with the keys of R(i-1), ..., R2, R1 one by one to find the proper position. For a file of n records, direct insertion sort performs n-1 insertion passes.
The code is as follows:
void dir_insert(int a[], int n)  // direct insertion sort
{
    int j, t;
    for (int i = 1; i < n; i++) {
        t = a[i];
        j = i - 1;
        while (j >= 0 && a[j] > t) {  // the j >= 0 guard keeps the scan in bounds
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = t;
    }
}
② Shell sort (unstable)
The basic idea of Shell sort: take an integer d1 < n as the first increment and divide the records of the file into d1 groups, all records whose indices differ by a multiple of d1 falling in the same group; perform direct insertion sort within each group. Then take a second increment d2 < d1 and repeat the grouping and sorting, until the increment di = 1, when all records lie in one group and a final direct insertion sort is performed. The method is essentially a grouped insertion sort.
Commonly d1 = n/2 and d(i+1) = di/2; if the result is even, 1 is added so that every increment di is odd.
Shell sort is unstable. Its running time depends on the increment sequence; the average time complexity is about O(n^1.3).
The code is as follows:
void shell(int a[], int n)  // Shell sort
{
    int i, j, k, t;
    k = ((n / 2) % 2 == 0) ? n / 2 + 1 : n / 2;  // keep the increment odd
    while (k > 0) {
        for (j = k; j < n; j++) {  // direct insertion sort within each group
            t = a[j];
            i = j - k;
            while (i >= 0 && a[i] > t) {
                a[i + k] = a[i];
                i = i - k;
            }
            a[i + k] = t;
        }
        if (k == 1) break;
        k = ((k / 2) % 2 == 0) ? k / 2 + 1 : k / 2;
    }
}
2. Selection sort
The basic idea of selection sort is that at each step the record with the smallest key is selected from the records still to be sorted and appended after the already-sorted sequence, until all records are sorted. The main selection sorts are direct selection sort and heap sort.
① Direct selection sort (unstable)
Direct selection sort works as follows: first select the record with the smallest key among all records and exchange it with the 1st record; then select the record with the smallest key among the remaining records and exchange it with the 2nd record; and so on, until all records are sorted.
Whatever the initial state of the file, selecting the record with the smallest key in the i-th pass takes n - i comparisons, so the total number of comparisons is n(n-1)/2 = O(n^2). When the file is initially in order the number of record moves is 0; when it is initially in reverse order every pass performs an exchange, and the number of moves reaches its maximum 3(n-1). The average time complexity is O(n^2). Direct selection sort is unstable.
The code is as follows:
void dir_choose(int a[], int n)  // direct selection sort
{
    int k, t;
    for (int i = 0; i < n - 1; i++) {
        k = i;  // index of the smallest key seen so far
        for (int j = i + 1; j < n; j++) {
            if (a[j] < a[k]) k = j;
        }
        if (k != i) {
            t = a[i]; a[i] = a[k]; a[k] = t;
        }
    }
}
② Heap sort (unstable)
Heap sort is a tree-based sorting method that effectively improves on direct selection sort. A sequence of n keywords
k1, k2, ..., kn is called a heap if and only if it satisfies (ki <= k2i and ki <= k2i+1) or (ki >= k2i and ki >= k2i+1) for 1 <= i <= n/2. In the first case the key of the root node (heap top) is the smallest key in the heap, and the heap is called a small-root heap; in the second case the key of the root is the largest key in the heap, and it is called a big-root heap.
If the vector R[1..n] is viewed as the storage structure of a complete binary tree, then a heap is essentially a complete binary tree in which the key of every non-leaf node is no greater than (or no less than) the keys of its left and right children (when they exist).
Heap sort involves two key steps: first, how to build the initial heap; second, after the heap top has been exchanged with the last node of the heap, how to re-adjust the remaining nodes so that they again form a heap. The worst-case time complexity of heap sort is O(n log2 n), and its average performance is close to the worst case. Because building the initial heap requires many comparisons, heap sort is not suitable for files with few records. Heap sort is an in-place method with O(1) auxiliary space, and it is unstable.
Code omitted ..
3. Exchange sort
The basic idea of exchange sort is to compare the keys of pairs of records to be sorted and exchange any pair that violates the required order, repeating until no pair does. The main exchange sorts are bubble sort and quick sort.
① Bubble sort (stable)
Bubble sort treats the record array R[1..n] vertically, viewing each record R[i] as a bubble whose weight is ki. Following the principle that a light bubble cannot lie below a heavy one, the array is scanned from the bottom up, and any light bubble found violating the principle is made to "float" upward. This is repeated until, for any two bubbles, the lighter one is above the heavier one.
The process of bubble sort is as follows:
In the first pass, k1 and k2 are compared; if k1 > k2 the two records are exchanged, otherwise not. The same is repeated for k2 and k3, and so on up to k(n-1) and kn. The record with the largest key has then moved to the last position; this completes the 1st bubble pass, which performs n-1 comparisons.
The second pass works the same way, from k1 and k2 up to k(n-2) and k(n-1), performing n-2 comparisons and completing the 2nd bubble pass.
And so on; n-1 bubble passes complete the sort.
If the file is initially in order, one scan completes the sort: n-1 key comparisons and 0 record moves. So the best-case time complexity of bubble sort is O(n).
If the file is initially in reverse order, n-1 passes are needed. The i-th pass performs n - i key comparisons (1 <= i <= n-1), and every comparison moves records three times to carry out the exchange. The number of comparisons then reaches its maximum n(n-1)/2 = O(n^2), and the number of moves reaches its maximum 3n(n-1)/2 = O(n^2). So the worst-case time complexity of bubble sort is O(n^2).
Although bubble sort does not always need all n-1 passes, its average performance is much worse than direct insertion sort because of the large number of record moves. Bubble sort is in-place and stable.
The code is as follows:
void qp(int a[], int n)  // optimized bubble sort: stop early when a pass makes no exchange
{
    int t, flag;
    for (int i = 0; i < n - 1; i++) {
        flag = 0;
        for (int j = 0; j < n - 1 - i; j++) {  // n-1-i keeps a[j+1] in bounds
            if (a[j + 1] < a[j]) {
                t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                flag = 1;
            }
        }
        if (flag == 0) break;  // no exchange: the file is already sorted
    }
}
② Quick sort (unstable)
Quick sort adopts a divide-and-conquer strategy. Its basic idea is to decompose the original problem into several subproblems that are smaller in size but similar in structure to the original, solve those subproblems recursively, and then combine their solutions into a solution of the original problem.
The specific process of quick sort is as follows:
First, one record is taken from the n records to be sorted. Using its key as a pivot, all records are divided into two groups: the keys of the 1st group are less than or equal to the pivot key, the keys of the 2nd group are greater than it, and the pivot record is placed between the two groups.
Second, the same method is applied to the left group and to the right group, until every record is in its final position.
The code is as follows (the body was truncated in the original; it is completed here with the standard pivot-hole partition matching the two steps above):
void quick_sort(int a[], int low, int high)  // low and high are array subscripts
{
    if (low < high) {
        int i = low, j = high, t = a[low];  // t is the pivot record
        while (i < j) {
            while (i < j && a[j] >= t) j--;  // scan from the right for a smaller key
            if (i < j) a[i++] = a[j];
            while (i < j && a[i] <= t) i++;  // scan from the left for a larger key
            if (i < j) a[j--] = a[i];
        }
        a[i] = t;                            // pivot lands between the two groups
        quick_sort(a, low, i - 1);
        quick_sort(a, i + 1, high);
    }
}
4. Merge sort
Merge sort combines two or more ordered sub-tables into one new ordered table. Initially the sequence of n records to be sorted is viewed as n ordered sub-tables of length 1; these are merged pairwise into ordered sub-tables of length 2, which are merged in turn, until a single ordered table of length n is obtained.
Merge sort is stable. It works with sequential storage and is also easy to implement on linked lists. For a file of length n, log2 n two-way merge passes are needed, and each pass takes O(n) time, so the time complexity is O(n log2 n) in both the best and the worst case. Merging requires an auxiliary vector to hold the result of merging two ordered sub-files, so the auxiliary space complexity is O(n); clearly it is not an in-place sort.
Code omitted...
5. Radix sort
Suppose each component kj of a keyword takes values in the range c0 <= kj <= c(rd-1) (0 <= j < d). The number rd of possible values is called the radix. The choice of radix and the decomposition of the keyword depend on the keyword type:
(1) If the keyword is a decimal integer, it is decomposed digit by digit; the radix rd = 10, c0 = 0, c9 = 9, and d is the number of digits of the longest integer.
(2) If the keyword is a lowercase English string, rd = 26, c0 = 'a', c25 = 'z', and d is the length of the longest string.
The basic idea of radix sort is to distribute and collect the records by key component, from the lowest position to the highest. After d rounds of distribution and collection, an ordered sequence is obtained.
Sorting methods fall into four categories by average time:
(1) Square order, O(n^2)
Usually called the simple sorts, such as direct insertion, direct selection, and bubble sort;
(2) Linearithmic order, O(n log n)
Such as quick sort, heap sort, and merge sort;
(3) Order O(n^(1+e))
where e is a constant between 0 and 1, i.e. 0 < e < 1, such as Shell sort;
(4) Linear order, O(n)
Such as radix sort.
Comparison and selection of sorting methods:
Comparison of the various methods:
Among the simple sorts, direct insertion is best; overall, quick sort is the fastest. When the file is initially in order, both direct insertion and bubble sort are best.
Factors Affecting sorting performance
Because different sorting methods adapt to different application environments and requirements, the following factors should be taken into account when selecting an appropriate sorting method:
① Number of records to be sorted N;
② Size of each record;
③ Keyword structure and its initial state;
④ Requirements on stability;
⑤ Conditions for language tools;
⑥ Storage structure;
⑦ Time and auxiliary space complexity.
Selection of sorting methods under different conditions:
(1) If n is small (say n <= 50), direct insertion sort or direct selection sort can be used.
When the records are small, direct insertion sort is better; otherwise direct selection sort should be used, because it moves fewer records than direct insertion.
(2) If the file is initially close to ordered (nearly in order), direct insertion sort, bubble sort, or randomized quick sort should be chosen;
(3) If n is large, a method with time complexity O(n log n) should be used: quick sort, heap sort, or merge sort.
Quick sort is considered the best of the comparison-based internal sorts; when the keys to be sorted are randomly distributed, its average time is the shortest.
Heap sort needs less auxiliary space than quick sort and does not suffer quick sort's possible worst case. Both methods are unstable.
If the sort must be stable, merge sort can be chosen. However, a merge sort that starts by merging single records pairwise is not worth advocating; it is usually combined with direct insertion sort: first use direct insertion sort to obtain longer ordered sub-files, then merge them. Since direct insertion sort is stable, the improved merge sort is still stable.