Summary of Sorting Algorithms

Last Update:2018-12-03 Source: Internet

Author: User

Tags sorts


I. Selection sort

1. Basic idea:

Each pass selects the smallest (or largest) element from the elements still to be sorted and places it at the end of the already sorted sequence, until all elements have been placed.

2. Sorting Process:

[Example]:

Initial keywords: [49 38 65 97 76 13 27 49]

After pass 1: 13 [38 65 97 76 49 27 49]

After pass 2: 13 27 [65 97 76 49 38 49]

After pass 3: 13 27 38 [97 76 49 65 49]

After pass 4: 13 27 38 49 [76 97 65 49]

After pass 5: 13 27 38 49 49 [97 65 76]

After pass 6: 13 27 38 49 49 65 [97 76]

After pass 7: 13 27 38 49 49 65 76 [97]

Final result: 13 27 38 49 49 65 76 97

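/* Type, assertF and swapArrData are assumed to be declared elsewhere. */
void selectionSort(Type* arr, long len)
{
    long i = 0, j = 0;   /* loop indices */
    long maxPos;         /* position of the largest element found so far */
    assertF(arr != NULL, "In selectionSort, arr is NULL\n");
    for (i = len - 1; i >= 1; i--)
    {
        maxPos = i;
        for (j = 0; j < i; j++)
            if (arr[maxPos] < arr[j]) maxPos = j;
        if (maxPos != i) swapArrData(arr, maxPos, i);   /* move the largest element to the end */
    }
}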

In the usual formulation, the outer loop of selection sort runs from the first element to the second-to-last one. Before each inner loop, the outer index is stored in a temporary variable. During the inner loop, whenever an element smaller than the one at the recorded position is found, its index is stored in the temporary variable. After the inner loop exits, if the temporary variable has changed, an element smaller than the one at the current outer position exists, and the two elements are exchanged. The listing above is the mirror image of this description: it searches for the largest element and moves it to the end of the unsorted part.
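
The minimum-selecting variant described in the paragraph above can be sketched as follows. This is a minimal sketch on a plain int array; the names selection_sort_min and swap_int are assumptions made for illustration and do not appear in the original listing.

```c
#include <stddef.h>

/* Swap two ints in place (hypothetical helper). */
static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Sort arr[0..len-1] in ascending order by repeatedly selecting the minimum. */
void selection_sort_min(int *arr, size_t len)
{
    if (arr == NULL || len < 2)
        return;
    for (size_t i = 0; i + 1 < len; i++) {      /* first to second-to-last element */
        size_t min_pos = i;                     /* temporary variable: index of the smallest key so far */
        for (size_t j = i + 1; j < len; j++)
            if (arr[j] < arr[min_pos])
                min_pos = j;
        if (min_pos != i)                       /* a smaller element was found: exchange */
            swap_int(&arr[i], &arr[min_pos]);
    }
}
```

Calling selection_sort_min on the eight keywords of the example sorts them in seven passes, one per position of the outer loop.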

II. Straight insertion sort

The basic idea of insertion sort is to insert each record to be sorted, according to the size of its key, into the appropriate position of the already sorted subfile in front of it, until all records have been inserted.

Straight insertion sort

Straight insertion sort inserts one record into an ordered table that is already sorted, producing a new ordered table with one more record.

Straight insertion sort algorithm

The sentinel (guard record) R[0] serves two purposes. First, it holds a copy of R[i] (the record currently being inserted), acting as a temporary variable. Second, it guards the search loop so that the index j cannot run past the start of the array.
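
Both roles of the sentinel can be seen in a short sketch. It assumes plain int keys stored in r[1..n] with r[0] reserved for the sentinel; the function name insert_sort_sentinel is hypothetical.

```c
/* Straight insertion sort on r[1..n]; r[0] is reserved as the sentinel. */
void insert_sort_sentinel(int r[], int n)
{
    for (int i = 2; i <= n; i++) {
        if (r[i] < r[i - 1]) {
            r[0] = r[i];              /* role 1: keep a copy of the record being inserted */
            int j = i - 1;
            while (r[0] < r[j]) {     /* role 2: the test fails at j == 0, so no bounds check is needed */
                r[j + 1] = r[j];      /* shift the larger record one place back */
                j--;
            }
            r[j + 1] = r[0];          /* insert at the correct position */
        }
    }
}
```

Because the comparison is strict, records with equal keys are never moved past each other.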

The running time of straight insertion sort depends on the initial state of the file. The best case is a file that is already in ascending order, for which the time complexity is O(n). The worst case is a file in reverse order, for which the time complexity is O(n^2); the average time complexity is also O(n^2). The auxiliary space is O(1), so the algorithm sorts in place.

Straight insertion sort is a stable sorting method.

III. Bubble sort

[Algorithm idea]: Picture the array of records R[1..n] standing vertically, and treat each record R[i] as a bubble of weight R[i].key. Following the principle that a light bubble may not lie beneath a heavy one, scan the array R from the bottom up; whenever a light bubble that violates this principle is found, let it "float" upward. Repeat until every pair of adjacent bubbles has the lighter one above and the heavier one below.

[Algorithm]:

// seqlist, Boolean and n are assumed to be declared elsewhere; keys are stored in R[1..n].
void bubblesort(seqlist R)
{   // R[1..n] is the file to be sorted; scan bottom-up to bubble sort R
    int i, j;
    Boolean exchange;                      // exchange flag
    for (i = 1; i < n; i++) {              // at most n-1 passes
        exchange = false;                  // reset the flag before each pass
        for (j = n - 1; j >= i; j--)       // scan the current unordered zone R[i..n] from the bottom up
            if (R[j + 1].key < R[j].key) { // exchange the two records
                R[0] = R[j + 1];           // R[0] is not a sentinel, only temporary storage
                R[j + 1] = R[j];
                R[j] = R[0];
                exchange = true;           // an exchange occurred
            }
        if (!exchange)                     // no exchange in this pass: terminate early
            return;
    }   // endfor (outer loop)
}   // bubblesort

[Analysis]: Bubble sort terminates when a pass performs no exchange. As the process shows, bubble sort steadily lengthens the ordered sequence and shortens the unordered one: each pass reduces the length of the unordered sequence by 1.
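
The early termination can be observed with a small self-contained version that reports how many passes it made. This is only a sketch: it uses a 0-based int array instead of the seqlist type, and the name bubble_sort and its return value are assumptions added for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bubble sort a[0..n-1] in ascending order; returns the number of passes performed. */
size_t bubble_sort(int a[], size_t n)
{
    size_t passes = 0;
    for (size_t i = 0; i + 1 < n; i++) {        /* at most n-1 passes */
        bool exchanged = false;
        passes++;
        for (size_t j = n - 1; j > i; j--) {    /* scan the unordered zone from the bottom up */
            if (a[j] < a[j - 1]) {              /* lighter bubble below a heavier one: let it float */
                int t = a[j];
                a[j] = a[j - 1];
                a[j - 1] = t;
                exchanged = true;
            }
        }
        if (!exchanged)                         /* nothing moved: the file is already ordered */
            break;
    }
    return passes;
}
```

An already ordered array of five keys finishes after a single pass, while the same keys in reverse order need all four passes.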

IV. Shell sort

Basic idea:

Take an integer d1 smaller than n as the first increment and divide all records of the file into d1 groups, placing all records whose distance is a multiple of d1 in the same group. First perform straight insertion sort within each group; then take a second increment d2 < d1 and repeat the grouping and sorting, until the increment dt = 1 (dt < dt-1 < ... < d2 < d1), at which point all records are in a single group and are sorted by straight insertion.

This method is essentially a grouped insertion method.

Shell sort on a given instance

Assume that the file to be sorted has 10 records, and the keywords are:

49,38, 65,97, 13.

The increment sequence is:

5, 3, 1
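
Sorting with an explicit increment sequence such as 5, 3, 1 can be sketched as follows. This is a minimal sketch on a 0-based int array; the names shell_pass and shell_sort_with_gaps and the idea of passing the increments as an array are assumptions for illustration, not the article's seqlist code.

```c
#include <stddef.h>

/* One pass: straight insertion sort within every group of elements that lie gap apart. */
static void shell_pass(int a[], size_t n, size_t gap)
{
    for (size_t i = gap; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j >= gap && key < a[j - gap]) {  /* "j >= gap" replaces the sentinel */
            a[j] = a[j - gap];                  /* move the larger record back within its group */
            j -= gap;
        }
        a[j] = key;
    }
}

/* Shell sort with caller-supplied increments; the last increment must be 1. */
void shell_sort_with_gaps(int a[], size_t n, const size_t gaps[], size_t ngaps)
{
    for (size_t g = 0; g < ngaps; g++)
        shell_pass(a, n, gaps[g]);
}
```

For a file of 10 records, passing the increments {5, 3, 1} makes three passes: five groups of two records, then three groups of three or four records, and finally one group containing the whole file.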

Shell sort implementation

1. Algorithm without a sentinel

// seqlist and n are assumed to be declared elsewhere; keys are stored in R[1..n].
void shellpass(seqlist R, int d)
{   // one pass of Shell sort; d is the current increment
    int i, j;
    for (i = d + 1; i <= n; i++)            // insert each of R[d+1..n] into the ordered zone of its group
        if (R[i].key < R[i - d].key) {
            R[0] = R[i]; j = i - d;         // R[0] is only temporary storage, not a sentinel
            do {                            // search for the insertion position of R[i]
                R[j + d] = R[j];            // move the record back
                j = j - d;                  // look at the previous record in the group
            } while (j > 0 && R[0].key < R[j].key);
            R[j + d] = R[0];                // insert R[i] at the correct position
        }   // endif
}   // shellpass

void shellsort(seqlist R)
{
    int increment = n;                      // initial increment; assume n > 0
    do {
        increment = increment / 3 + 1;      // compute the next increment
        shellpass(R, increment);            // one Shell insertion pass with this increment
    } while (increment > 1);
}   // shellsort

Note:

When the increment d = 1, shellpass is essentially the same as insertsort, except that the loop condition "j > 0" is added because there is no sentinel to keep the index from going out of bounds.

2. Shell sort algorithm with a sentinel

Algorithm analysis

1. Choice of the increment sequence

The running time of Shell sort depends on the increment sequence.

Common features of a good increment sequence:

① The last increment must be 1;

② Increments in the sequence (especially adjacent ones) should not be multiples of one another.

Extensive experiments give the best results known so far: when n is large, the number of comparisons and moves lies roughly between n^1.25 and 1.6n^1.25.

2. Shell sort has better time performance than straight insertion sort

The reasons why Shell sort outperforms straight insertion sort:

① When the initial state of the file is nearly ordered, straight insertion sort needs few comparisons and moves.

② When n is small, the difference between n and n^2 is also small, so the best-case time complexity O(n) and the worst-case time complexity O(n^2) of straight insertion sort do not differ much.

③ At the start of Shell sort the increment is large, so there are many groups with few records each, and straight insertion within each group is fast. Later the increment d_i gradually shrinks, the number of groups decreases and the number of records per group grows, but because the file has already been sorted with distance d_{i-1}, it is closer to an ordered state, so each new pass is also fast.

Therefore Shell sort is considerably more efficient than straight insertion sort.

3. Stability

Shell sort is unstable. See the preceding example, in which the relative order of the two equal keys 49 changes during sorting.

V. Heap sort

1. Definition of a heap

A sequence of n keys k_1, k_2, ..., k_n is called a heap if and only if it satisfies one of the following properties (the heap property):

(1) k_i ≤ k_{2i} and k_i ≤ k_{2i+1}, or (2) k_i ≥ k_{2i} and k_i ≥ k_{2i+1} (1 ≤ i ≤ ⌊n/2⌋)

If the vector R[1..n] storing this sequence is viewed as the storage structure of a complete binary tree, a heap is essentially a complete binary tree in which the key of every non-leaf node is not greater than (or not less than) the keys of its left and right children (if any).

[Example] The keyword sequences (,) and (,) satisfy heap properties (1) and (2) respectively, so both are heaps; the corresponding complete binary trees are a small root heap and a large root heap.
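
Heap properties (1) and (2) translate directly into a check over the non-leaf positions. The sketch below assumes int keys stored in k[1..n] with k[0] unused; the function names is_min_heap and is_max_heap are hypothetical.

```c
#include <stdbool.h>

/* True if k[1..n] satisfies property (1): every parent is <= its children. */
bool is_min_heap(const int k[], int n)
{
    for (int i = 1; i <= n / 2; i++) {              /* only non-leaf nodes have children */
        if (k[i] > k[2 * i])
            return false;
        if (2 * i + 1 <= n && k[i] > k[2 * i + 1])  /* the right child may not exist */
            return false;
    }
    return true;
}

/* True if k[1..n] satisfies property (2): every parent is >= its children. */
bool is_max_heap(const int k[], int n)
{
    for (int i = 1; i <= n / 2; i++) {
        if (k[i] < k[2 * i])
            return false;
        if (2 * i + 1 <= n && k[i] < k[2 * i + 1])
            return false;
    }
    return true;
}
```

A sequence that passes is_min_heap is a small root heap, and one that passes is_max_heap is a large root heap.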

2. Large root heaps and small root heaps

A heap in which the key of the root node (also called the heap top) is the smallest of all node keys is called a small root heap.

A heap in which the key of the root node (also called the heap top) is the largest of all node keys is called a large root heap.

Note:

① Any subtree of a heap is also a heap.

② The heap discussed above is actually a binary heap; k-ary heaps can be defined similarly.

3. Features of heap sort

Heap sort is a tree-based selection sorting method.

The characteristic of heap sort is that, during sorting, R[1..n] is regarded as the sequential storage structure of a complete binary tree, and the inherent relationship between parent and child nodes in the complete binary tree [see the sequential storage structure of binary trees] is used to select the record with the largest (or smallest) key in the current unordered zone.

4. Differences between heap sort and straight selection sort

In straight selection sort, selecting the record with the smallest key from R[1..n] takes n-1 comparisons, and then selecting the record with the smallest key from R[2..n] takes n-2 comparisons. In fact, many of those n-2 comparisons may already have been made among the previous n-1, but because their results were not retained in the previous pass, they are repeated in the next pass.

Heap sort saves part of the comparison results in the tree structure, which reduces the number of comparisons.

5. Heap sort

Heap sort exploits the fact that the top record of a large root heap (or small root heap) has the largest (or smallest) key, which makes selecting the record with the largest (or smallest) key in the current unordered zone simple.

(1) The basic idea of sorting with a large root heap

① First build the initial file R[1..n] into a large root heap; this heap is the initial unordered zone.

② Then exchange the record R[1] with the largest key (the heap top) with the last record R[n] of the unordered zone. This yields a new unordered zone R[1..n-1] and an ordered zone R[n], which satisfy R[1..n-1].keys ≤ R[n].key.

③ Because the new root R[1] may violate the heap property after the exchange, the current unordered zone R[1..n-1] must be readjusted into a heap. Then the record R[1] with the largest key in R[1..n-1] is exchanged with the last record R[n-1] of that zone, yielding a new unordered zone R[1..n-2] and an ordered zone R[n-1..n], which still satisfy R[1..n-2].keys ≤ R[n-1..n].keys; R[1..n-2] must again be adjusted into a heap.

......

This continues until only one element remains in the unordered zone.

(2) Basic operations of the large root heap sorting algorithm:

① Initialization: build R[1..n] into the initial heap;

② Basic operation of each pass: exchange the heap top record R[1] of the current unordered zone with the last record of that zone, then adjust the new unordered zone into a heap (also called rebuilding the heap).

Note:

① Only n-1 passes are needed: selecting the n-1 largest keys is enough to put the file in ascending order.

② Sorting with a small root heap is similar to sorting with a large root heap, but the result is in descending order. In contrast to straight selection sort, the unordered zone in heap sort always lies before the ordered zone, and the ordered zone grows from the tail of the original vector forward until it covers the whole vector.

(3) Heap sort algorithm:

// seqlist, n, buildheap and heapify are assumed to be declared elsewhere.
void heapsort(seqlist R)
{   // heap sort R[1..n], using R[0] as temporary storage
    int i;
    buildheap(R);                  // build R[1..n] into the initial heap
    for (i = n; i > 1; i--) {      // heap sort the current unordered zone R[1..i]; n-1 passes in total
        R[0] = R[1];
        R[1] = R[i];
        R[i] = R[0];               // exchange the heap top with the last record in the heap
        heapify(R, 1, i - 1);      // readjust R[1..i-1] into a heap; only R[1] may violate the heap property
    }   // endfor
}   // heapsort

(4) Implementation of the buildheap and heapify functions

Because building the initial heap relies on the heap adjustment operation, the implementation of heapify is discussed first.

① The idea and method of heapify

R[1..i] is a heap with R[1] as the root. After R[1] is exchanged with R[i], only the value of R[1] has changed in the new unordered zone R[1..i-1], so apart from R[1], which may violate the heap property, every subtree rooted at any other node is still a heap. Therefore, when the range to be adjusted is R[low..high], only the tree rooted at R[low] needs adjusting.

"Adjusting the heap"

The left and right subtrees (if any) of R[low] are already heaps, so their roots R[2low] and R[2low+1] hold the largest keys in their respective subtrees. If R[low].key is not less than the keys of these two children, R[low] does not violate the heap property; the tree rooted at R[low] is already a heap and needs no adjustment. Otherwise, R[low] must be exchanged with the child that has the larger key, that is, with R[large], where R[large].key = max(R[2low].key, R[2low+1].key). After the exchange, the node R[large] may in turn violate the heap property. Because its two subtrees (if any) are still heaps, the same adjustment can be repeated on the tree rooted at R[large]. The process ends when the node being adjusted satisfies the heap property or is already a leaf. The process resembles sifting through a sieve: smaller keys sink down layer by layer while larger keys are selected upward layer by layer. For this reason the method is also called the "sifting method".
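
The adjustment just described can be sketched in C as follows. This is a sketch under stated assumptions rather than the tutorial's own listing: keys are plain ints stored in r[1..high], and instead of exchanging at every level the root key is held in a local variable and written once at its final position, which gives the same result.

```c
/* Sift r[low] down within r[low..high] (1-based) so that the tree rooted at
   low becomes a large root heap. Both subtrees of low must already be heaps. */
void heapify(int r[], int low, int high)
{
    int tmp = r[low];                 /* key that may violate the heap property */
    int large = 2 * low;              /* left child of low */
    while (large <= high) {
        if (large < high && r[large] < r[large + 1])
            large++;                  /* pick the child with the larger key */
        if (tmp >= r[large])
            break;                    /* heap property holds: stop sifting */
        r[low] = r[large];            /* the larger child moves up one level */
        low = large;
        large = 2 * low;
    }
    r[low] = tmp;                     /* final position of the sifted key */
}
```

The loop runs at most once per tree level, so one adjustment costs O(lg n) comparisons.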

② Implementation of buildheap

To turn the initial file R[1..n] into a large root heap, every subtree of the corresponding complete binary tree, rooted at each node, must be made into a heap.

Clearly a tree with only one node is a heap, and in a complete binary tree all nodes numbered greater than ⌊n/2⌋ are leaves, so the subtrees rooted at these nodes are already heaps. It is therefore enough to turn the subtrees rooted at nodes ⌊n/2⌋, ⌊n/2⌋-1, ..., 1 into heaps, in that order.

For the detailed algorithm, see the tutorial.
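
Building the heap from node ⌊n/2⌋ down to node 1 and then repeatedly exchanging the heap top with the last record can be sketched as follows. This is a minimal sketch assuming int keys in r[1..n] with r[0] unused; build_heap and heap_sort are hypothetical names, and heapify here is one possible sift-down, not necessarily the tutorial's version.

```c
/* Sift r[low] down within r[low..high]; both subtrees of low must be heaps. */
static void heapify(int r[], int low, int high)
{
    int tmp = r[low];
    for (int large = 2 * low; large <= high; large = 2 * low) {
        if (large < high && r[large] < r[large + 1])
            large++;                  /* child with the larger key */
        if (tmp >= r[large])
            break;
        r[low] = r[large];
        low = large;
    }
    r[low] = tmp;
}

/* Turn r[1..n] into a large root heap, starting from the last non-leaf node. */
void build_heap(int r[], int n)
{
    for (int i = n / 2; i >= 1; i--)
        heapify(r, i, n);
}

/* Sort r[1..n] in ascending order. */
void heap_sort(int r[], int n)
{
    build_heap(r, n);
    for (int i = n; i > 1; i--) {     /* n-1 passes */
        int t = r[1];                 /* exchange heap top with the last unordered record */
        r[1] = r[i];
        r[i] = t;
        heapify(r, 1, i - 1);         /* only r[1] may violate the heap property */
    }
}
```

After build_heap the largest key sits in r[1]; each pass of heap_sort then moves the current maximum to the front of the ordered zone at the tail of the array.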

5. Example of sorting with a large root heap

For the keyword sequence (,), see how the complete binary tree and its storage structure change during heap sort.

6. Algorithm analysis

The running time of heap sort consists mainly of the cost of building the initial heap and of repeatedly rebuilding the heap, both of which are implemented by calling heapify.

The worst-case time complexity of heap sort is O(n lg n). The average performance of heap sort is close to its worst-case performance.

Because building the initial heap requires many comparisons, heap sort is not suitable for files with few records.

Heap sort sorts in place, with O(1) auxiliary space.

It is an unstable sorting method.

VI. Quick sort

The basic idea of quick sort is: first choose a middle value middle (the program uses the element at the middle position), then move the values smaller than it to its left and the values greater than it to its right. Because this sorting algorithm is fairly complex, the program framework is given first (it can be found in any data structures textbook):

void quicksort(int *pdata, int left, int right)
{
    int i, j;
    int middle, itemp;
    i = left;
    j = right;
    middle = pdata[(left + right) / 2];              // take the element at the middle position
    do {
        while ((pdata[i] < middle) && (i < right))   // scan from the left for a value not less than middle
            i++;
        while ((pdata[j] > middle) && (j > left))    // scan from the right for a value not greater than middle
            j--;
        if (i <= j) {                                // a pair was found: exchange it
            itemp = pdata[i];
            pdata[i] = pdata[j];
            pdata[j] = itemp;
            i++;
            j--;
        }
    } while (i <= j);   // stop once the two scans have crossed (one partition completed)

    // if the left part is not empty (left < j), sort it recursively
    if (left < j)
        quicksort(pdata, left, j);
    // if the right part is not empty (right > i), sort it recursively
    if (right > i)
        quicksort(pdata, i, right);
}

For n elements, quick sort performs about n*log n comparisons and about n*log n/6 exchanges. If n is 100, bubble sort needs 4950 comparisons (100*99/2), while quick sort needs only about 200 by this estimate (taking the logarithm to base 10; with base 2 the figure is about 664). Quick sort is indeed efficient. Its performance depends heavily on the choice of the middle value: if the chosen value is always the maximum (or minimum), the algorithm slows down considerably. In the worst case the time complexity of quick sort is O(n^2), while its average time complexity is O(n*log n).

VII. Merge sort

Description

The sorting methods introduced so far all sort within a single array. Suppose there are two or more sets of data, possibly in different arrays or in different files. How can they be sorted?

Solution

Merge sort can be used. Merge sort basically merges two already sorted sets of data into one sorted result. If the data read in has not yet been sorted, the two sets can first be sorted with another sorting method and then merged.

One might ask: if the two sets are unsorted to begin with, why not read in all the data and sort it in one go? The spirit of sorting is to exploit whatever part of the data is already sorted in order to speed things up. Small sets of data sort quickly, and once they are sorted, merging them is more efficient than reading in all the data and sorting it in one go, because both sets are already in order.

So can merge sort by itself handle the entire sorting task, without any other sorting method? The answer is yes: keep dividing the numbers into two equal halves until only one number remains in each part, and then merge the parts back together step by step.

However, the splitting itself costs extra time, so it is more efficient to sort the small sets of data with another good sorting method and then merge them.
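
Using merge sort alone, by halving the data until single numbers remain and then merging back, can be sketched as follows. This is a minimal sketch on one int array with a scratch buffer; the names merge_runs, merge_sort_range and merge_sort, and the error-code return value, are assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[lo..mid) and a[mid..hi), using tmp as scratch space. */
static void merge_runs(int a[], int tmp[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];   /* "<=" keeps equal keys in order */
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < hi)
        tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof a[0]);
}

/* Split a[lo..hi) in half until single elements remain, then merge back. */
static void merge_sort_range(int a[], int tmp[], size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return;                       /* zero or one element is already sorted */
    size_t mid = lo + (hi - lo) / 2;
    merge_sort_range(a, tmp, lo, mid);
    merge_sort_range(a, tmp, mid, hi);
    merge_runs(a, tmp, lo, mid, hi);
}

/* Sort a[0..n-1]; returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int merge_sort(int a[], size_t n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    merge_sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

The recursion depth is about lg n, and every level merges all n elements once, which is where the extra splitting cost mentioned above comes from.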

In the following example, quick sort is used to sort the small sets of data, and merge sort then handles the merging.

Example

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX1 10
#define MAX2 10
#define SWAP(x,y) {int t; t = x; x = y; y = t;}

int partition(int[], int, int);
void quicksort(int[], int, int);
void mergesort(int[], int, int[], int, int[]);

int main(void) {
    int number1[MAX1] = {0};
    int number2[MAX2] = {0};
    int number3[MAX1+MAX2] = {0};
    int i;

    srand((unsigned) time(NULL));

    printf("Before sorting:");
    printf("\nnumber1[]:");
    for(i = 0; i < MAX1; i++) {
        number1[i] = rand() % 100;
        printf("%d ", number1[i]);
    }

    printf("\nnumber2[]:");
    for(i = 0; i < MAX2; i++) {
        number2[i] = rand() % 100;
        printf("%d ", number2[i]);
    }

    /* first sort the two sets of data separately */
    quicksort(number1, 0, MAX1-1);
    quicksort(number2, 0, MAX2-1);

    printf("\nAfter sorting:");
    printf("\nnumber1[]:");
    for(i = 0; i < MAX1; i++)
        printf("%d ", number1[i]);
    printf("\nnumber2[]:");
    for(i = 0; i < MAX2; i++)
        printf("%d ", number2[i]);

    /* merge the two sorted sets */
    mergesort(number1, MAX1, number2, MAX2, number3);

    printf("\nAfter merging:");
    for(i = 0; i < MAX1+MAX2; i++)
        printf("%d ", number3[i]);

    printf("\n");

    return 0;
}

/* partition number[left..right] around the last element; return its final index */
int partition(int number[], int left, int right) {
    int i, j, s;

    s = number[right];
    i = left - 1;

    for(j = left; j < right; j++) {
        if(number[j] <= s) {
            i++;
            SWAP(number[i], number[j]);
        }
    }

    SWAP(number[i+1], number[right]);
    return i+1;
}

void quicksort(int number[], int left, int right) {
    int q;

    if(left < right) {
        q = partition(number, left, right);
        quicksort(number, left, q-1);
        quicksort(number, q+1, right);
    }
}

/* merge sorted number1[0..m-1] and number2[0..n-1] into number3 */
void mergesort(int number1[], int m, int number2[], int n, int number3[]) {
    int i = 0, j = 0, k = 0;

    while(i < m && j < n) {
        if(number1[i] <= number2[j])
            number3[k++] = number1[i++];
        else
            number3[k++] = number2[j++];
    }

    while(i < m)
        number3[k++] = number1[i++];
    while(j < n)
        number3[k++] = number2[j++];
}

Java

public class MergeSort {
    public static int[] sort(int[] number1, int[] number2) {
        int[] number3 = new int[number1.length + number2.length];
        int i = 0, j = 0, k = 0;

        while(i < number1.length && j < number2.length) {
            if(number1[i] <= number2[j])
                number3[k++] = number1[i++];
            else
                number3[k++] = number2[j++];
        }

        while(i < number1.length)
            number3[k++] = number1[i++];
        while(j < number2.length)
            number3[k++] = number2[j++];

        return number3;
    }
}

VIII. Radix sort

Radix sort works by "distribution" and "collection" according to the values of the individual items that make up the key. For example, playing cards can be sorted on a key composed of two items, suit and face value.

Suit: clubs < diamonds < hearts < spades

Face value: 2 < 3 < 4 < ... < 10 < J < Q < K < A

To sort a deck of playing cards into the following order:

Clubs 2, ..., clubs A, diamonds 2, ..., diamonds A, hearts 2, ..., hearts A, spades 2, ..., spades A.

There are two sorting methods:

1. First divide the cards into four piles by suit, then sort each pile in ascending order of face value, and finally stack the four piles in ascending order of suit. This is the "most significant digit first" (MSD) method.

2. First distribute the cards into 13 piles by face value and collect the piles in ascending order; then distribute them into four piles by suit and collect those piles in ascending order. This is the "least significant digit first" (LSD) method.
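
The LSD method for the card example can be sketched as two stable distribute-and-collect steps. This is only an illustration: the Card structure, the numeric encoding of suits and face values, and the function names distribute_collect and lsd_sort_cards are assumptions, and counting is used in place of physical piles.

```c
#include <stddef.h>
#include <string.h>

#define DECK 52

typedef struct {
    int suit;   /* 0 = clubs, 1 = diamonds, 2 = hearts, 3 = spades */
    int rank;   /* 0 = "2", 1 = "3", ..., 12 = "A" */
} Card;

/* Stable distribution of cards[0..n-1] into `piles` piles by suit or by rank,
   followed by collection of the piles in ascending order. Requires n <= DECK. */
static void distribute_collect(Card cards[], size_t n, int piles, int by_suit)
{
    Card out[DECK];
    size_t start[14] = {0};           /* start[p]: first output slot of pile p */

    if (n > DECK)
        return;
    for (size_t i = 0; i < n; i++) {  /* count the size of every pile */
        int key = by_suit ? cards[i].suit : cards[i].rank;
        start[key + 1]++;
    }
    for (int p = 0; p < piles; p++)   /* prefix sums give each pile's start */
        start[p + 1] += start[p];
    for (size_t i = 0; i < n; i++) {  /* distribute in input order: stable */
        int key = by_suit ? cards[i].suit : cards[i].rank;
        out[start[key]++] = cards[i];
    }
    memcpy(cards, out, n * sizeof cards[0]);
}

/* LSD: face value first (13 piles), then suit (4 piles). */
void lsd_sort_cards(Card cards[], size_t n)
{
    distribute_collect(cards, n, 13, 0);
    distribute_collect(cards, n, 4, 1);
}
```

The second step must be stable: within each suit pile the cards keep the face-value order produced by the first step, which is why sorting the less significant item first gives the right final order.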

[Example] Let the sequence of record keys be {,} and sort it by radix sort (LSD). Here f[i] and e[i] are the head and tail pointers of the queue numbered i.

#define D 3   /* number of decimal digits in a key */

typedef struct {
    int key;
    float data;
    int link;
} JD;

int jspx(JD r[], int n)
{   /* radix sort on a chained (linked) storage representation; records are in r[1..n] */
    int i, j, k, t, p, rd, rg, f[10], e[10];
    /* p is a subscript of r[]; rd and rg are scale factors;
       f[j] and e[j] are the head and tail pointers of the queue for digit j */
    for (i = 1; i < n; i++)
        r[i].link = i + 1;
    r[n].link = 0;
    p = 1; rd = 1; rg = 10;
    for (i = 1; i <= D; i++) {
        for (j = 0; j < 10; j++) {   /* initialize each queue */
            f[j] = 0;
            e[j] = 0;
        }
        do {   /* distribute by digit into the queues */
            k = (r[p].key % rg) / rd;    /* take one digit of the key */
            if (f[k] == 0)
                f[k] = p;
            else
                r[e[k]].link = p;        /* digit already seen: append to that queue */
            e[k] = p;
            p = r[p].link;               /* move to the next node */
        } while (p > 0);
        j = 0;   /* collect by digit: relink the queues in ascending order */
        while (f[j] == 0)
            j = j + 1;
        p = f[j];
        t = e[j];
        for (k = j + 1; k < 10; k++)
            if (f[k] > 0) {              /* adjust the links */
                r[t].link = f[k];
                t = e[k];
            }
        r[t].link = 0;                   /* the end of the chain is 0 */
        rg = rg * 10;
        rd = rd * 10;                    /* move up one digit */
    }
    return (p);   /* return the subscript of the first node of the ordered linked list */
}

IX. Enumeration sort

Compare each record with all the other records and count how many records are smaller than it; this count determines the record's position in the sorted result.
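
The counting idea can be sketched in a few lines. This is a minimal sketch assuming int keys and a separate output array; the name enumeration_sort and the tie-breaking rule for equal keys (earlier index first) are assumptions added so that duplicate keys receive distinct positions.

```c
#include <stddef.h>

/* Enumeration sort: out[] receives in[0..n-1] in ascending order.
   in and out must not overlap. */
void enumeration_sort(const int in[], int out[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t rank = 0;              /* number of records that belong before in[i] */
        for (size_t j = 0; j < n; j++) {
            /* count smaller keys; equal keys are ordered by original index */
            if (in[j] < in[i] || (in[j] == in[i] && j < i))
                rank++;
        }
        out[rank] = in[i];            /* the count is the final position */
    }
}
```

Every record is compared with every other record, so the method always takes O(n^2) comparisons, but the positions are independent of one another and could be computed in any order.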

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
