Introduction and implementation of some sorting algorithms


I recently had to test some algorithms and ended up reviewing my sorting knowledge along the way, so this post summarizes that review; for more detail, see Introduction to Algorithms. Sorting is one of the most frequently mentioned problems in algorithms and data structures. In an interview I was once asked for the exact principle and implementation of quick sort and suddenly went blank: the standard library of most languages already provides a fast sort, so I had never paid attention to how the algorithm actually works. After finishing an algorithms course, though, I feel this material really matters. I picked up a lot of mathematics while reading Introduction to Algorithms, and I was also intimidated by much of it: the book's focus is not on practical implementation. It gives only pseudocode for the classic algorithms and then spends a great deal of space proving correctness, analyzing complexity, and, frankly, rambling. Below I walk through the algorithms one by one.

1. Insertion sort

The principle of insertion sort is that the already-visited prefix of the array is kept sorted. But remember that knowing the principle is not the same as recognizing the code: in a software qualification exam years ago I was shown a slightly disguised pseudocode version of insertion sort and did not even recognize it... so let's go straight to the code.

bool insertSort(unsigned int* array, int length)
{
    int i, j;
    unsigned int key;
    for (i = 1; i < length; i++)
    {
        // Take out the i-th element; every larger element in the sorted
        // prefix shifts one slot back, then the element drops into j + 1.
        key = array[i];
        j = i - 1;
        while (j >= 0 && key < array[j])
        {
            array[j + 1] = array[j];
            j--;
        }
        array[j + 1] = key;
    }
    return true;
}

Insertion sort is rarely used on its own in practice, but it still shows up: when the input is small, insertion sort is the fastest of the bunch, so some algorithms switch to it below a certain size. I have bad memories of insertion sort because I failed to recognize it in that exam; I was young then. Its time complexity is O(n^2), and insertion sort is a stable sorting algorithm.
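As a quick sanity check, here is a minimal, hypothetical driver for the insertSort function above (the sample values are made up for illustration):

#include <iostream>

bool insertSort(unsigned int* array, int length);   // defined above

int main()
{
    unsigned int data[] = {5, 2, 9, 1, 7};
    insertSort(data, 5);
    for (unsigned int v : data)
        std::cout << v << ' ';                      // prints: 1 2 5 7 9
    std::cout << std::endl;
    return 0;
}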

2. Merge sort

External sorting is said to work on a principle similar to merge sort, though I have not implemented it; all of the algorithms here are in-memory sorts. Merge sort needs extra memory, but its time complexity is O(n lg n). Even so, we will see in the benchmarks that in practice it is not as fast as quick sort; the reason is analyzed later. Merge sort is a stable sorting algorithm with a simple divide-and-conquer idea, which was introduced with the grouping algorithm in the previous article. The main work happens in merge; here is the code.

#include <cstring>   // memcpy

bool mergeSort(unsigned int* array, unsigned int* arrayAssit, int begin, int end);
bool merge(unsigned int* array, unsigned int* arrayAssit, int begin, int mid, int end);

// Allocates the auxiliary buffer once, then hands off to the recursion.
// merge() indexes the buffer from begin to end, so allocate end + 1 slots.
bool PreMergeSort(unsigned int* array, int begin, int end)
{
    unsigned int* arrayAssit = new unsigned int[end + 1];
    mergeSort(array, arrayAssit, begin, end);
    delete[] arrayAssit;
    return true;
}

bool mergeSort(unsigned int* array, unsigned int* arrayAssit, int begin, int end)
{
    if (end == begin)
        return true;
    int mid = (begin + end) / 2;
    mergeSort(array, arrayAssit, begin, mid);
    mergeSort(array, arrayAssit, mid + 1, end);
    merge(array, arrayAssit, begin, mid, end);
    return true;
}

// Merges the two sorted halves array[begin..mid] and array[mid+1..end].
bool merge(unsigned int* array, unsigned int* arrayAssit, int begin, int mid, int end)
{
    int i = begin, j = mid + 1, k = begin;
    while (i <= mid && j <= end)
    {
        if (array[i] <= array[j])
            arrayAssit[k++] = array[i++];
        else
            arrayAssit[k++] = array[j++];
    }
    while (i <= mid)
        arrayAssit[k++] = array[i++];
    while (j <= end)
        arrayAssit[k++] = array[j++];
    memcpy(array + begin, arrayAssit + begin, (end - begin + 1) * sizeof(array[0]));
    return true;
}

The divide-and-conquer idea is very important in algorithm design. The quick sort that follows is also built on it...

3. Quick sort

Quick sort is a common interview question, so it is worth understanding its principle well enough to write it from scratch rather than only being able to produce a bubble sort or an insertion sort. Quick sort needs no additional space, which is why it is also called an in-place sort; unlike merge sort, it never copies the array. The main idea: pick a pivot element, partition the array into a left part and a right part around it, then recurse on each part. Quick sort is generally an unstable sorting algorithm. Below are three variants; pick whichever is easiest to remember ~ the median-of-three version performs best of the three but is a little more involved.

(1) Taking the last element as the pivot each time, as in Introduction to Algorithms

This partition scheme keeps two indices, i and j. j scans the array, while i marks the boundary such that every element at or before i is smaller than the pivot.

The steps: i starts just before the first position and j walks forward from it. Whenever j finds an element smaller than the pivot, i advances and the elements at i and j are swapped, so the prefix up to i stays smaller than the pivot. When the scan ends, the pivot is swapped with the element just after i, which places the pivot in its final position between the two parts.

 

bool swap(unsigned int& i, unsigned int& j)
{
    unsigned int tmp = i;
    i = j;
    j = tmp;
    return true;
}

// Partition with the last element as the pivot; returns the pivot's final index.
unsigned int partitionLast(unsigned int* array, int begin, int end)
{
    unsigned int divide = array[end];
    int i = begin - 1;
    int j;
    for (j = begin; j < end; j++)
    {
        if (array[j] < divide)
        {
            i++;
            swap(array[i], array[j]);
        }
    }
    swap(array[i + 1], array[end]);
    return i + 1;
}

bool quickSortLast(unsigned int* array, int begin, int end)
{
    unsigned int divide;
    if (begin < end)
    {
        divide = partitionLast(array, begin, end);
        quickSortLast(array, begin, divide - 1);
        quickSortLast(array, divide + 1, end);
    }
    return true;
}

(2) Taking the first element as the pivot each time; this scheme is usually attributed to Hoare.

This method differs from the first one: instead of swapping, it fills holes. The principle is as follows:

First take the first element as the pivot; its slot is now free to be overwritten. Scan from the back with j until an element smaller than the pivot is found and write it into the free slot; j's old position becomes the new hole. Then scan from the front with i until an element greater than the pivot is found and write it into the slot at j; now i's position is the hole. Repeat until i == j: that position is the last remaining hole, the pivot is written into it, the array is split there, and the two halves are sorted recursively.

bool quickSortFirst(unsigned int* array, int begin, int end)
{
    int i = begin, j = end;
    unsigned int divide = array[begin];   // the pivot; its slot is the initial hole
    while (i < j)
    {
        while (i < j && array[j] >= divide)
            j--;
        array[i] = array[j];              // fill the hole at i; the hole moves to j
        while (i < j && array[i] <= divide)
            i++;
        array[j] = array[i];              // fill the hole at j; the hole moves to i
    }
    array[i] = divide;                    // i == j: drop the pivot into the last hole
    if (i - 1 > begin)
        quickSortFirst(array, begin, i - 1);
    if (i + 1 < end)
        quickSortFirst(array, i + 1, end);
    return true;
}

(3) Median-of-three quick sort

This method picks the pivot more carefully: it compares the first, middle, and last elements, takes the one of middle value as the pivot, and parks it at the next-to-last position. After those comparisons the first and last elements are already on the correct sides of the pivot, which improves efficiency and helps avoid the worst case. This variant is the most efficient of the three.

// Sorts array[begin], array[mid], array[end] among themselves, parks the
// median at end - 1, and returns it as the pivot.
unsigned int selectThreeDivide(unsigned int* array, int begin, int end)
{
    int mid = (end + begin) / 2;
    if (array[begin] > array[mid])
        swap(array[begin], array[mid]);
    if (array[begin] > array[end])
        swap(array[begin], array[end]);
    if (array[mid] > array[end])
        swap(array[mid], array[end]);
    swap(array[mid], array[end - 1]);
    return array[end - 1];
}

bool quickSortThree(unsigned int* array, int begin, int end)
{
    int i = begin, j = end - 1;
    unsigned int divide = selectThreeDivide(array, begin, end);
    while (i < j)
    {
        while (i < j && array[i] <= divide)
            i++;
        array[j] = array[i];
        while (i < j && array[j] >= divide)
            j--;
        array[i] = array[j];
    }
    array[i] = divide;
    if (i - 1 > begin)
        quickSortThree(array, begin, i - 1);
    if (i + 1 < end)
        quickSortThree(array, i + 1, end);
    return true;
}

 

Quick Sort time complexity analysis:

A full time complexity analysis of quick sort is fairly involved. The key point: with a fixed pivot choice, an already-sorted input produces the worst case, where every partition puts zero elements on one side, so the time complexity degrades to O(n^2). In practice, however, one rarely quick sorts an already-sorted array.
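For intuition, the worst case can be written as a recurrence (a standard argument, sketched here in LaTeX, not specific to this post):

T(n) = T(n-1) + \Theta(n)
\;\Longrightarrow\;
T(n) = \sum_{k=1}^{n} \Theta(k) = \Theta(n^2)

Each call peels off only the pivot, so the \Theta(n) partition cost is paid n times.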

 

One remedy is randomized quick sort: the pivot is chosen at random each time. The worst case can still occur, but it is no longer tied to any fixed input; bad splits are interleaved with good ones, and the expected running time stays within O(n lg n).

Analyzing randomized quick sort uses randomized analysis: introduce indicator random variables and compute the expectation of the running time. The mathematical proof is fairly involved and is omitted here; the takeaway is that the expected time complexity of randomized quick sort is O(n lg n).
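A minimal sketch of the randomization step, reusing the swap, partitionLast, and quickSortLast helpers above (rand() is used for brevity; a real implementation would prefer <random>):

#include <cstdlib>   // rand()

// Swap a randomly chosen element into the last slot, then partition as before.
unsigned int partitionRandom(unsigned int* array, int begin, int end)
{
    int pivotIndex = begin + rand() % (end - begin + 1);
    swap(array[pivotIndex], array[end]);
    return partitionLast(array, begin, end);
}

bool quickSortRandom(unsigned int* array, int begin, int end)
{
    if (begin < end)
    {
        int divide = (int)partitionRandom(array, begin, end);
        quickSortRandom(array, begin, divide - 1);
        quickSortRandom(array, divide + 1, end);
    }
    return true;
}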

One reason quick sort is so efficient in practice is its high cache hit rate: it works in place and does not need frequent cache replacement. The same factor shows up again in the radix sort efficiency tests later ~

The lower bound for comparison sorts:

The lower bound for comparison sorts is proved with a decision tree: the height of the decision tree is Omega(n lg n), which gives the lower bound.

The proof: every comparison sort corresponds to a decision tree in which each leaf is one possible sorted order, and n elements have n! permutations, so the tree needs at least n! leaves. By the properties of binary trees, a tree of height h has at most 2^h leaves, so 2^h >= n!, that is, h >= lg(n!). Rewriting n! with the bound n! >= (n/e)^n gives h >= n lg n - n lg e, so the lower bound is n lg n.
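Written out as one chain (a standard derivation, in LaTeX):

2^h \ge n! \;\Rightarrow\; h \ge \lg(n!) \ge \lg\!\left(\frac{n}{e}\right)^{\!n} = n\lg n - n\lg e = \Omega(n \lg n)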

Because the comparison lower bound is n lg n, a comparison sort such as quick sort cannot do asymptotically better than that. So the following sections describe some non-comparison sorts, some of which reach linear time.

1. Counting sort

The following code is the counting-sort component used inside the radix sort below.

#include <cstring>   // memcpy

// The struct definition is not shown in the original post; this is the shape
// implied by the fields used: src holds the full key, forCountSort holds the
// digit currently being sorted on.
struct radixSortNum
{
    unsigned int src;
    unsigned int forCountSort;
};

// Stable counting sort on the forCountSort field; k is the number of key values.
bool countSort(radixSortNum* array, radixSortNum* arrayAssit, int* arrayCount, int length, int k)
{
    int i;
    for (i = 0; i < k; i++)
        arrayCount[i] = 0;
    for (i = 0; i < length; i++)
        arrayCount[array[i].forCountSort]++;          // histogram of key values
    for (i = 0; i < k - 1; i++)
        arrayCount[i + 1] += arrayCount[i];           // prefix sums give end positions
    for (i = length - 1; i >= 0; i--)                 // backwards pass keeps the sort stable
    {
        arrayAssit[arrayCount[array[i].forCountSort] - 1].forCountSort = array[i].forCountSort;
        arrayAssit[arrayCount[array[i].forCountSort] - 1].src = array[i].src;
        arrayCount[array[i].forCountSort]--;
    }
    memcpy(array, arrayAssit, length * sizeof(array[0]));
    return true;
}

Counting sort is a stable sorting method, which matters here: the per-digit sub-sort that radix sort relies on must be stable.

Counting sort preprocesses the input into a count array C: after the prefix-sum pass, C[v] records the output position for elements with key v, so each element of the input B can be written directly to its final slot. (The original post illustrated B and C with a figure here.) This sort suits inputs whose key range k is small.
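For reference, here is a minimal standalone sketch of counting sort on plain unsigned ints (a simplified, hypothetical variant, not from the original post; k is the exclusive upper bound of the keys):

#include <cstring>

bool countSortPlain(unsigned int* array, int length, int k)
{
    unsigned int* out = new unsigned int[length];
    int* count = new int[k]();                  // value-initialized to zero
    for (int i = 0; i < length; i++)
        count[array[i]]++;                      // histogram
    for (int v = 0; v + 1 < k; v++)
        count[v + 1] += count[v];               // prefix sums
    for (int i = length - 1; i >= 0; i--)       // backwards keeps equal keys stable
        out[--count[array[i]]] = array[i];
    memcpy(array, out, length * sizeof(array[0]));
    delete[] out;
    delete[] count;
    return true;
}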

2. Radix sort

The stable sub-sort inside this radix sort is the counting sort above. The main implementation question is how many bits r to sort on per pass for the best efficiency. From the time complexity formula given in the book, the theoretically optimal r is lg(n). Experimenting with different values, however, shows that in practice lg(n) is not the fastest choice; see the performance comparison at the end for the value actually used. Also, extracting each r-bit digit with bit operations beats using %, as the small example at the end of this section shows.

#include <bitset>
#include <cmath>
#include <iostream>
using namespace std;

// countSort is the routine shown in the counting-sort section above.
bool countSort(radixSortNum* array, radixSortNum* arrayAssit, int* arrayCount, int length, int k);

bool preCountSort(radixSortNum* array, int length, int k)
{
    radixSortNum* arrayAssit = new radixSortNum[length];
    int* arrayCount = new int[k];
    countSort(array, arrayAssit, arrayCount, length, k);
    delete[] arrayAssit;
    delete[] arrayCount;
    return true;
}

bool radixSort(radixSortNum* array, int length, int numLen, int r)
{
    bitset<32> bitMode(0x0000);
    int i;
    if (r == 0)
        r = (int)(log((double)length) / log((double)2));   // default to the theoretical r = lg(n)
    cout << "r: " << r << endl;
    for (i = 0; i < r; i++)
        bitMode[i] = 1;                                    // set the low r bits: the digit mask
    unsigned int mode = bitMode.to_ulong();
    numLen = 32;                                           // sort all 32 bits of the key
    int leftLen = (int)ceil((double)numLen / (double)r), j = 0;
    leftLen += 1;                                          // one spare pass; the j * r > 31 check exits anyway
    while (leftLen > 0)
    {
        if (j * r > 31)
            break;
        for (i = 0; i < length; i++)
        {
            unsigned int tmp = array[i].src >> (j * r);    // shift the current digit down...
            array[i].forCountSort = tmp & mode;            // ...and mask out its low r bits
        }
        preCountSort(array, length, (int)pow((double)2, r));   // stable sub-sort on this digit; k = 2^r
        j++;
        leftLen--;
    }
    return true;
}

  

 

According to the experiment, the best r differs from the theoretical value: efficiency peaks when r is set to 12. Why the deviation? It is probably a cache effect: with r = 12, the counting-sort pass inside the radix sort (a count array of 2^12 entries) can run entirely in cache, so that setting is the fastest.
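On the earlier remark that bit operations beat %: a small illustrative comparison (hypothetical helper names, not from the original post), each extracting the j-th digit of x:

// Power-of-two radix: the j-th r-bit digit via shift + mask (one shift, one AND).
unsigned int digitShift(unsigned int x, int j, int r)
{
    unsigned int mask = (1u << r) - 1;
    return (x >> (j * r)) & mask;
}

// Arbitrary radix (e.g. base 10): the same digit via division and modulo,
// which cost far more cycles than shifts on most CPUs.
unsigned int digitDivMod(unsigned int x, int j, unsigned int base)
{
    for (int t = 0; t < j; t++)
        x /= base;
    return x % base;
}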

3. Bucket sort

The idea: scatter the elements into buckets by value; in the best case, each element lands in a bucket of its own, and reading the buckets in order yields the sorted result.
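The original post gives no code for this one, so here is a minimal sketch under the usual assumptions (unsigned int keys, roughly uniformly distributed; std::sort stands in for the per-bucket sort, where the insertion sort from section 1 would also do):

#include <algorithm>
#include <vector>

bool bucketSort(unsigned int* array, int length, int bucketCount)
{
    if (length <= 0 || bucketCount <= 0)
        return false;
    unsigned int maxVal = *std::max_element(array, array + length);
    std::vector<std::vector<unsigned int>> buckets(bucketCount);
    for (int i = 0; i < length; i++)
    {
        // Map the value range [0, maxVal] evenly onto bucket indices [0, bucketCount).
        int b = (int)((unsigned long long)array[i] * bucketCount / ((unsigned long long)maxVal + 1));
        buckets[b].push_back(array[i]);
    }
    int k = 0;
    for (std::vector<unsigned int>& bucket : buckets)
    {
        std::sort(bucket.begin(), bucket.end());
        for (unsigned int v : bucket)
            array[k++] = v;
    }
    return true;
}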

 

 

Finally, a comparison of the efficiency of some of these sorting algorithms (the original post attached a benchmark chart here).

Although radix sort is quite efficient, it has limitations: it is harder to implement than quick sort and requires extra space. So in practice, quick sort remains the usual choice.
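A minimal sketch of how such a comparison can be run (a hypothetical harness, not the original benchmark), timing quickSortLast from above on random data:

#include <chrono>
#include <cstdlib>
#include <iostream>

bool quickSortLast(unsigned int* array, int begin, int end);   // defined above

int main()
{
    const int n = 1000000;
    unsigned int* data = new unsigned int[n];
    for (int i = 0; i < n; i++)
        data[i] = (unsigned int)rand();

    auto start = std::chrono::steady_clock::now();
    quickSortLast(data, 0, n - 1);              // swap in any sort above to compare
    auto stop = std::chrono::steady_clock::now();

    std::cout << "sorted " << n << " ints in "
              << std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count()
              << " ms" << std::endl;
    delete[] data;
    return 0;
}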

 

This post is intended as review notes and for sharing. Please indicate the source when reposting. Thank you ~
