Quick Sort (QuickSort)

Source: Internet
Author: User
Tags: square root

Why is it called Quick Sort?

The title is ambiguous. Every sort has a name: some are named after their inventor (Shell sort), some after a step of the algorithm (insertion sort), and so on. Quick sort is named after its defining property (no joke). So let me rephrase the title:
Quick sort, why are you so fast? To work this out, you first need to understand the model behind comparison-based sorting: the decision tree.

An input of size n has n! possible orderings. The job of a sorting algorithm is to find out which of these n! possibilities the input actually is.

A comparison-based sorting algorithm has only one tool at hand, the comparison. By repeatedly comparing pairs of elements, it rules out impossible permutations (equivalently, it keeps the possible ones).

The result of a comparison can only be > or < (ignoring =), so in the worst case each comparison leaves at least half of the remaining possibilities standing. Therefore the lower bound for comparison-based sorting is Ω(lg n!) = Ω(n lg n).

You might ask why insertion sort costs only O(n) when the data is already in order.

The O(n) cost is real, but it is just the shortest path in the decision tree. The lower bound is about how many comparisons must be made in the worst case.


Quick sort is fast because the partitioning step yields a lot of information and quickly rules out a large share of the possible permutations.

For example, take the best case, where the pivot is the median (it splits the interval into two equal halves): we get n/2 elements smaller than the pivot and n/2 elements larger than the pivot.

So the number of remaining permutations is (n/2)! * (n/2)! = n!/C(n, n/2), which is at most n!/2^(n/2), and this step costs only O(n) comparisons.

Spread over those comparisons, each O(1) comparison leaves at most 1/√2 of the possibilities standing, so the number of candidates shrinks geometrically.


Of course, this is relative. Take insertion sort: why is it so slow? Because each insertion yields very little information, so the number of permutations it rules out can be tiny.

For example, inserting the (n/2)-th element costs O(n/2) comparisons. Before the insertion there are n!/(n/2 - 1)! possible permutations left, and after it there are n!/(n/2)!. In other words, each O(1) comparison rules out only about 2/n of the possibilities, which is nowhere near the efficiency of quick sort!
Therefore, quick sort is efficient because partitioning around the pivot quickly determines the relative order of many elements at once.
Advantages of the Hoare partitioning algorithm

The Hoare partitioning algorithm has two advantages. First, its constant factor is very low: elements that are already in the correct interval are never swapped needlessly. Second, elements equal to the pivot are split evenly between the two sides.

    #include <cstdlib>   // rand, RAND_MAX

    /***************************************************************
      function:   Hoare partitioning algorithm with a randomly selected pivot
      parameters: partitions the data in [low, high);
                  range is the number of elements in the interval
    ***************************************************************/
    int* HoarePartition(int* low, int* high, int range)
    {
        // Randomly select the pivot. Dividing by RAND_MAX + 1.0 keeps the
        // offset in [0, range), so the pivot is always read inside the interval.
        int pivot = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        low--;
        while (true)
        {
            while (*(++low) < pivot);    // scan the left interval
            while (*(--high) > pivot);   // scan the right interval
            // both scans have stopped at an element that does not belong on their side
            if (low < high)
            {
                // the intervals have not crossed, so two elements sit in the
                // wrong interval and can be swapped
                int tmp = *low;
                *low = *high;
                *high = tmp;
            }
            // Once the intervals cross, Hoare partitioning only guarantees that
            // the elements before the returned position are <= the elements from
            // it onward. It does not fix the position of the pivot, so we only get
            // two ordered intervals, with no pivot separated out in between.
            else
                return low;
        }
    }

Note in particular that the Hoare partitioning algorithm does not place the pivot between the two parts. It only produces two intervals whose relative order is known. The interval bounds passed to the recursive quick sort calls are therefore chosen differently: the returned position belongs to the right interval.
Expected number of comparisons

The number of comparisons here means the total number of comparisons made by all the interval scans over the whole run of the sorting algorithm, that is, the expected value of cnt below:

    #include <cstdlib>    // rand, RAND_MAX
    #include <iostream>
    using namespace std;

    int cnt = 0;   // number of comparisons

    int* HoarePartition(int* low, int* high, int range)
    {
        int pivot = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        low--;
        while (true)
        {
            while (*(++low) < pivot) cnt++;    // a comparison took place
            cnt++;                             // the failed test is a comparison too
            while (*(--high) > pivot) cnt++;   // a comparison took place
            cnt++;                             // the failed test is a comparison too
            if (low < high)
            {
                int tmp = *low;
                *low = *high;
                *high = tmp;
            }
            else
                return low;
        }
    }

    /***************************************
      function:   quick sort
      parameters: sorts the data in [low, high)
      average time complexity: O(n lg n)
    ***************************************/
    void QuickSort(int* low, int* high)
    {
        int range = high - low;
        // partition only when the interval has more than 1 element
        if (range > 1)
        {
            int* pivot = HoarePartition(low, high, range);   // returns the split address
            QuickSort(low, pivot);    // quick sort the left interval
            QuickSort(pivot, high);   // quick sort the right interval
        }
    }

    /***********************************
      function: test the number of comparisons
    ***********************************/
    void TestQuickSort(int* a, int n)
    {
        cnt = 0;   // reset the counter
        QuickSort(a, a + n);
        cout << endl << cnt << endl;
    }

The expected constant and the algorithm here are not the same as in Introduction to Algorithms, because this is an expectation analysis of Hoare partitioning, but the method of analysis is the same as in that book. First, you have to assume that the element at the returned position is the pivot that was used for partitioning; otherwise the analysis becomes too difficult. Then apply the derivation from Introduction to Algorithms: any pair of elements (i, j) whose ranks are gap apart is compared k times during the sort with probability (1/(gap + 1))^(k-1) * 2/(gap + 1), and there are (n - gap) pairs that are gap apart, which gives the following series:
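One way to write such a series, as a sketch under the usual simplification from Introduction to Algorithms that each pair is compared at most once, with probability 2/(gap + 1) (so the k-fold term is ignored). Here g stands for gap and H_n for the n-th harmonic number; both symbols are assumptions of this sketch:

```latex
E[\mathrm{cnt}] \;\approx\; \sum_{g=1}^{n-1} (n-g)\,\frac{2}{g+1}
\;=\; 2(n+1)H_n - 4n
\;\approx\; 2n\ln n
```

The closed form follows by substituting m = g + 1, which turns each term into 2(n+1)/m - 2, summed over m = 2, ..., n.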
The actual value of cnt should be larger than the figure calculated above, because in Hoare partitioning the selected pivot value is also compared with itself (the series above does not account for this), so the number of pivot selections has to be added. This is also a probability problem (if the pivot is the minimum, the interval may stay unchanged), but a quick estimate is O(n): the recursion forms a binary tree, each internal node selects a pivot once, and the number of internal nodes equals the number of leaves minus 1. So the expected count is 2n ln(n) + n. Below is my test data (the input elements are all distinct):

Advantages of median-of-three

Median-of-three picks three numbers at random from the interval (the same element may be picked more than once) and uses their median as the pivot. Intuitively, the closer an element's rank is to the middle, the more likely it is to become the pivot, which makes the partition considerably more even.

    #include <cstdlib>   // rand, RAND_MAX

    /***************************************************************
      function:   Hoare partitioning algorithm with median-of-three
      parameters: partitions the data in [low, high);
                  range is the number of elements in the interval
    ***************************************************************/
    int* HoarePartition(int* low, int* high, int range)
    {
        // median-of-three: draw three random samples from the interval
        // (dividing by RAND_MAX + 1.0 keeps each offset in [0, range))
        int a = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        int b = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        int c = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        // the pivot is the median of a, b and c
        int pivot = a > b ? (b > c ? b : (a > c ? c : a))
                          : (a > c ? a : (b > c ? c : b));
        low--;
        while (true)
        {
            while (*(++low) < pivot);    // scan the left interval
            while (*(--high) > pivot);   // scan the right interval
            // both scans have stopped at an element that does not belong on their side
            if (low < high)
            {
                // the intervals have not crossed, so two elements sit in the
                // wrong interval and can be swapped
                int tmp = *low;
                *low = *high;
                *high = tmp;
            }
            // Once the intervals cross, Hoare partitioning only guarantees that
            // the elements before the returned position are <= the elements from
            // it onward. It does not fix the position of the pivot, so we only get
            // two ordered intervals, with no pivot separated out in between.
            else
                return low;
        }
    }

Here is the exact probability distribution. The probability that the k-th smallest element is chosen as the pivot is (6(k-1)(n-k) + 3n - 2)/n^3, so in the limit n → ∞ the probability density function is y = 6x(1-x):
If a "good" partition is defined as in Introduction to Algorithms, with the pivot's rank in [n/3, 2n/3], then integrating the density shows that its probability is 13/27 (nearly one half). Compared with 1/3 for a uniformly random pivot, median-of-three is a big improvement. I have no idea how to analyze the expected number of comparisons under this non-uniform distribution, but the intuition is clear enough. Below is my test data (as above, the input elements are all distinct):
The improvement is real, but nothing is perfect: the "pick three" step of median-of-three costs a lot of extra time. This drawback can be partly offset, as shown later, by finishing with insertion sort.
Tail recursion optimization

In the recursion tree of quick sort, a node's information does not need to be kept once it has been partitioned, so the parent node can stay in place and take over the role of one of its children:

    /***************************************
      function:   quick sort with tail recursion optimization
      parameters: sorts the data in [low, high)
      expected running time: O(n lg n)
    ***************************************/
    void QuickSortRoutine(int* low, int* high)
    {
        int range;   // number of elements in the interval
        // Loop while the interval has more than 1 element. The recursion on the
        // right part is replaced by this loop, so the parent node acts as its
        // own right child in the next iteration.
        while ((range = high - low) > 1)
        {
            int* pivot = HoarePartition(low, high, range);   // returns the split address
            QuickSortRoutine(low, pivot);   // quick sort the left interval
            low = pivot;                    // continue with the right interval [pivot, high)
        }
    }

You can choose to let the loop replace the recursion on the "larger" child, so that the depth of the recursion tree stays within lg n. But quick sort lives off probability anyway, and this adds comparison cost, so I will not write it out. There is a little more room for optimization: maintain an array that stores the intervals still waiting to be partitioned, so that no recursion is needed at all. However, this adds extra space consumption.

Making the "leaves" thicker

As seen above, median-of-three is expensive, and the recursion overhead is large as well. So we can relax the condition under which quick sort "bottoms out", and finish by applying insertion sort to the whole, nearly sorted array. That gives a complete quick sort:

    #include <cstdlib>   // rand, srand, RAND_MAX
    #include <ctime>     // time

    #define FACTOR 37    // leaf width

    /***************************************************************
      function:   Hoare partitioning algorithm with median-of-three
      parameters: partitions the data in [low, high);
                  range is the number of elements in the interval
    ***************************************************************/
    int* HoarePartition(int* low, int* high, int range)
    {
        // median-of-three: draw three random samples from the interval
        int a = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        int b = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        int c = *((int)((double)rand() / (RAND_MAX + 1.0) * range) + low);
        int pivot = a > b ? (b > c ? b : (a > c ? c : a))
                          : (a > c ? a : (b > c ? c : b));
        low--;
        while (true)
        {
            while (*(++low) < pivot);    // scan the left interval
            while (*(--high) > pivot);   // scan the right interval
            // both scans have stopped at an element that does not belong on their side
            if (low < high)
            {
                // the intervals have not crossed: swap the two misplaced elements
                int tmp = *low;
                *low = *high;
                *high = tmp;
            }
            // Once the intervals cross, we only know that the elements before the
            // returned position are <= the elements from it onward; the pivot is
            // not separated out in between.
            else
                return low;
        }
    }

    /***************************************
      function:   quick sort with tail recursion optimization
      parameters: sorts the data in [low, high) down to leaves of width FACTOR
      expected running time: O(n lg n)
    ***************************************/
    void QuickSortRoutine(int* low, int* high)
    {
        int range;   // number of elements in the interval
        // Loop while the interval is wider than a leaf; the loop lets the
        // parent node act as its own right child in the next iteration.
        while ((range = high - low) > FACTOR)
        {
            int* pivot = HoarePartition(low, high, range);   // returns the split address
            QuickSortRoutine(low, pivot);   // quick sort the left interval
            low = pivot;                    // continue with the right interval
        }
    }

    /********************************************************************
      function:   slightly optimized insertion sort
      parameters: sorts the data in [low, high); the minimum must be at *low
    ********************************************************************/
    void ImprovedInsertionSort(int* low, int* high)
    {
        // The minimum is in first place, so the first two elements are already
        // in order: start inserting from the third element.
        ++low;
        while (++low < high)
        {
            int tmp = *low;        // save the element to be inserted
            int* destPos = low;    // search for the insertion position
            // handle the first test separately
            if (*(--destPos) > tmp)
            {
                do
                {
                    *(destPos + 1) = *destPos;
                } while (*(--destPos) > tmp);   // test whether the previous slot is the target
                *(destPos + 1) = tmp;           // the last failed test left destPos one too small
            }
        }
    }

    /**********************************
      function: complete quick sort
    **********************************/
    void QuickSort(int* low, int* high)
    {
        if (high - low < 2) return;   // nothing to sort
        // set the random number seed
        srand((unsigned)time(nullptr));
        // quick sort down to leaves of width FACTOR
        QuickSortRoutine(low, high);
        // find the minimum and put it first, as the sentinel for insertion sort;
        // it lies in the leftmost leaf, which has at most FACTOR elements
        int* minPos = low;              // position of the minimum
        int* lastPos = low + FACTOR;
        if (lastPos > high) lastPos = high;   // do not read past a short array
        for (int* i = low + 1; i < lastPos; i++)
            if (*i < *minPos)
                minPos = i;
        int tmp = *low;
        *low = *minPos;
        *minPos = tmp;
        // finally, insertion sort
        ImprovedInsertionSort(low, high);
    }

So how should the leaf width be chosen? Generally 8 to 20 works well, but median-of-three has a larger constant than plain random pivot selection, so here a two-digit number is appropriate. The 37 above is simply my subjective choice. As Sheldon says in The Big Bang Theory: 73 is the best number, because 73 is the 21st prime, its mirror image 37 is the 12th prime, 21 is the product of 7 and 3, and 73 in binary is 1001001, which reads the same forwards and backwards. It just feels like a classy thing to write.
Please leave a comment below if anything in this post is incorrect or if you know a better method. Thank you.
