Data structures and algorithms, sorting part two: quick sort

Source: Internet
Author: User
Tags: benchmark

In the previous article we covered merge sort, which should have given you a feel for divide-and-conquer and recursion. In this article we introduce what is often considered the most error-prone sorting algorithm, yet also one of the most widely used: quick sort. The key points to master are the choice of the pivot (the base value) and the details of the recursive process. Let's start with the following picture to get an overview of how quick sort works.

[Figure: one pass of quick sort, with the left and right pointers moving toward each other around the pivot]

As the picture shows, each pass selects the first element as the pivot and sets two pointers, one at the front and one at the end. Each pointer's element is compared with the pivot: elements greater than the pivot end up on the right, elements less than it end up on the left. After the pivot is placed, the same procedure is applied recursively to both halves until the array is ordered. Enough talk; here is the code, and then we will walk through it in detail.

public static void quickSort(int[] arr, int start, int end) {
    int i = start, j = end;
    // the pivot (base value)
    int key;
    // condition that decides whether we keep recursing
    if (i < j) {
        key = arr[start];
        // move i and j toward each other, exchanging values, until they meet
        while (i < j) {
            // scan from the right for a value smaller than the pivot
            while (arr[j] >= key && i < j) {
                j--;
            }
            // the value on the right is smaller than the pivot: move it into position i
            arr[i] = arr[j];
            // scan from the left for a value greater than the pivot
            while (arr[i] <= key && i < j) {
                i++;
            }
            // the value on the left is greater than the pivot: move it into position j
            arr[j] = arr[i];
        }
        // put the pivot back into its final position
        arr[i] = key;
        // recursively sort the two halves
        quickSort(arr, start, i - 1);
        quickSort(arr, i + 1, end);
    }
}
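For readers who want to run the method directly, here is the same algorithm wrapped in a self-contained class with a small driver; the class name and sample array are mine, not part of the original article.

```java
// Self-contained version of the quickSort method above, with a small driver.
// The class name and sample data are illustrative.
public class QuickSortDemo {
    public static void quickSort(int[] arr, int start, int end) {
        int i = start, j = end;
        int key;                          // the pivot (base value)
        if (i < j) {                      // only recurse while the range has 2+ elements
            key = arr[start];
            while (i < j) {
                // scan from the right for a value smaller than the pivot
                while (arr[j] >= key && i < j) { j--; }
                arr[i] = arr[j];          // move it into the free slot on the left
                // scan from the left for a value larger than the pivot
                while (arr[i] <= key && i < j) { i++; }
                arr[j] = arr[i];          // move it into the free slot on the right
            }
            arr[i] = key;                 // drop the pivot into its final position
            quickSort(arr, start, i - 1); // sort the left half
            quickSort(arr, i + 1, end);   // sort the right half
        }
    }

    public static void main(String[] args) {
        int[] arr = {6, 1, 2, 7, 9, 3, 4, 5, 10, 8};
        quickSort(arr, 0, arr.length - 1);
        System.out.println(java.util.Arrays.toString(arr));
        // prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
}
```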

After reading the code, some readers may still be puzzled: what does this step mean, and why is it done here? Let me go through the questions one by one.

One: the meaning of the outer if judgment

As the code shows, this check controls whether the sort continues: the recursion cannot go on forever, and it terminates when start and end meet. Another way to look at it is that each pass keeps narrowing the range that still needs sorting.

Two: why the pivot is assigned inside the if condition

Suppose we moved the assignment outside the if. When the pivot you pick happens to be the largest element, after one pass it lands in the last position; the next recursive call then covers an empty range, and reading arr[start] before checking that range would throw an ArrayIndexOutOfBoundsException (array index out of bounds). It is worth demonstrating this to yourself. If you prefer, you can replace the if with the following guard at the top of the method, which avoids the out-of-bounds access in the same way:

if (i >= j) {
    return;
}

Three: the meaning of the outer while condition

Is a single exchange of the values at positions i and j enough? Clearly not. The core idea of quick sort is that, once a pivot is chosen, everything to its left must be no greater than it and everything to its right no smaller, so every element in the range has to be examined. That is why two while loops are nested inside the outer one: first scan from the right until a value smaller than the pivot is found, then scan from the left, and repeat until i equals j.

Four: why the scan must start from the right, and why exchanging values without a temporary variable does not overwrite data

This is a key point, so remember it: you must start scanning from the right, because the pivot sits on the left. The very first move, arr[i] = arr[j], overwrites the pivot's slot, but the pivot itself is safe in the variable key. If we started from the left instead, the first move would overwrite a value on the right end that has not been saved anywhere, and that data would be lost. The same reasoning explains why no temporary swap variable is needed: each assignment writes into a slot whose old value has already been copied elsewhere (or is the pivot, held in key). Every time the right scan finds a value smaller than the pivot and moves it left, the slot it came from becomes the next free slot; the left scan then fills that slot with a value larger than the pivot, and so on. When the two pointers meet, the last free slot receives the pivot, and nothing has been lost. If this is still unclear, try tracing it out on paper.
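To see the "free slot" mechanism concretely, here is one partition pass extracted into its own method; the method name and sample array are illustrative. Note that no element is ever lost even though no temporary swap variable is used.

```java
// One partition pass of the quickSort above, extracted for inspection.
// Returns the final index of the pivot; names and data are illustrative.
public class PartitionDemo {
    static int partition(int[] arr, int start, int end) {
        int i = start, j = end;
        int key = arr[start];             // pivot saved here, so its slot is free
        while (i < j) {
            while (arr[j] >= key && i < j) { j--; }
            arr[i] = arr[j];              // fill the free slot on the left
            while (arr[i] <= key && i < j) { i++; }
            arr[j] = arr[i];              // fill the free slot on the right
        }
        arr[i] = key;                     // pivot returns to the last free slot
        return i;
    }

    public static void main(String[] args) {
        int[] arr = {5, 3, 8, 1, 9, 2};
        int p = partition(arr, 0, arr.length - 1);
        // after one pass: everything left of the pivot is <= 5,
        // everything right of it is >= 5
        System.out.println(p + " " + java.util.Arrays.toString(arr));
        // prints 3 [2, 3, 1, 5, 9, 8]
    }
}
```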

Five: aren't all those i &lt; j checks superfluous?

Not at all. The first i < j (the outer if) decides whether recursion is still needed; the i < j of the outer while decides when the current pass is finished. You might then ask whether the i < j inside the inner while loops can be dropped. The answer is clearly no: without it, i and j would keep scanning past each other, and even off the end of the array. And if the innermost checks were omitted, i and j could stop one position apart, the pivot would be written back to the wrong slot, and the whole algorithm would fall apart.
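As a concrete check of this claim, here is a deliberately broken variant whose inner loops omit the i < j condition; the class name and test data are illustrative. When the first element happens to be the maximum, the left scan runs straight off the end of the array.

```java
// Hypothetical broken variant of one partition pass: the inner while loops
// omit the "i < j" check. On an array whose first element is the maximum,
// the left scan runs off the end and throws ArrayIndexOutOfBoundsException.
public class BrokenQuickSortDemo {
    static void brokenPass(int[] arr, int start, int end) {
        int i = start, j = end;
        int key = arr[start];
        while (i < j) {
            while (arr[j] >= key) { j--; }   // missing "&& i < j"
            arr[i] = arr[j];
            while (arr[i] <= key) { i++; }   // missing "&& i < j": can run past the end
            arr[j] = arr[i];
        }
        arr[i] = key;
    }

    public static void main(String[] args) {
        int[] arr = {9, 1, 2, 3};            // pivot 9 is the maximum
        boolean crashed = false;
        try {
            brokenPass(arr, 0, arr.length - 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            crashed = true;
        }
        System.out.println(crashed ? "out of bounds" : "ok");
        // prints "out of bounds"
    }
}
```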

Once all of that is understood, consider whether the algorithm itself has a weakness. We always pick the first element as the pivot; what if that element happens to be the maximum, or the minimum? Then the partition is as unbalanced as it can be, and the design loses its point. How can we avoid this? Some suggest picking a random element, but remember that the whole appeal of the algorithm lies in its time and space complexity: generating a random number for every partition adds overhead of its own. A widely accepted compromise is the median-of-three method:

The name may sound impressive, but the idea is simple: take the first element, the last element, and the middle element of the range, and choose the median of those three as the pivot. That is all the median-of-three method is.
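The article leaves the code as an exercise; for readers who want to see it anyway, here is a minimal sketch of median-of-three pivot selection. The method name and swap strategy are my own assumptions, not part of the article's code: it moves the median of the three samples into arr[start], so the quickSort above can keep using the first element as its pivot.

```java
// A minimal sketch of median-of-three pivot selection (names are illustrative).
// It moves the median of arr[start], arr[mid], arr[end] into arr[start],
// so a first-element-pivot quick sort can use it unchanged.
public class MedianOfThree {
    static void medianToFront(int[] arr, int start, int end) {
        int mid = start + (end - start) / 2;
        // order the three samples so that arr[mid] holds their median
        if (arr[start] > arr[mid]) swap(arr, start, mid);
        if (arr[start] > arr[end]) swap(arr, start, end);
        if (arr[mid] > arr[end])   swap(arr, mid, end);
        // move the median into the pivot slot
        swap(arr, start, mid);
    }

    static void swap(int[] arr, int a, int b) {
        int t = arr[a];
        arr[a] = arr[b];
        arr[b] = t;
    }

    public static void main(String[] args) {
        int[] arr = {9, 5, 1, 7, 3};      // first 9, middle 1, last 3: median is 3
        medianToFront(arr, 0, arr.length - 1);
        System.out.println(arr[0]);       // prints 3
    }
}
```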

Calculation of time complexity:

      1: As mentioned above, the best case is when the pivot chosen each time is exactly the median value

In the best case, every partition splits the range exactly in half. Sorting n keys then gives a recursion tree of depth [log2 n] + 1 ([x] denotes the largest integer not greater than x), that is, only about log2 n levels of recursion. Let the running time be T(n). The first partition scans the whole array once, doing n comparisons; the resulting pivot splits the array into two halves, each costing T(n/2) (remember, this is the best case, so the split is even). Continuing the division yields the following chain of inequalities:

      T(n) ≤ 2T(n/2) + n
      T(n) ≤ 2(2T(n/4) + n/2) + n = 4T(n/4) + 2n
      T(n) ≤ 4(2T(n/8) + n/4) + 2n = 8T(n/8) + 3n
      ...
      T(n) ≤ nT(1) + n·log2(n) = O(n log n)

This shows that in the best case, the time complexity of quick sort is O(n log n).

2: The worst case, mentioned above: the pivot is the maximum or minimum value

Now look at the worst case. When the sequence to be sorted is already in ascending or descending order, each partition produces only one non-empty sub-sequence, one element shorter than the last (the other sub-sequence is empty). Drawing the recursion tree gives a completely skewed tree. We then need n − 1 recursive calls, and the i-th partition must do n − i comparisons to place the pivot, so the total number of comparisons is:

     (n − 1) + (n − 2) + ... + 1 = n(n − 1) / 2

The resulting time complexity is O(n^2).
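The skewed recursion tree can be observed directly. The following small demonstration (class name and instrumentation are mine) runs the first-element-pivot quick sort from above on an already-sorted array while tracking recursion depth: it reaches n − 1 instead of roughly log2 n.

```java
// Demonstration that first-element pivoting degenerates on sorted input:
// the recursion depth grows to n - 1 instead of about log2 n.
// Class name and depth instrumentation are illustrative additions.
public class WorstCaseDemo {
    static int maxDepth = 0;

    static void quickSort(int[] arr, int start, int end, int depth) {
        int i = start, j = end;
        int key;
        if (i < j) {
            maxDepth = Math.max(maxDepth, depth);  // record depth of productive calls
            key = arr[start];
            while (i < j) {
                while (arr[j] >= key && i < j) { j--; }
                arr[i] = arr[j];
                while (arr[i] <= key && i < j) { i++; }
                arr[j] = arr[i];
            }
            arr[i] = key;
            quickSort(arr, start, i - 1, depth + 1);
            quickSort(arr, i + 1, end, depth + 1);
        }
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7, 8};   // already in order: worst case
        maxDepth = 0;
        quickSort(sorted, 0, sorted.length - 1, 1);
        System.out.println(maxDepth);               // prints 7, i.e. n - 1
    }
}
```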

     3: The general (average) case:

Finally, look at the average case. Suppose the pivot ends up in position k (1 ≤ k ≤ n) after the partition; then:

        T(n) = (n − 1) + (1/n) · Σ(k=1..n) [ T(k − 1) + T(n − k) ]
             = (n − 1) + (2/n) · Σ(k=0..n−1) T(k)

Solving this recurrence gives T(n) = O(n log n), so the average case matches the best case up to a constant factor.

    The wind sighs over the cold Yi River, but learning algorithms is not that hard; the road is full of obstacles, yet the determined will get there. When studying an algorithm, always ask why. Perhaps the problems you run into are not the ones I covered here, or perhaps you never hit these problems at all. Skill comes from diligence and is lost to idleness: when you have a spare moment, write the code again from scratch, or talk yourself through the algorithm's steps. Nothing can stop someone who takes it seriously.

