Pivot selection, code details, and the standard implementation of the partition function in quicksort

Source: Internet
Author: User

Written interviews often test quicksort by asking you to write it by hand. I learned the algorithm long ago, and the overall process is clear to me. But when I recently tried to write it by hand, I found that my grasp of the details was not precise enough: in many places I only had a rough picture in my head and did not understand the real intent behind the code. So today I explore it with the help of "Data Structures" (Min) and "Introduction to Algorithms".

First, here is a quicksort implementation in the style of the Min edition (the partition part is also the implementation from an exercise in "Introduction to Algorithms", though the details may differ):

public class QuickSort implements Sortable {

    public void sort(int[] array) {
        if (array == null || array.length == 0)
            return;
        quickSort(array, 0, array.length - 1);
    }

    private void quickSort(int[] array, int left, int right) {

        // A range with fewer than two elements is already sorted.
        if (right <= left)
            return;
        int pivot = partition(array, left, right);
        quickSort(array, left, pivot - 1);
        quickSort(array, pivot + 1, right);

    }

    private int partition(int[] array, int left, int right) {

        // The first element of the range is the pivot.
        int pivot = array[left];
        int i = left;
        int j = right;

        while (i < j) {

            // j moves first: skip elements that already belong on the right.
            while (i < j && array[j] >= pivot)
                j--;
            // Then i: skip elements that already belong on the left.
            while (i < j && array[i] <= pivot)
                i++;

            if (i < j) {
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }

        // i == j here; put the pivot into its final position.
        int tem = array[i];
        array[i] = array[left];
        array[left] = tem;
        return i;
    }
}

In the code above we focus on the partition method.

The idea of this method is to pick a pivot element (here simply the first element; choosing it well is a topic in itself, but this keeps the description simple), then create two pointers i and j at the two ends of the range (the exact range is given by the left and right parameters). i moves right and stops when it finds an element greater than the pivot; j moves left and stops when it finds an element smaller than the pivot; the two elements are then swapped. This repeats until i and j meet. Finally the pivot is swapped into position i, which is where it belongs.

The end result is that the [left, right] portion of the array is split into two parts around the final position of the pivot: everything to its left is less than or equal to the pivot, and everything to its right is greater than or equal to it. The pivot itself is now in its final sorted position and will not move again.
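The post-condition described above can be checked mechanically. The following is a minimal sketch, assuming the partition method from the listing above is copied into a class named PartitionCheck; that class name and the isPartitioned helper are hypothetical and only for illustration.

```java
import java.util.Arrays;

final class PartitionCheck {

    // Same j-first partition as in the listing above: the pivot is array[left].
    static int partition(int[] array, int left, int right) {
        int pivot = array[left];
        int i = left;
        int j = right;
        while (i < j) {
            while (i < j && array[j] >= pivot) j--;
            while (i < j && array[i] <= pivot) i++;
            if (i < j) {
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }
        int tem = array[i];
        array[i] = array[left];
        array[left] = tem;
        return i;
    }

    // Hypothetical helper: true if, within [left, right], everything before p
    // is <= array[p] and everything after p is >= array[p].
    static boolean isPartitioned(int[] array, int left, int right, int p) {
        for (int k = left; k < p; k++) {
            if (array[k] > array[p]) return false;
        }
        for (int k = p + 1; k <= right; k++) {
            if (array[k] < array[p]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] data = {4, 7, 1, 5, 3, 9, 2};
        int p = partition(data, 0, data.length - 1);
        System.out.println("pivot index " + p + ": " + Arrays.toString(data));
        System.out.println("partitioned: " + isPartitioned(data, 0, data.length - 1, p));
    }
}
```

For this input the pivot 4 ends up at index 3 and the array becomes [3, 2, 1, 4, 5, 9, 7]: the two sides are not sorted yet, only separated.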

Now for the important points of this algorithm:

1. Choosing the pivot element

There is a lot to say here. Why is quicksort fast? Because in the same pass that fixes the exact position of the pivot, it also splits the array into two parts. If we dropped the splitting and each pass only determined the position of one pivot (without rearranging the elements on either side), we would still have a sorting algorithm, but a much less efficient one: O(n^2) for every input. So how well the pivot is chosen determines how close our quicksort gets to its best case (including the constant factor hidden in the complexity). The best case is O(n*log2(n)).

A note on constant factors in complexity. In theory, complexity analysis ignores them: O(2n) is written as O(n). This follows from the definition of asymptotic complexity: once n is large enough, the essential difference between two functions is their growth rate, not their constant factors. For example, O(10000*n) is smaller than O(n^2), because for every n > 10000 the latter is larger.

So why is the base of the logarithm 2 in the best case (when the pivot is chosen ideally)? Simply because the best case for quicksort is when every pivot is the "midpoint" (the median) of the current range. Each round of partitioning costs O(n) in total, and each round splits every range into two equal halves, so there are log2(n) rounds (just as a full binary tree with n nodes has a height of about log2(n)). The overall complexity is therefore O(n*log2(n)).

What if the split is not exactly in half every time? In practice it rarely is, so consider a poor case in which every partition splits the range 1:9. The larger part shrinks by a factor of 9/10 per level, so it holds n*(9/10)^k elements after k levels and the recursion depth is log base 10/9 of n (draw the recursion tree as in the binary tree argument above to see this). The overall complexity is O(n*log_{10/9}(n)), which is still O(n*log(n)); only the hidden constant is larger. The same holds even for a 1:99 split, and splits in practice are usually far less lopsided and closer to halves. So what really causes the worst case? Choosing the pivot so badly that it is the smallest (or largest) element every time. For example, if you always pick the first element as the pivot and the array is already in non-descending order, the complexity is O(n^2).
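The best-case count can also be written as a recurrence. This is a sketch under two assumptions that are not in the original text: n is a power of two, and one partition pass over m elements costs c*m for some constant c (introduced here only for illustration).

```latex
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + c\,n, \qquad T(1) = c
\quad\Longrightarrow\quad
T(n) = c\,n\log_2 n + c\,n = O(n \log_2 n)
```

Each of the log2(n) levels of the recursion tree contributes c*n, and the n leaves contribute another c*n in total.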
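The worst case mentioned above (first element as pivot, input already sorted) can be observed by measuring recursion depth. The sketch below reuses the first-element partition from the listing; the class DepthDemo and its helper methods are hypothetical names chosen for this illustration.

```java
import java.util.Random;

final class DepthDemo {
    private static int maxDepth;

    // First-element pivot, j-first scan, as in the listing above.
    static int partition(int[] array, int left, int right) {
        int pivot = array[left];
        int i = left;
        int j = right;
        while (i < j) {
            while (i < j && array[j] >= pivot) j--;
            while (i < j && array[i] <= pivot) i++;
            if (i < j) {
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }
        int tem = array[i];
        array[i] = array[left];
        array[left] = tem;
        return i;
    }

    private static void quickSort(int[] array, int left, int right, int depth) {
        if (right <= left) return;
        maxDepth = Math.max(maxDepth, depth); // deepest level that partitions
        int p = partition(array, left, right);
        quickSort(array, left, p - 1, depth + 1);
        quickSort(array, p + 1, right, depth + 1);
    }

    // Sorts the array in place and returns the deepest recursion level reached.
    static int depthOf(int[] array) {
        maxDepth = 0;
        quickSort(array, 0, array.length - 1, 1);
        return maxDepth;
    }

    static int[] ascending(int n) {
        int[] a = new int[n];
        for (int k = 0; k < n; k++) a[k] = k;
        return a;
    }

    // Fisher-Yates shuffle of 0..n-1 with a fixed seed for repeatability.
    static int[] shuffled(int n, long seed) {
        int[] a = ascending(n);
        Random rnd = new Random(seed);
        for (int k = n - 1; k > 0; k--) {
            int r = rnd.nextInt(k + 1);
            int tem = a[k];
            a[k] = a[r];
            a[r] = tem;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println("sorted input:   depth " + depthOf(ascending(1000)));
        System.out.println("shuffled input: depth " + depthOf(shuffled(1000, 42L)));
    }
}
```

On the ascending input every partition leaves the pivot at left, so the depth is n - 1 (999 here). The depth for the shuffled input depends on the seed but is far smaller.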

So if the pivot is chosen badly, quicksort completely loses its "quick" advantage. There are two common ways to choose it:

1. Pick a random element of the current range.

2. Take the first, middle, and last elements of the current range and pick the one whose value is the median of the three.

This effectively avoids picking the smallest (or largest) element every time. Of course, my code uses the rather naive approach of always taking the first element.
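Both strategies can be added to the first-element partition without changing it: choose an index, then swap that element to the front of the range before partitioning. This is a minimal sketch; the class PivotChoice and its method names are hypothetical and not part of the original code.

```java
import java.util.Random;

final class PivotChoice {
    private static final Random RANDOM = new Random();

    // Strategy 1: a uniformly random index in [left, right].
    static int randomIndex(int left, int right) {
        return left + RANDOM.nextInt(right - left + 1);
    }

    // Strategy 2: index of the median of the first, middle and last elements.
    static int medianOfThreeIndex(int[] array, int left, int right) {
        int mid = left + (right - left) / 2;
        int x = array[left];
        int y = array[mid];
        int z = array[right];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return left;
        return right;
    }

    // Moves the chosen pivot to array[left], so a partition that always
    // uses the first element as the pivot can be reused unchanged.
    static void movePivotToFront(int[] array, int left, int pivotIndex) {
        int tem = array[left];
        array[left] = array[pivotIndex];
        array[pivotIndex] = tem;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        int idx = medianOfThreeIndex(data, 0, data.length - 1);
        movePivotToFront(data, 0, idx);
        System.out.println("pivot for sorted input is now " + data[0]);
    }
}
```

In quickSort, the call would go immediately before partition(array, left, right). With median-of-three, an already sorted range gets its middle element as the pivot instead of its smallest.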

2. Boundary control

This part is the main point I want to make today; the discussion above was mostly there so that I do not mislead others into always choosing the first element too.

Boundary control means: when do the two pointers i and j stop, and where should the pivot be placed once they have stopped? Why does this matter? Because many of the subtle details of quicksort are hidden here.

For example, one question about the partition code above is why j-- comes before i++. This leads to the fundamental question: when i and j stop moving toward each other, what state are they in, and which index should the pivot go to (i? j? i+1? i-1?).

First, an analogy for how i and j traverse the array: conscription. Imagine the [left, right] part of the array as a crowd of people, and two squads, i and j, come to conscript them. Each squad has its own rule for deciding who belongs to it. If squad i comes across someone who actually belongs to squad j, it says "wait here" and swaps that person with an i-person that squad j has found. In this way both squads keep growing, with the i-people lined up behind i and the j-people lined up behind j. What happens in the end? i and j meet. What this process reveals is the essential meaning of partition: dividing the array into two groups.

Next, a detailed explanation of why j-- comes before i++. The reasoning is a step-by-step analysis in which each step builds on the previous one.

Suppose instead that i++ comes before j--, and see what happens.

Suppose the pivot is 4, i has reached 5, j has reached 3, and there are no untraversed elements between the 5 and the 3. i stops at 5 first, j stops at 3, and they are swapped. Then i advances: after i++ we have i == j, and because of the loop condition (i < j) the whole loop exits. Should the pivot now be placed at position i (which is also position j)? No, it should go to i-1. Why? Because i has now walked into j's territory. j is at this position because a swap has just taken place there, so the value at position j must be greater than the pivot (note the hidden assumption here!). If we swapped the pivot with position i while i == j, a value greater than the pivot would be moved to the start of the range (remember that I use the first element as the pivot). That would be wrong, so the pivot should be swapped to position i-1.

But here comes the key question: is that approach right? In fact, it is still wrong.

Swapping to i-1 relies on the premise that position j really is a "valid" member of the second part of the array (its value is greater than or equal to the pivot). Take the simplest example: the array {9,1,2,3}. i starts at 9 and j at 3. If i moves first, i runs straight to the position of the 3 and meets j there (array index 3). Which position should the pivot be swapped with now? Clearly i, not i-1. Why? Because the element at j is "invalid" here: j has never examined it. So the correct position might be i, or it might be i-1.

So the problem is serious: writing the code this way (with the i loop before the j loop) leads to exactly this ambiguity. (In fact, many quicksort implementations found online have this flaw; whether it shows up just depends on the test data.)
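The two cases above can be reproduced with a deliberately flawed variant that moves i first. This is only a sketch for illustration: the class IFirstDemo and the method scanIFirst are hypothetical names, and the method returns the meeting index without performing the final pivot swap, so the value at that index can be inspected.

```java
final class IFirstDemo {

    // Flawed on purpose: same scan as the partition method, but i moves before j.
    // Returns the index where i and j meet; the pivot stays at array[left].
    static int scanIFirst(int[] array, int left, int right) {
        int pivot = array[left];
        int i = left;
        int j = right;
        while (i < j) {
            while (i < j && array[i] <= pivot) i++;
            while (i < j && array[j] >= pivot) j--;
            if (i < j) {
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }
        return i;
    }

    static void report(int[] array) {
        int meet = scanIFirst(array, 0, array.length - 1);
        String relation = array[meet] > array[0] ? "greater than" : "not greater than";
        System.out.println("meet at index " + meet + ", value " + array[meet]
                + " is " + relation + " pivot " + array[0]);
    }

    public static void main(String[] args) {
        report(new int[]{4, 5, 3});    // value at the meeting point is too large: needs i-1
        report(new int[]{9, 1, 2, 3}); // value at the meeting point is small: needs i
    }
}
```

For {4, 5, 3} the pointers meet on a value greater than the pivot, so only i-1 is a safe swap target. For {9, 1, 2, 3} they meet on a value smaller than the pivot, so only i is correct. No single fixed index works for both inputs.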

So why put j first? Notice that i starts at left, not at left+1. If j moves first, then whether the pointers meet in the middle or j runs all the way down to left, the pivot can be swapped directly with position i. Meeting in the middle: j walks into i's territory and stops there, so i == j and the value at that position is less than or equal to the pivot. (If instead j stops at a smaller element and i then runs up to meet it, the value there is smaller than the pivot, so the same conclusion holds.) Since the pivot in this algorithm is the first element, it is fine to swap a value less than or equal to the pivot to the front, so the pivot can be swapped with position i. Meeting at the far end: j goes all the way down to i, the condition i < j fails, the loop exits, and the pivot can again be swapped with position i (here i == left, so the swap changes nothing). This gives a single index for the final pivot swap.

To sum up: starting i at left and decrementing j first means that position i is "valid" from the very beginning (you can think of position i as already traversed). It acts somewhat like a sentinel, ensuring that the final swap is correct even when j runs all the way down. This is actually an exercise in "Introduction to Algorithms".

Another way to implement partition is as follows:
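One way to gain confidence in this boundary handling is to compare the j-first version against java.util.Arrays.sort on many small random arrays with duplicates, where edge cases show up quickly. The following is a sketch of such a check; the class QuickSortCheck and the method agreesWithLibrary are hypothetical names, and the array sizes and value range are arbitrary choices.

```java
import java.util.Arrays;
import java.util.Random;

final class QuickSortCheck {

    static void sort(int[] array) {
        if (array == null || array.length == 0) return;
        quickSort(array, 0, array.length - 1);
    }

    private static void quickSort(int[] array, int left, int right) {
        if (right <= left) return;
        int p = partition(array, left, right);
        quickSort(array, left, p - 1);
        quickSort(array, p + 1, right);
    }

    // j-first partition with the first element as the pivot.
    private static int partition(int[] array, int left, int right) {
        int pivot = array[left];
        int i = left;
        int j = right;
        while (i < j) {
            while (i < j && array[j] >= pivot) j--;
            while (i < j && array[i] <= pivot) i++;
            if (i < j) {
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }
        int tem = array[i];
        array[i] = array[left];
        array[left] = tem;
        return i;
    }

    // Sorts random arrays of length 0..7 with values 0..4 and compares each
    // result with the library sort. Returns false on the first mismatch.
    static boolean agreesWithLibrary(int trials, long seed) {
        Random rnd = new Random(seed);
        for (int t = 0; t < trials; t++) {
            int[] actual = new int[rnd.nextInt(8)];
            for (int k = 0; k < actual.length; k++) actual[k] = rnd.nextInt(5);
            int[] expected = actual.clone();
            Arrays.sort(expected);
            sort(actual);
            if (!Arrays.equals(actual, expected)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("agrees with Arrays.sort: " + agreesWithLibrary(2000, 1L));
    }
}
```

Short arrays over a small value range are used on purpose: they make inputs such as all-equal, already sorted, and two-element ranges very likely to appear.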

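private int partition(int[] array, int left, int right) {

    // i is the last index of the "smaller than pivot" part (empty at first).
    int i = left - 1;
    int j = left;
    // The last element of the range is the pivot.
    int pivot = array[right];

    for (; j < right; j++) {
        if (array[j] < pivot) {
            // Grow the left part by one slot and move array[j] into it.
            i++;
            int tem = array[i];
            array[i] = array[j];
            array[j] = tem;
        }
    }

    // Put the pivot right after the left part.
    int tem = array[i + 1];
    array[i + 1] = array[right];
    array[right] = tem;

    return i + 1;
}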

This is the approach used in the main text of "Introduction to Algorithms", and its hallmark is clear code. The idea is to scan from left to right and classify each newly visited element: if it is smaller than the pivot it goes into the left part of the array, otherwise it stays in the right part. How is an element added to the left part? By growing the left part by one (i++). The newly claimed slot then holds an element that does not belong there (one that is greater than or equal to the pivot, unless the slot is the visited element itself), so we swap it with the element just visited. If the visited element belongs to the right part, nothing needs to be done.

Working through these distinctions is itself a process of understanding. Only when you really understand quicksort can you write it fluently instead of reciting it from memory.
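To see the single-scan partition at work, the sketch below runs it on a small array and then uses it inside a complete quicksort. The partition body is the one from the listing above; the class LomutoDemo (named after the common name for this scheme) and the sample data are assumptions made for this illustration.

```java
import java.util.Arrays;

final class LomutoDemo {

    // Single left-to-right scan; the last element of the range is the pivot.
    static int partition(int[] array, int left, int right) {
        int i = left - 1;
        int pivot = array[right];
        for (int j = left; j < right; j++) {
            if (array[j] < pivot) {
                i++;
                int tem = array[i];
                array[i] = array[j];
                array[j] = tem;
            }
        }
        int tem = array[i + 1];
        array[i + 1] = array[right];
        array[right] = tem;
        return i + 1;
    }

    static void quickSort(int[] array, int left, int right) {
        if (right <= left) return;
        int p = partition(array, left, right);
        quickSort(array, left, p - 1);
        quickSort(array, p + 1, right);
    }

    public static void main(String[] args) {
        int[] data = {3, 8, 2, 5, 1, 4};
        int p = partition(data, 0, data.length - 1);
        System.out.println("pivot index " + p + ": " + Arrays.toString(data));

        int[] more = {5, 1, 4, 1, 5, 9, 2, 6};
        quickSort(more, 0, more.length - 1);
        System.out.println(Arrays.toString(more));
    }
}
```

With pivot 4, the first call moves 3, 2 and 1 into the left part and returns index 3, leaving [3, 2, 1, 4, 8, 5]. Because the test is a strict array[j] < pivot, elements equal to the pivot stay in the right part.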
