Fast sequencing of PHP sorting algorithm (Quick sort) and its optimization

Source: Internet
Author: User
This article mainly introduced the PHP sorting algorithm's fast sorting (quick sort) and its optimization algorithm, combined with the example form to analyze the PHP fast sorting principle, the realization method, and analyzed the various optimization techniques and the operation attention matters, the need friend can refer to the next

In this paper, we describe the fast sorting algorithm of PHP sorting (Quick sort) and its optimization algorithm. Share to everyone for your reference, as follows:

Basic idea:

Quick Sort (Quicksort) is an improvement to the bubbling sort. His basic idea is: by a trip to the sorting of the records into separate two parts, some of which are smaller than the other part of the key words recorded, you can proceed to the two parts of the record to continue the rapid sorting, the entire sorting process can be recursive, in order to achieve the order of the whole sequence.

Basic algorithm steps:

Give me a chestnut:

If the records to be sorted now are:

6 2 7 3 8 9

The first step, create a variable $low point to the first record in the record, $high point to the last record, $pivot as the pivot is assigned the first element of the record to be sorted (not necessarily the first), here:

$low = 0; $high = 5; $pivot = 6;

The second step, we want to move all the numbers smaller than $pivot to the left of the $pivot, so we can start looking for a number smaller than 6, starting from the $high, from right to left, constantly diminishing the value of the variable $high, we find the first subscript 3 of the data is smaller than 6, so the data 3 to subscript 0 Position ($low point), and the data 6 of subscript 0 is moved to Subscript 3 to complete the first comparison:

3 2 7 6 8 9

At this time, the $high decreases to 3$low = 0; $high = 3; $pivot = 6;

The third step, we began the second comparison, this time to become a bigger than $pivot, and to find out from the back. Sliding scale variable $low, found that the data of subscript 2 is the first larger than the $pivot, so with the subscript 2 ($low point to the position) of the data 7 and the point of the subscript 3 ($high point to the position) of the data of 6 to exchange, the data state into the following table:

3 2 6 7 8 9

At this time, the $high decreases to 3$low = 2; $high = 3; $pivot = 6;

Complete the second and third steps we call complete a loop.

The fourth step (that is, opening the next loop), imitating the second step of the process execution.

The fifth step, imitate the third step of the process execution.

After the second loop is executed, the data status is as follows:

3 2 6 7 8 9

At this time, the $high decreases to 3$low = 2; $high = 2; $pivot = 6;

At this point, we found that $low and $high "Meet": they all pointed to subscript 2. So, the first time the comparison is over. The result is as follows, $pivot (=6) The number on the left is smaller than it, and the number of $pivot to the right is larger than it.

Then, the data {3,2} and {7,8,9} on both sides of the $pivot, and then group each of these processes until it can no longer be grouped.

Note: The first time quick sort does not directly get the final result, only the number smaller than K and K is divided into the two sides of K. To get the final result, you need to perform this step again on an array of subscript 22 edges, and then decompose the array until the array is no longer decomposed (only one data) to get the correct result.

Algorithm implementation:

Swap functions function Swap (array & $arr, $a, $b) {  $temp = $arr [$a];  $arr [$a] = $arr [$b];  $arr [$b] = $temp;} Main functions: Function QuickSort (array & $arr) {  $low = 0;  $high = count ($arr)-1;  QSort ($arr, $low, $high);}

In the main function, since the first time the quick sort is ordered for the entire array, it begins $low=0,$high=count($arr)-1 .

Then the QSort() function is a recursive call procedure, so it encapsulates it:

function QSort (array & $arr, $low, $high) {  //when $low >= $high indicates that no further grouping is possible, the correct result can be obtained  if ($low < $high) {    $pivot = Partition ($arr, $low, $high);//Will $arr[$low ... $high] in Split, calculate the pivot value    QSort ($arr, $low, $pivot-1); Recursive ordering of low sub-tables ($pivot left-hand record)    QSort ($arr, $pivot + 1, $high);//recursive Ordering of Gaozi (records on the right of $pivot  }}

From the QSort () function above, we see that the Partition () function is the core of the entire code, because the function is to select one of the keywords, such as the first keyword. And then try to put it in a certain position so that it has a lower value on the left than it is, and the value on the right is bigger than it, we will make the keyword pivot (pivot).

Directly on the code:

Select a keyword in the array so that it is at a certain position in the array, the left value is smaller than it, the value on the right is larger than it, the key is called pivot//make pivot record in place, and return to its location function Partition (array & $arr, $low, $ High) {  $pivot = $arr [$low];  Select the first element of the Subarray as  the pivot while ($low < $high) {//From the ends of the array alternately to the middle (when the $low and $high meet to end the loop) while    ($low < $high & & $arr [$high] >= $pivot) {      $high--;    }    Swap ($arr, $low, $high); Finally encountered a number smaller than $pivot, put it to the lower end of the array while    ($low < $high && $arr [$low] <= $pivot) {      $low + +;    }    Swap ($arr, $low, $high); Finally encountered a larger than $pivot number, put it to the array high-end  }  return $low;  Return to high also line, after all the last low and high are stuck in the pivot subscript}

The entire code is grouped together as follows:

function swap (array & $arr, $a, $b) {  $temp = $arr [$a];  $arr [$a] = $arr [$b];  $arr [$b] = $temp;} function Partition (array & $arr, $low, $high) {  $pivot = $arr [$low];  Select the first element of the sub-array as the pivot while  ($low < $high) {//From the end of the array alternately to the middle while    ($low < $high && $arr [$high] >= $ Pivot) {      $high--;    }    Swap ($arr, $low, $high); Finally encountered a number smaller than $pivot, put it to the lower end of the array while    ($low < $high && $arr [$low] <= $pivot) {      $low + +;    }    Swap ($arr, $low, $high); Finally encountered a larger than $pivot number, put it to the array high-end  }  return $low;  Return high also line, after all the last low and high are stuck in the pivot subscript}function QSort (array & $arr, $low, $high) {  if ($low < $high) {    $ Pivot = Partition ($arr, $low, $high); Put $arr[$low ... $high] in Split, calculate the pivot value    QSort ($arr, $low, $pivot-1);  Recursive ordering of low sub    -Tables QSort ($arr, $pivot + 1, $high);//recursive ordering of Taka tables  }}function QuickSort (Array & $arr) {  $low = 0;  $high = count ($arr)-1;  QSort ($arr, $low, $high);}

We call the algorithm:

$arr = Array (9,1,5,8,3,7,4,6,2); QuickSort ($arr); Var_dump ($arr);

Operation Result:

Array (9) {[0]=> int (1) [1]=> int (2) [2]=> int (3) [3]=> int (4) [4]=> int (5) [5]=> int (6) [6]=> Int (7) [7]=> Int (8) [8]=> int (9)}

Analysis of Complexity:

In the best case, that is, selecting the axis in the middle of the entire array, each time the array is continuously divided into two halves. Thus the optimal time complexity is O (NLOGN) (same as heap sort, merge sort).

In the worst case, the sequence to be sorted is either positive or reverse, then when the pivot is selected, only the edge data is selected, each partition gets a record less than the last partition, and the other is empty, and the final time complexity of the case is O (n^2).

Overall optimal and worst case, the average time complexity is O (NLOGN).

Fast sorting is an unstable sort method.

Because the fast sort is a more advanced sort, and is listed as one of the top ten algorithms of the 20th century .... So the cow break algorithm, we have no reason not to learn him!

Although the algorithm has been very good, but the above algorithm program still has a place to improve, the following specific discussion

Fast Sorting algorithm optimization

Optimization One: Optimize selection pivot:

In the process of previous complexity analysis, we see that the worst case scenario is when the pivot we select is the edge value of the entire sequence. For example, a sequence like this:

9 1 5 8 3 7 4 6 2

According to the custom we select the first element of the array as the pivot, then $pivot = 9, after a loop is divided into {1,5,8,3,7,4,6,2} and {} (empty sequence), that is, each time the division will only get one less record of the sub-sequence, and the other subsequence is empty. The final time complexity is O (n^2). The best case is when the pivot we select is the middle value of the entire sequence. But we can't go through the array every time to get the best value, right? Then there is a workaround:

1, random selection: Randomly selected $low to the value between $high, but this kind of practice some luck feel, in case did not hit success, then the above problem is still unresolved.

2, three take the method: take three keywords first to sort, remove the middle number as a pivot. These three numbers generally take the leftmost, the right and the middle three numbers, or you can randomly take three numbers. The possibility of the pivot being the middle number is greatly improved by this kind of extraction. Since the entire sequence is unordered, it is the same thing to randomly select three numbers and remove three numbers from the right side of the left. and the random number generator itself also brings time overhead, so random generation is not considered.

For this idea, we modify the Partition() function:

function Partition (array & $arr, $low, $high) {  $mid = floor ($low + ($high-$low)/2);  Computes the subscript if of the element in the middle of the array  ($arr [$low] > $arr [$high]) {    swap ($arr, $low, $high);  }  if ($arr [$mid] > $arr [$high]) {    swap ($arr, $mid, $high);  }  if ($arr [$low] < $arr [$mid]) {    swap ($arr, $low, $mid);  }  After three steps above, $arr [$low] has become the median value of three keywords in the left and right end of the entire sequence  $pivot = $arr [$low];  while ($low < $high) {  //alternately scans from both ends of the array (when $low and $high end the loop) while    ($low < $high && $arr [$high] &G t;= $pivot) {      $high--;    }    Swap ($arr, $low, $high); Finally encountered a number smaller than $pivot, put it to the lower end of the array while    ($low < $high && $arr [$low] <= $pivot) {      $low + +;    }    Swap ($arr, $low, $high); Finally encountered a larger than $pivot number, put it to the array high-end  }  return $low;  Return to high also line, after all the last low and high are stuck in the pivot subscript}

The three-digit method for the fractional group has a great potential to ditch the ideal $pivot, but for large arrays is not necessarily, so there is a way to take the nine-digit method ...

Optimization Two: Optimization of unnecessary exchanges:

Now, if there is a sequence to be sorted as follows:

5 1 9 3 7 4 8 6 2

According to the three-digit method we take 5 7 2 in the 5 as pivot.

When you follow a quick sort algorithm, you will find that 5 of the Subscript Transformation order is: 0, 8, 2, 5 and 4, but its ultimate goal is 4 position, in which the exchange is not needed.

According to this idea, we have improved our Partition() functions:

function Partition (array & $arr, $low, $high) {  $mid = floor ($low + ($high-$low)/2);  Computes the subscript if of the element in the middle of the array  ($arr [$low] > $arr [$high]) {    swap ($arr, $low, $high);  }  if ($arr [$mid] > $arr [$high]) {    swap ($arr, $mid, $high);  }  if ($arr [$low] < $arr [$mid]) {    swap ($arr, $low, $mid);  }  After three steps above, $arr [$low] has become the median value of three keywords in the left and right end of the entire sequence  $pivot = $arr [$low];  $temp = $pivot;  while ($low < $high) {  //alternately scans from both ends of the array (when $low and $high end the loop) while    ($low < $high && $arr [$high] &G t;= $pivot) {      $high--;    }    Swap ($arr, $low, $high); Finally encountered a number smaller than $pivot, put it to the low end of the array    $arr [$low] = $arr [$high];  operate with substitution instead of swapping while    ($low < $high && $arr [$low] <= $pivot) {      $low + +;    }    Swap ($arr, $low, $high); Finally encountered a number larger than $pivot, put it into the array high-end    $arr [$high] = $arr [$low];  }  $arr [$low] = $temp;  Replace the pivot value back $arr [$low];  return $low;  Return to high also line, after all the last low and high are stuck in the pivot subscript}

In the above improvements, we use the substitution instead of the intersection operation, because there are many less data exchange, so there is a performance improvement.

Optimization Three: Optimizing the sorting Scheme for fractional groups:

For a mathematical scientist, doctoral tutor, he can conquer the world's difficult problems, can cultivate the best mathematical doctor, when let him to teach pupils "1 + 1 = 2" Arithmetic course, it is not necessarily more than the perennial in the primary school of mathematics teacher taught good. In other words, being overqualified can sometimes become a bad thing.

In other words, a quick sort is a good sorting scheme for larger arrays, but if the arrays are very small, then the fast sort algorithm is better than the direct insert sort (the direct insert sort is the best performance in a simple sort). The reason for this is that a recursive operation is used for fast sequencing, which can be ignored when a large amount of data is sorted, which is negligible relative to its overall algorithm advantage, but if the array has only a few records to order, it becomes a big problem for the cannon to hit the mosquitoes.

So we need to modify our QSort() function:

Specify array length threshold # Max_length_insert_sort 7function QSort (Array & $arr, $low, $high) {  //when $low >= $high Indicates that no further grouping is possible and that the correct result has been obtained  if (($high-$low) > Max_length_insert_sort) {    $pivot = Partition ($arr, $low, $high); /Will $arr[$low ... $high] in Split, calculate the pivot value    QSort ($arr, $low, $pivot-1);//recursively sort the low sub-table ($pivot left-hand record)    QSort ($arr, $pivot + 1 , $high); Recursively sort Gaozi ($pivot to the right of the record)  }else{    //Direct Insert sort    insertsort ($arr);}  }

PS: The above direct insertion sorting algorithm You can refer to: "PHP sorting algorithm direct insertion sort (straight insertion sort)"

Here we add a judgment, when the $high-$low is not more than a constant (there is data that 7 more appropriate, also think 50 is more appropriate, the actual situation can be appropriate adjustment), with the direct insertion of the sort, so that can be guaranteed to maximize the use of these two sorts of advantages to complete the sorting work.

Optimization Four: Optimizing recursive operations:

As you know, recursion has a certain effect on performance, the QSort() function has two recursive operations at its tail, and if the sequence to be sorted is extremely unbalanced (that is, we are not in the middle when we select the pivot), then the depth of recursion is approaching N, not the log₂n of the balance This is not just a matter of speed.

We also know that recursion is achieved through the stack, the size of the stack is very limited, each recursive call will cost a certain amount of stack space, the more parameters of the function, each time the recursive cost more space, so if you can reduce the team rules, will greatly improve performance.

It is said that recursion can be transformed into a circular realization. We are here to use loops to optimize recursion. (for recursion and circulation you can refer to the discussion in the inside, "all recursion can be changed into a loop?") 》)

We QSort() optimize the function trailing recursion:

Specify array length threshold # Max_length_insert_sort 7function QSort (Array & $arr, $low, $high) {  //when $low >= $high Indicates that no further grouping is possible, the correct result has been obtained  if (($high-$low) > Max_length_insert_sort) {while    ($low < $high) {      $pivot = Partition ($arr, $low, $high); $arr[$low ... $high] in Split, calculate the pivot value      QSort ($arr, $low, $pivot-1);//recursively sort the low sub-table ($pivot left-hand record)      $low = $pivot + 1;    }  }else{    //Direct Insert sort    insertsort ($arr);}  }

In the above, we use loops to replace recursion, reducing the previous general recursion amount. The result is the same, but a loop rather than a recursive approach reduces the depth of the stack, improving overall performance.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.