Algorithms and sorting (6)

Source: Internet
Author: User
Tags benchmark sorts

Quick Sort

1. Algorithm idea

Quicksort is a partition-exchange sort proposed by C. A. R. Hoare in 1962. It uses a divide-and-conquer strategy, usually called the divide-and-conquer method.


(1) Basic idea of divide and conquer

The basic idea of divide and conquer is to break the original problem into several subproblems that are smaller in size but have the same structure as the original problem, solve these subproblems recursively, and then combine their solutions into a solution of the original problem.
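
As a minimal sketch of this pattern (not part of the original text), the following C function finds the largest element of an array by splitting the range in half, solving each half recursively, and combining the two answers. The name range_max and the use of a plain int array are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper: largest value in a[low..high], requires low <= high. */
int range_max(const int a[], int low, int high)
{
    assert(low <= high);                      /* the range must be non-empty */
    if (low == high)                          /* smallest subproblem: one element */
        return a[low];
    int mid = low + (high - low) / 2;         /* decompose into two halves */
    int left = range_max(a, low, mid);        /* solve the left half */
    int right = range_max(a, mid + 1, high);  /* solve the right half */
    return left > right ? left : right;       /* combine the two solutions */
}
```

Quicksort follows the same three steps, except that nearly all of its work happens in the decomposition step.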


(2) Basic idea of quicksort

Let the current unordered interval be R[low..high]. The basic idea of quicksort, described with divide and conquer, is as follows:
① Decompose:

Select any record in R[low..high] as the pivot and use it to divide the current unordered interval into two smaller subintervals, a left one R[low..pivotpos-1] and a right one R[pivotpos+1..high], such that the keys of all records in the left subinterval are less than or equal to the key of the pivot record (written pivot.key below) and the keys of all records in the right subinterval are greater than or equal to pivot.key. The pivot record itself then sits at its correct final position (pivotpos) and takes no part in any further sorting.


Note:

The key to partitioning is to determine the position pivotpos of the pivot record. The result of a partition can be expressed simply as (note that pivot = R[pivotpos]):

R[low..pivotpos-1].keys ≤ R[pivotpos].key ≤ R[pivotpos+1..high].keys


where low ≤ pivotpos ≤ high.
② Solve:


Quicksort the left and right subintervals R[low..pivotpos-1] and R[pivotpos+1..high] by recursive calls.
③ Combine:


When the two recursive calls of the "solve" step return, the left and right subintervals are already sorted. For quicksort the "combine" step therefore has nothing to do and can be regarded as a null operation.


2. Quicksort algorithm quicksort


void quicksort(seqlist R, int low, int high)
{ // quicksort R[low..high]
    int pivotpos; // position of the pivot record after partitioning
    if (low < high) { // sort only when the interval holds more than one record
        pivotpos = partition(R, low, high); // partition R[low..high]
        quicksort(R, low, pivotpos - 1); // recursively sort the left subinterval
        quicksort(R, pivotpos + 1, high); // recursively sort the right subinterval
    }
} // quicksort

Note:

To sort the entire file, simply call quicksort(R, 1, n), which sorts R[1..n].

3. Partition algorithm

(1) Simple partitioning method

① Procedure


Step 1 (initialize): Set two pointers i and j whose initial values are the lower and upper bounds of the interval, that is, i = low and j = high. Select the first record R[i] (that is, R[low]) of the unordered interval as the pivot record and save it in the variable pivot.
Step 2: Let j scan leftward from high until it finds the first record R[j] whose key is less than pivot.key, and move R[j] to the position indicated by i. This is equivalent to swapping R[j] with the pivot R[i] (that is, pivot), so that the record with the smaller key moves to the left of the pivot; after the swap, R[j] in effect holds the pivot. Then let i scan rightward from position i+1 until it finds the first record R[i] whose key is greater than pivot.key, and move R[i] to the position indicated by j. This is equivalent to swapping R[i] with the pivot R[j], so that the record with the larger key moves to the right of the pivot; after the swap, R[i] in effect holds the pivot. Then let j scan leftward from position j-1, and so on, alternating the scanning direction and closing in from both ends toward the middle until i = j. At that point i is the final position of the pivot, and storing pivot there completes one partition.


② Partition process

Specific changes during one partition [see animation demonstration]
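
The alternating scan can be traced on a concrete interval. The sketch below is a simplified variant of the procedure in which the records are plain int keys and the array is 0-based; both simplifications, and the name partition_ints, are assumptions made for illustration.

```c
#include <assert.h>

/* Partition a[i..j] around its first key; returns the pivot's final index. */
int partition_ints(int a[], int i, int j)
{
    assert(i <= j);                      /* the interval must be non-empty */
    int pivot = a[i];                    /* Step 1: save the first key as the pivot */
    while (i < j) {
        while (i < j && a[j] >= pivot)   /* scan leftward for a key < pivot */
            j--;
        if (i < j)
            a[i++] = a[j];               /* move it into the free slot at i */
        while (i < j && a[i] <= pivot)   /* scan rightward for a key > pivot */
            i++;
        if (i < j)
            a[j--] = a[i];               /* move it into the free slot at j */
    }
    a[i] = pivot;                        /* i == j: final position of the pivot */
    return i;
}
```

Applied to the keys 49 38 65 97 76 13 27 49, this returns index 3 and leaves the array as 27 38 13 49 76 97 65 49.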


③ Partition algorithm:


int partition(seqlist R, int i, int j)
{ // called as partition(R, low, high); partitions R[low..high]
  // and returns the position of the pivot record
    rectype pivot = R[i]; // use the first record of the interval as the pivot
    while (i < j) { // scan from both ends of the interval toward the middle until i == j
        while (i < j && R[j].key >= pivot.key) // the pivot is in effect at position i
            j--; // scan right to left for the first record R[j] whose key is less than pivot.key
        if (i < j) // found R[j] with R[j].key < pivot.key
            R[i++] = R[j]; // equivalent to swapping R[i] and R[j]; afterwards i advances by 1
        while (i < j && R[i].key <= pivot.key) // the pivot is in effect at position j
            i++; // scan left to right for the first record R[i] whose key is greater than pivot.key
        if (i < j) // found R[i] with R[i].key > pivot.key
            R[j--] = R[i]; // equivalent to swapping R[i] and R[j]; afterwards j decreases by 1
    } // end while
    R[i] = pivot; // the pivot record is now in its final position
    return i;
} // partition
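
The routines in sections 2 and 3 leave the types rectype and seqlist undefined. One way to make them compile, offered as a sketch under assumptions rather than the original definitions, is to let rectype be a struct with an integer key and let seqlist point to an array whose slot 0 is unused, so that quicksort(R, 1, n) sorts R[1..n] as described. The otherinfo field is hypothetical.

```c
#include <assert.h>

typedef struct {
    int key;        /* sort key */
    int otherinfo;  /* hypothetical payload carried along with the key */
} rectype;

typedef rectype *seqlist;   /* assumed: records live in R[1..n], R[0] is unused */

int partition(seqlist R, int i, int j)
{
    assert(i <= j);
    rectype pivot = R[i];                         /* first record is the pivot */
    while (i < j) {
        while (i < j && R[j].key >= pivot.key) j--;
        if (i < j) R[i++] = R[j];
        while (i < j && R[i].key <= pivot.key) i++;
        if (i < j) R[j--] = R[i];
    }
    R[i] = pivot;                                 /* pivot reaches its final slot */
    return i;
}

void quicksort(seqlist R, int low, int high)
{
    if (low < high) {
        int pivotpos = partition(R, low, high);
        quicksort(R, low, pivotpos - 1);          /* left subinterval */
        quicksort(R, pivotpos + 1, high);         /* right subinterval */
    }
}
```

With an array declared as rectype R[n + 1], the call quicksort(R, 1, n) leaves R[1..n] in ascending key order, and each record keeps its otherinfo.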

4. Execution process of quicksort
The entire execution of quicksort can be described by a recursion tree.


Analysis:
(1) The enveloping line with arrows in the recursion tree shows the route of recursive execution.
(2) The square brackets to the left of each node in the recursion tree show the interval to be sorted, and the key inside the node is the pivot key of the partition.
Note:
The subinterval of a leaf node contains only one key and needs no partitioning, so a leaf node holds no pivot key.
(3) The left and right subintervals produced by a partition are written in the square brackets to the left of the node's left and right children, respectively.
[Example] The square brackets [49, 38, 65, 97, 76, 13, 27, 49] to the left of the root node show the keys to be sorted, and the 49 inside the root node is the key of the selected pivot record. The result of the partition is [27, 38, 13] 49 [76, 97, 65, 49], and its left and right subintervals are written to the left of the root's two children.
(4) The contents of the parentheses to the right of each branch node show the result returned when sorting of the interval on the node's left has finished. It is the key sequence obtained by sorting the intervals of the left and right children and placing their sorted results before and after the key of the branch node.
[Example] Sorting the intervals of the left and right children of branch node 76 gives (49, 65) and (97), respectively; placing them before and after 76 gives (49, 65, 76, 97), which is the result of sorting the interval on the left of node 76.
(5) The execution order of the algorithm is the order of the arrows in the recursion tree. In fact, if a partition operation is regarded as visiting a node, the execution of quicksort is equivalent to a preorder traversal of its recursion tree.
Note:
Any recursive algorithm can use a recursion tree to describe its execution.
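
The preorder claim in (5) can be checked by logging the pivot key each time a partition starts. The following is a sketch on plain int keys with a 0-based array; visit_log, visit_count, MAX_LOG, and the function names are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_LOG 64

static int visit_log[MAX_LOG];  /* pivot keys in the order the partitions happen */
static int visit_count = 0;

static int partition_ints(int a[], int i, int j)
{
    int pivot = a[i];
    while (i < j) {
        while (i < j && a[j] >= pivot) j--;
        if (i < j) a[i++] = a[j];
        while (i < j && a[i] <= pivot) i++;
        if (i < j) a[j--] = a[i];
    }
    a[i] = pivot;
    return i;
}

void quicksort_logged(int a[], int low, int high)
{
    if (low < high) {                            /* leaf intervals are not partitioned */
        assert(visit_count < MAX_LOG);
        visit_log[visit_count++] = a[low];       /* "visit" the node: record its pivot key */
        int pivotpos = partition_ints(a, low, high);
        quicksort_logged(a, low, pivotpos - 1);  /* then the left subtree */
        quicksort_logged(a, pivotpos + 1, high); /* then the right subtree */
    }
}
```

For the keys 49 38 65 97 76 13 27 49 the log comes out as 49, 27, 76, 49: the root, its left child, its right child, and then that child's left child.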

5. State of quicksort after each partition
[49 38 65 97 76 13 27 49] // initial keys
[27 38 13] 49 [76 97 65 49] // after the 1st partition, corresponding to level 2 of the recursion tree
[13] 27 [38] 49 [49 65] 76 [97] // after partitioning each unordered interval of the previous level, corresponding to level 3 of the recursion tree
13 27 38 49 49 [65] 76 97 // after partitioning each unordered interval of the previous level, corresponding to level 4 of the recursion tree
13 27 38 49 49 65 76 97 // final sorted result

6. Algorithm analysis
The running time of quicksort is spent mainly on partitioning. Partitioning an interval of length k requires k-1 key comparisons.


(1) Worst-case time complexity
The worst case occurs when the pivot selected by every partition is the record with the smallest (or largest) key in the current unordered interval. The partition then leaves the subinterval to the left of the pivot empty (or the one to the right empty), and the other, non-empty subinterval has just one record fewer than the interval before the partition.
Quicksort therefore needs n-1 partitions. At the start of the i-th partition the interval has length n-i+1 and requires n-i comparisons (1 ≤ i ≤ n-1), so the total number of comparisons reaches its maximum:

Cmax = n(n-1)/2 = O(n^2)

If, as in the partition algorithm given above, the first record of the current unordered interval is always taken as the pivot, then when the records of the file are already in ascending (or descending) order, every partition picks the record with the smallest (or largest) key in the unordered interval as the pivot, and quicksort needs the largest number of comparisons.
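
The count n(n-1)/2 can be confirmed by instrumenting the key comparisons. The sketch below works on plain int keys with a 0-based array; the counter and the helper names ge, le, and count_comparisons are assumptions made for illustration.

```c
#include <assert.h>

static long comparisons = 0;   /* number of key comparisons performed so far */

/* Each helper performs one key comparison and counts it. */
static int ge(int a, int b) { comparisons++; return a >= b; }
static int le(int a, int b) { comparisons++; return a <= b; }

static int partition_counted(int a[], int i, int j)
{
    int pivot = a[i];
    while (i < j) {
        while (i < j && ge(a[j], pivot)) j--;   /* counted only when i < j holds */
        if (i < j) a[i++] = a[j];
        while (i < j && le(a[i], pivot)) i++;
        if (i < j) a[j--] = a[i];
    }
    a[i] = pivot;
    return i;
}

static void quicksort_counted(int a[], int low, int high)
{
    if (low < high) {
        int pivotpos = partition_counted(a, low, high);
        quicksort_counted(a, low, pivotpos - 1);
        quicksort_counted(a, pivotpos + 1, high);
    }
}

/* Sorts a[0..n-1] and returns the number of key comparisons used. */
long count_comparisons(int a[], int n)
{
    assert(n >= 0);
    comparisons = 0;
    quicksort_counted(a, 0, n - 1);
    return comparisons;
}
```

On 100 keys that are already in ascending order the count is 100*99/2 = 4950, while the 8-key example from section 5 needs only 13 comparisons.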


(2) Best-case time complexity

In the best case, the pivot selected by every partition is the "median" record of the current unordered interval, so the left and right unordered subintervals of the pivot have roughly equal lengths. The total number of key comparisons is:
O(n lg n)
Note:

The number of comparisons is easier to analyze with the recursion tree. Because the left and right subintervals have roughly equal lengths after every partition, the height of the recursion tree is O(lg n), and the partitions on each level of the tree require no more than n key comparisons in total. The whole sort therefore requires C(n) = O(n lg n) key comparisons.
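
The same bound can be read off a recurrence. As a sketch, assume n is a power of 2 and that every partition splits the interval into two parts of at most n/2 records using at most n comparisons (a slight overestimate of the k-1 comparisons stated earlier):

```latex
\begin{aligned}
C(n) &\le 2\,C(n/2) + n \\
     &\le 4\,C(n/4) + 2n \\
     &\;\;\vdots \\
     &\le 2^{k}\,C(n/2^{k}) + kn \\
     &= n\,C(1) + n \lg n = n \lg n \qquad (k = \lg n,\ C(1) = 0)
\end{aligned}
```

Hence C(n) = O(n lg n) under these assumptions.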

Because the number of record moves in quicksort does not exceed the number of comparisons, the worst-case time complexity of quicksort is O(n^2) and the best-case time complexity is O(n lg n).


(3) Selecting the pivot key

How the pivot key is selected in the current unordered interval is the key factor in the algorithm's performance.

  ① Median-of-three rule

The median-of-three rule compares the keys at the first, last, and middle positions of the current interval and takes the record holding the median of the three as the pivot. Before partitioning starts, the pivot record is swapped with the first record of the interval; the partitioning that follows is exactly the same as in the partition algorithm above.
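
A minimal sketch of the selection step, assuming plain int keys and a 0-based array; the name median_of_three_to_front is hypothetical. After the call, a[low] holds the median of the three sampled keys, so a partition routine that uses the first record as the pivot can run unchanged.

```c
#include <assert.h>

/* Hypothetical helper: moves the median of a[low], a[mid], a[high] to a[low]. */
void median_of_three_to_front(int a[], int low, int high)
{
    assert(low <= high);
    int mid = low + (high - low) / 2;
    int m;                                   /* index of the median key */
    if ((a[low] <= a[mid] && a[mid] <= a[high]) ||
        (a[high] <= a[mid] && a[mid] <= a[low]))
        m = mid;                             /* the middle key lies between the others */
    else if ((a[mid] <= a[low] && a[low] <= a[high]) ||
             (a[high] <= a[low] && a[low] <= a[mid]))
        m = low;                             /* the first key is already the median */
    else
        m = high;                            /* otherwise the last key is the median */
    int tmp = a[low];                        /* swap the median into the first slot */
    a[low] = a[m];
    a[m] = tmp;
}
```

For the ascending interval 1 2 3 4 5, which is a worst case when the first key is always used, the call moves 3 to the front and the next partition splits the interval evenly.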



  
② Take a random number k between low and high (low ≤ k ≤ high) and use R[k] as the pivot

The best way to select the pivot is to use a random function to generate a random number k between low and high (low ≤ k ≤ high) and use R[k] as the pivot, which in effect forces the records in R[low..high] to be randomly distributed. Quicksort obtained this way is generally called randomized quicksort. For the specific algorithm, [see tutorial materials]
Note:
Randomized quicksort differs only slightly from ordinary quicksort, yet randomization improves the algorithm's performance considerably. In particular, for a file that is initially ordered, the worst case becomes very unlikely. Randomization applies not only to quicksort but also to other algorithms that need their data to be randomly distributed.
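
Since the specific algorithm is not reproduced here, the following is only a sketch of one common way to do it, not the version from the tutorial materials. It assumes plain int keys, a 0-based array, and the standard C rand() function; quicksort_random is a hypothetical name. The randomly chosen record is swapped to the front so that the first-record partition can be reused.

```c
#include <assert.h>
#include <stdlib.h>

static int partition_ints(int a[], int i, int j)
{
    int pivot = a[i];
    while (i < j) {
        while (i < j && a[j] >= pivot) j--;
        if (i < j) a[i++] = a[j];
        while (i < j && a[i] <= pivot) i++;
        if (i < j) a[j--] = a[i];
    }
    a[i] = pivot;
    return i;
}

void quicksort_random(int a[], int low, int high)
{
    if (low < high) {
        /* rand() % range is slightly biased; acceptable for a sketch */
        int k = low + rand() % (high - low + 1);
        assert(low <= k && k <= high);
        int tmp = a[low];                        /* move the chosen record to the front */
        a[low] = a[k];
        a[k] = tmp;
        int pivotpos = partition_ints(a, low, high);
        quicksort_random(a, low, pivotpos - 1);
        quicksort_random(a, pivotpos + 1, high);
    }
}
```

A caller that wants a different sequence of pivots on each run can seed the generator once with srand() before sorting.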

(4) Average time complexity

Although the worst-case time of quicksort is O(n^2), in terms of average performance it is the fastest of the internal sorting algorithms based on key comparison. Its average time complexity is O(n lg n).


(5) Space complexity

Quicksort needs a stack inside the system to implement recursion. If every partition is fairly even, the height of the recursion tree is O(lg n), so the stack space needed for the recursion is O(lg n). In the worst case the height of the recursion tree is O(n), and the stack space needed is O(n).
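
The two cases can be observed by recording how deeply the partitioning calls nest. This is a sketch on plain int keys with a 0-based array; max_depth, quicksort_depth, and measure_depth are hypothetical names, and only calls that actually partition (low < high) are counted.

```c
#include <assert.h>

static int max_depth = 0;   /* deepest nesting of calls that perform a partition */

static int partition_ints(int a[], int i, int j)
{
    int pivot = a[i];
    while (i < j) {
        while (i < j && a[j] >= pivot) j--;
        if (i < j) a[i++] = a[j];
        while (i < j && a[i] <= pivot) i++;
        if (i < j) a[j--] = a[i];
    }
    a[i] = pivot;
    return i;
}

static void quicksort_depth(int a[], int low, int high, int depth)
{
    if (low < high) {
        if (depth > max_depth)
            max_depth = depth;               /* remember the deepest level reached */
        int pivotpos = partition_ints(a, low, high);
        quicksort_depth(a, low, pivotpos - 1, depth + 1);
        quicksort_depth(a, pivotpos + 1, high, depth + 1);
    }
}

/* Sorts a[0..n-1] and returns the deepest partitioning level (0 if none). */
int measure_depth(int a[], int n)
{
    assert(n >= 0);
    max_depth = 0;
    quicksort_depth(a, 0, n - 1, 1);
    return max_depth;
}
```

For 64 keys already in ascending order the measured depth is 63, one level per partition, whereas the 8-key example from section 5 reaches depth 3.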


(6) Stability

Quicksort is not stable. For example, sorting [2, 2, 1] with the algorithm above reverses the relative order of the two records with key 2.
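
To make the reordering visible, the sketch below attaches a tag to each record. The struct layout, the tag field, and the 0-based call are assumptions made for illustration; the partition and quicksort logic follows sections 2 and 3.

```c
#include <assert.h>

typedef struct {
    int key;    /* sort key */
    char tag;   /* hypothetical label recording the original order */
} rectype;

int partition(rectype R[], int i, int j)
{
    assert(i <= j);
    rectype pivot = R[i];
    while (i < j) {
        while (i < j && R[j].key >= pivot.key) j--;
        if (i < j) R[i++] = R[j];
        while (i < j && R[i].key <= pivot.key) i++;
        if (i < j) R[j--] = R[i];
    }
    R[i] = pivot;
    return i;
}

void quicksort(rectype R[], int low, int high)
{
    if (low < high) {
        int pivotpos = partition(R, low, high);
        quicksort(R, low, pivotpos - 1);
        quicksort(R, pivotpos + 1, high);
    }
}
```

Starting from (2,a) (2,b) (1,c), the call quicksort(R, 0, 2) yields (1,c) (2,b) (2,a): the first partition moves the pivot (2,a) to the last position, behind the equal key (2,b).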
