Internal Sorting Algorithms (I): Exchange Sorts (Bubble Sort and Quicksort)

Last Update:2014-08-26 Source: Internet

Author: User



This is the first article in my blog series "Internal Sorting Algorithms". Sorting arranges the records of a file in ascending (or descending) order of their keys. Internal sorting means that the entire file is held in memory while it is sorted, so no data is exchanged between main memory and external storage. External sorting is the opposite case: the file does not fit in memory, and such exchanges are required.

Internal sorting methods fall into five categories by strategy: insertion sort, selection sort, exchange sort, merge sort, and distribution sort. In this series the file to be sorted is stored in a sequential list (in practice, an array or vector); a linked list can also serve as the storage structure.

In each post of this series I present a sorting algorithm and then analyze it. The analysis covers the idea behind the algorithm, its time complexity, its space complexity, and its stability. A sorting method is stable if, whenever the file contains several records with equal keys, the relative order of those records is unchanged after sorting.

This article covers exchange sorts. The basic idea of an exchange sort is to compare the keys of the records pairwise and swap any two records that are out of order, repeating until no out-of-order pair remains. The two main sorts built on this idea are bubble sort and quicksort.
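To make "no out-of-order pair remains" concrete, the sketch below counts out-of-order pairs (usually called inversions) and checks that a sort which only swaps adjacent out-of-order records performs exactly one swap per inversion. The function names count_inversions and adjacent_swap_sort are my own, and a 0-indexed std::vector is assumed instead of the R[1..n] convention used in the rest of this article.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Count pairs (a, b) with a < b and v[a] > v[b], i.e. pairs in reverse order.
long count_inversions(const std::vector<int>& v) {
    long count = 0;
    for (std::size_t a = 0; a < v.size(); ++a)
        for (std::size_t b = a + 1; b < v.size(); ++b)
            if (v[a] > v[b]) ++count;
    return count;
}

// Repeatedly swap adjacent out-of-order records until none remain.
// Returns the number of swaps performed.
long adjacent_swap_sort(std::vector<int>& v) {
    long swaps = 0;
    bool exchanged = true;
    while (exchanged) {
        exchanged = false;
        for (std::size_t k = 0; k + 1 < v.size(); ++k) {
            if (v[k + 1] < v[k]) {
                std::swap(v[k], v[k + 1]);  // removes exactly one inversion
                ++swaps;
                exchanged = true;
            }
        }
    }
    return swaps;
}
```

For the input {3, 1, 2}, count_inversions reports 2 and adjacent_swap_sort performs 2 swaps, after which the inversion count is 0.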

We start with bubble sort. The algorithm is as follows:

Step 1: Initialize. The file of records R[1..n] is the initial unordered area.

Step 2: Scan. Think of each record as a bubble whose weight is its key. Scanning from the bottom of the unordered area to the top, compare the weights of each pair of adjacent bubbles; whenever a lighter bubble lies below a heavier one, swap them. That is, compare (R[n], R[n-1]), (R[n-1], R[n-2]), ..., (R[2], R[1]) in turn; for each pair (R[j+1], R[j]), if R[j+1].key < R[j].key, swap the contents of R[j+1] and R[j].

Step 3: Check. Determine whether the file is now ordered. If it is, the algorithm ends; otherwise, return to step 2 for the next scan.

Features of bubble sort:

At the start of the i-th pass, R[1..i-1] and R[i..n] are the current ordered and unordered areas, respectively. The pass scans from the bottom of the unordered area to its top; when it completes, the lightest bubble in that area has floated to the top position R[i], so R[1..i] becomes the new ordered area. For example, when the first pass completes, the lightest bubble of all has floated to the top of the whole interval, R[1].
Each pass is guaranteed to put only one bubble into its final position: the lightest bubble of the current unordered area.
The whole process needs at most n-1 passes. Each pass adds one bubble to the ordered area, so after n-1 passes the ordered area holds n-1 bubbles. Every bubble in the unordered area is always at least as heavy as every bubble in the ordered area, so the one remaining bubble is the heaviest and is already in place.
If a pass performs no swap, every pair of adjacent bubbles in the unordered area is already in order (lighter above heavier), so the whole file is sorted and the algorithm can stop after that pass. The code below therefore uses a Boolean flag, exchange, which is set to false before each pass and set to true whenever a swap occurs. The flag is checked at the end of each pass; if no swap occurred, the algorithm terminates without starting another pass.

The code is as follows:

    #include <cstdio>
    #include <cstdlib>
    #include <algorithm>
    using namespace std;

    #define maxsize 100
    typedef int keytype;

    // Bubble sort: R[1..n] is the file to be sorted; each pass scans R bottom-up.
    void bubblesort(keytype R[], int n) {
        bool exchange;                          // swap flag
        for (int i = 1; i < n; i++) {           // at most n-1 passes
            exchange = false;                   // clear the flag before each pass
            for (int j = n - 1; j >= i; j--) {  // scan the unordered area R[i..n] bottom-up
                if (R[j + 1] < R[j]) {
                    swap(R[j], R[j + 1]);
                    exchange = true;            // a swap occurred in this pass
                }
            }
            if (!exchange)                      // no swap in this pass: stop early
                return;
        }
    }

    int main() {
        int n, R[maxsize];
        printf("Enter the array size:\n");
        // R is used from index 1, so at most maxsize-1 elements fit.
        if (scanf("%d", &n) != 1 || n < 1 || n >= maxsize)
            return 1;
        printf("Enter the array elements:\n");
        for (int i = 1; i <= n; i++)
            scanf("%d", &R[i]);
        printf("Before sorting:\n");
        for (int i = 1; i <= n; i++)
            printf("%d ", R[i]);
        printf("\n");
        bubblesort(R, n);
        printf("After sorting:\n");
        for (int i = 1; i <= n; i++)
            printf("%d ", R[i]);
        printf("\n");
        return 0;
    }

I do not trace bubble sort step by step here; readers can easily simulate it on paper. The analysis of bubble sort follows:

Space complexity: The auxiliary space required is O(1), so bubble sort is an in-place sort. (A sorting algorithm is called in-place when the auxiliary space it needs does not depend on the problem size n, that is, when it is O(1).)

Worst-case time complexity: If the initial file is in reverse order, n-1 passes are required, and the i-th pass (1 ≤ i ≤ n-1) performs n-i key comparisons. The number of comparisons then reaches its maximum: n(n-1)/2 = O(n^2).
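The comparison count can be checked with a small experiment. The sketch below is a variant of bubble sort that returns the number of key comparisons it made; the name bubble_sort_count is my own, and a 0-indexed std::vector is assumed rather than R[1..n].

```cpp
#include <utility>
#include <vector>

// Bubble sort with the early-exit flag; returns the number of key comparisons.
long bubble_sort_count(std::vector<int>& r) {
    long comparisons = 0;
    const int n = static_cast<int>(r.size());
    for (int i = 0; i < n - 1; ++i) {        // at most n-1 passes
        bool exchange = false;
        for (int j = n - 2; j >= i; --j) {   // bottom-up scan of r[i..n-1]
            ++comparisons;
            if (r[j + 1] < r[j]) {
                std::swap(r[j], r[j + 1]);
                exchange = true;
            }
        }
        if (!exchange) break;                // no swap in this pass: already sorted
    }
    return comparisons;
}
```

For the reversed input {6, 5, 4, 3, 2, 1} this returns 15, which is n(n-1)/2 for n = 6. For an input that is already sorted it returns n-1, because the first pass makes no swap and the flag ends the sort.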

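Stability: Bubble sort is stable. It only swaps adjacent records, and only when the lower one has a strictly smaller key, so records with equal keys never pass each other.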
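Stability is easy to observe by attaching a tag to each record. In the sketch below, the Record type, its tag field, and the helper names are hypothetical and exist only for this illustration; the sort compares keys only, exactly as in the scan described earlier.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Record {
    int key;
    char tag;  // marks the original position; never compared
};

// Bottom-up bubble sort on keys only, with a strict "<" comparison.
void bubble_sort_records(std::vector<Record>& r) {
    const int n = static_cast<int>(r.size());
    for (int i = 0; i < n - 1; ++i) {
        bool exchange = false;
        for (int j = n - 2; j >= i; --j) {
            if (r[j + 1].key < r[j].key) {   // equal keys are never swapped
                std::swap(r[j], r[j + 1]);
                exchange = true;
            }
        }
        if (!exchange) break;
    }
}

// Concatenate the tags in their current order.
std::string tags(const std::vector<Record>& r) {
    std::string out;
    for (const Record& rec : r) out += rec.tag;
    return out;
}
```

Sorting the records (2,a), (1,b), (2,c), (1,d) gives the tag order "bdac": b still precedes d among the 1s, and a still precedes c among the 2s.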

Next comes quicksort. The algorithm is as follows (let the current unordered area be R[low..high]):

Step 1: Divide. Select a record in R[low..high] as the pivot and use it to split the current unordered area into two smaller subintervals, R[low..pivotpos-1] on the left and R[pivotpos+1..high] on the right, such that every key in the left subinterval is less than or equal to the pivot's key (call it pivot.key) and every key in the right subinterval is greater than or equal to pivot.key. The pivot record itself then sits at its final position, pivotpos, and takes no further part in the sorting.

Step 2: Conquer. Quicksort the left and right subintervals R[low..pivotpos-1] and R[pivotpos+1..high] by recursive calls.

Step 3: Combine. When the two recursive calls of the conquer step return, the left and right subintervals are already ordered, so for quicksort the combine step has nothing to do and can be regarded as a null operation.

Features of quicksort:

The algorithm uses a divide-and-conquer strategy: split the original problem into several smaller subproblems with the same structure as the original, solve the subproblems recursively, and then combine their solutions into a solution of the original problem.
The whole execution of quicksort can be described by a recursion tree: running quicksort is equivalent to a preorder traversal of that tree. The following is an example:

[Figure: recursion tree for quicksorting a sample sequence of integers. The image and most of the sample values are missing from this copy.]

In the recursion tree, the square brackets to the left of each node show the interval to be sorted, the key inside the node is the pivot key of that partition, and the parentheses to the right of each branch node show the result returned once the interval on its left has been sorted. The algorithm executes in the order of the arrows in the tree. In fact, if each partition is regarded as visiting a node, running quicksort is equivalent to traversing its recursion tree in preorder.
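The preorder claim can be observed by recording each partition in the order it happens. The sketch below is only an illustration: the Visit struct and the function names are my own, the vector is 0-indexed, and the sample sequence {3, 1, 2, 5, 4} is one I picked, not the one from the figure.

```cpp
#include <vector>

// Partition r[i..j] around its first record; return the pivot's final position.
int partition_first(std::vector<int>& r, int i, int j) {
    int pivot = r[i];
    while (i < j) {
        while (i < j && r[j] >= pivot) --j;
        if (i < j) r[i++] = r[j];
        while (i < j && r[i] <= pivot) ++i;
        if (i < j) r[j--] = r[i];
    }
    r[i] = pivot;
    return i;
}

struct Visit {
    int low, high, pivot;  // interval that was partitioned and its pivot key
};

// Quicksort that logs every partition (node visit) before recursing.
void quicksort_traced(std::vector<int>& r, int low, int high,
                      std::vector<Visit>& visits) {
    if (low >= high) return;
    int pivot_key = r[low];
    int pos = partition_first(r, low, high);
    visits.push_back({low, high, pivot_key});  // visit the node first
    quicksort_traced(r, low, pos - 1, visits); // then the left subtree
    quicksort_traced(r, pos + 1, high, visits);// then the right subtree
}
```

For {3, 1, 2, 5, 4} the logged pivots are 3, 2, 5: the root partition on [0..4], then the left child on [0..1], then the right child on [3..4], which is the preorder sequence of the recursion tree.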

The partition algorithm used in step 1 works as follows.

Step 1: Initialize. Set two pointers i and j to the lower and upper bounds of the interval, i = low and j = high. Take the first record of the unordered area, R[i] (that is, R[low]), as the pivot and save it in the variable pivot.

Step 2: Scan. Move j leftward from high until it finds the first record R[j] whose key is less than pivot.key, and move R[j] to the position indicated by i. This is equivalent to swapping R[j] with the pivot at R[i]: the record with the smaller key ends up to the left of the pivot, and position j now stands in for the pivot. Next, move i rightward from position i+1 until it finds the first record R[i] whose key is greater than pivot.key, and move R[i] to the position indicated by j. This is equivalent to swapping R[i] with the pivot at R[j]: the record with the larger key ends up to the right of the pivot, and position i now stands in for the pivot. Then j resumes scanning leftward from position j-1. The two scans alternate in this way, closing in from both ends toward the middle, until i = j. At that point i is the final position of the pivot, and storing the pivot there completes one partition.

How the pivot is chosen within the current unordered area is the key factor in the algorithm's performance. Two methods are commonly used:

Median of three: compare the keys of the first, last, and middle records of the current interval and take the record whose key is the median of the three as the pivot. Before partitioning starts, swap the pivot record with the first record of the interval; the partition then proceeds exactly as described above.
Random pivot: use a random function to generate a random number k (low ≤ k ≤ high) and take R[k] as the pivot. This is the better of the two methods, because it is equivalent to forcing the records in R[low..high] into random order. Quicksort with this rule is usually called randomized quicksort.
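The two selection rules can be sketched as small helpers that choose an index and move that record to the front, after which the first-record partition applies unchanged. All names here are my own, the vector is 0-indexed, and std::mt19937 is used for the random rule as one reasonable choice of generator rather than the only one.

```cpp
#include <random>
#include <utility>
#include <vector>

// Index of the median among r[low], r[mid], r[high].
int median_of_three_index(const std::vector<int>& r, int low, int high) {
    int mid = low + (high - low) / 2;
    int a = r[low], b = r[mid], c = r[high];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return low;
    return high;
}

// Uniformly random index k with low <= k <= high.
int random_index(int low, int high, std::mt19937& gen) {
    std::uniform_int_distribution<int> dist(low, high);
    return dist(gen);
}

// Swap the chosen pivot record with the first record of the interval.
void move_pivot_to_front(std::vector<int>& r, int low, int k) {
    std::swap(r[low], r[k]);
}
```

For r = {9, 1, 5, 3, 2} with low = 0 and high = 4, the three candidates are 9, 5, and 2, so median_of_three_index returns 2; after move_pivot_to_front(r, 0, 2) the interval starts with 5.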

The quicksort code is as follows:

    #include <cstdio>
    #include <cstdlib>
    #include <algorithm>
    #include <ctime>
    using namespace std;

    #define maxsize 100
    typedef int keytype;

    // Partition R[i..j] around its first record and return the pivot's final position.
    int partition(keytype R[], int i, int j) {
        keytype pivot = R[i];               // the first record of the interval is the pivot
        while (i < j) {                     // scan alternately from both ends until i == j
            while (i < j && R[j] >= pivot)  // the pivot conceptually sits at position i
                j--;                        // scan right to left for the first key < pivot
            if (i < j)                      // found R[j] with key < pivot
                R[i++] = R[j];              // same effect as swapping R[i] and R[j]; then advance i
            while (i < j && R[i] <= pivot)  // the pivot conceptually sits at position j
                i++;                        // scan left to right for the first key > pivot
            if (i < j)                      // found R[i] with key > pivot
                R[j--] = R[i];              // same effect as swapping R[i] and R[j]; then move j back
        }
        R[i] = pivot;                       // the pivot is now in its final position
        return i;
    }

    // Randomized partition: pick k at random in [i, j] and use R[k] as the pivot.
    // srand() is called once in main() rather than here: reseeding from time() on
    // every call would restart rand() from the same value within the same second.
    int randompartition(keytype R[], int i, int j) {
        int k = i + rand() % (j - i + 1);
        swap(R[k], R[i]);
        return partition(R, i, j);          // the remaining steps are the same as partition()
    }

    // Quicksort R[low..high].
    void quicksort(keytype R[], int low, int high) {
        int pivotpos;                       // position of the pivot after partitioning
        // The source listing is cut off after "if (low"; the body below follows
        // the divide and conquer steps described earlier in this article.
        if (low < high) {
            pivotpos = partition(R, low, high);   // or randompartition(R, low, high)
            quicksort(R, low, pivotpos - 1);
            quicksort(R, pivotpos + 1, high);
        }
    }

    // Driver in the style of the bubble sort example; the original main() is
    // missing from the truncated listing.
    int main() {
        int n, R[maxsize];
        srand((unsigned)time(NULL));        // seed once for randompartition()
        printf("Enter the array size:\n");
        if (scanf("%d", &n) != 1 || n < 1 || n >= maxsize)
            return 1;
        printf("Enter the array elements:\n");
        for (int i = 1; i <= n; i++)
            scanf("%d", &R[i]);
        quicksort(R, 1, n);
        printf("After sorting:\n");
        for (int i = 1; i <= n; i++)
            printf("%d ", R[i]);
        printf("\n");
        return 0;
    }

Analysis of quicksort:

Best-case time complexity: In the best case, the pivot chosen by each partition is the median record of the current unordered area, so the left and right subintervals have roughly equal length. The total number of key comparisons is then O(n lg n). This is easiest to see from the recursion tree: because the two subintervals are roughly equal after every partition, the height of the tree is O(lg n), and the partitions at any one level of the tree need at most n key comparisons in total. The total number of key comparisons for the whole sort is therefore C(n) = O(n lg n).

 
Space complexity: Quicksort needs a system stack to implement the recursion. If every partition is balanced, the recursion tree has height O(lg n), so the stack space needed is O(lg n). In the worst case the recursion tree has height O(n), and the stack space needed is O(n).
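The stack depth can be measured by letting the recursive function return how deeply it nested. This is a minimal sketch with names of my own and a 0-indexed vector; it counts only calls that actually partition (low < high) and uses the first record as the pivot.

```cpp
#include <algorithm>
#include <vector>

// Partition r[i..j] around its first record; return the pivot's final position.
int partition_first(std::vector<int>& r, int i, int j) {
    int pivot = r[i];
    while (i < j) {
        while (i < j && r[j] >= pivot) --j;
        if (i < j) r[i++] = r[j];
        while (i < j && r[i] <= pivot) ++i;
        if (i < j) r[j--] = r[i];
    }
    r[i] = pivot;
    return i;
}

// Sorts r[low..high] and returns the maximum nesting depth of partitioning calls.
int quicksort_depth(std::vector<int>& r, int low, int high) {
    if (low >= high) return 0;
    int pos = partition_first(r, low, high);
    int left = quicksort_depth(r, low, pos - 1);
    int right = quicksort_depth(r, pos + 1, high);
    return 1 + std::max(left, right);
}
```

With the first record as pivot, an already sorted input of n records is a worst case: every partition leaves an empty left subinterval, so the depth is n-1. For {4, 2, 6, 1, 3, 5, 7}, where the first pivot splits the interval evenly, the depth is only 3.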

 
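Stability: Quicksort is not stable. For example, take [2, 2, 1]: partitioning on the first 2 moves 1 to the front and places the pivot at the end, which gives [1, 2, 2] with the two 2s in swapped order.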
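The [2, 2, 1] example can be replayed with tagged records. The Record type, its tag field, and the function names below are hypothetical and serve only this illustration; the partition is the first-record scheme described in this article, comparing keys only.

```cpp
#include <string>
#include <vector>

struct Record {
    int key;
    char tag;  // marks the original position; never compared
};

// Partition r[i..j] on keys around its first record.
int partition_records(std::vector<Record>& r, int i, int j) {
    Record pivot = r[i];
    while (i < j) {
        while (i < j && r[j].key >= pivot.key) --j;
        if (i < j) r[i++] = r[j];
        while (i < j && r[i].key <= pivot.key) ++i;
        if (i < j) r[j--] = r[i];
    }
    r[i] = pivot;
    return i;
}

void quicksort_records(std::vector<Record>& r, int low, int high) {
    if (low >= high) return;
    int pos = partition_records(r, low, high);
    quicksort_records(r, low, pos - 1);
    quicksort_records(r, pos + 1, high);
}

// Concatenate the tags in their current order.
std::string tags(const std::vector<Record>& r) {
    std::string out;
    for (const Record& rec : r) out += rec.tag;
    return out;
}
```

Sorting (2,a), (2,b), (1,c) yields the tag order "cba": the keys are sorted, but record b now precedes record a even though both have key 2.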


 


 
 
  
 
  

 
  

 


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
