Internal Sorting Algorithms (I): Exchange Sorting (Bubble Sort, Quick Sort)

Last Update:2014-08-29 Source: Internet

Author: User

Tags benchmark

This is the first article in my blog series "Internal sorting algorithms". Sorting arranges the records of a file in ascending (or descending) order of their keys. Internal sorting means that the entire file is processed in memory, so the sort involves no exchange of data between internal and external storage (external sorting is the opposite case).

Internal sorting methods can be divided into five categories by strategy: insertion sort, selection sort, exchange sort, merge sort, and distribution sort. The file to be sorted is stored in a sequential list (or directly in an array); other storage structures, such as linked lists, can also be used.

In this series I analyze each sorting algorithm on the basis of its code. The analysis covers the idea behind the algorithm, its time complexity, its space complexity, and its stability. A sorting method is stable if, whenever the file contains several records with the same key, the relative order of those records is unchanged after sorting.

This article covers exchange sorting. The basic idea is to compare the keys of the records to be sorted in pairs and to exchange any two records that are out of order, until no out-of-order pair remains. The main sorting methods built on this idea are bubble sort and quick sort.

First, bubble sort. The algorithm is as follows:

Step 1: Initialize. The file of records is R, and R[1..n] is the unordered area.

Step 2: Scan. Compare the weights of adjacent bubbles from the bottom of the unordered area upward. If a lighter bubble is found below a heavier one, exchange their positions. That is, compare (R[n], R[n-1]), (R[n-1], R[n-2]), …, (R[2], R[1]) in turn; for each pair (R[j+1], R[j]), if R[j+1].key < R[j].key, exchange R[j+1] and R[j].

Step 3: Judge. Determine whether the file is ordered. If it is, the algorithm ends; otherwise, return to step 2 for the next scan.

Bubble sort features:

During the i-th scan, R[1..i-1] and R[i..n] are the current ordered and unordered areas, respectively. The scan runs from the bottom of the unordered area to its top. When it completes, the lightest bubble in the area has floated to the top position R[i], so R[1..i] becomes the new ordered area. For example, when the first scan completes, the lightest bubble has floated to R[1].
Each scan can sink the heaviest bubble by only one position.
The whole bubble sort needs at most n-1 scans. Each scan adds one bubble to the ordered area, so after n-1 scans the ordered area holds n-1 bubbles. Because every bubble in the unordered area is at least as heavy as every bubble in the ordered area, the one remaining bubble is the heaviest and is already in place.
If a scan exchanges no bubbles, every pair in the unordered area already has the lighter bubble above the heavier one, so the sort can terminate after that scan. The algorithm below therefore introduces a Boolean flag, exchange, which is set to false before each scan starts and set to true whenever an exchange occurs. The flag is checked at the end of each scan; if no exchange occurred, the algorithm terminates without further scans. The code is as follows:

    #include <utility>
    using namespace std;

    #define MaxSize 100
    typedef int KeyType;

    void BubbleSort(KeyType R[], int n)
    {   // Bubble sort: R[1..n] is the file to be sorted, scanned bottom-up
        bool exchange;                          // the exchange flag
        for (int i = 1; i < n; i++) {           // at most n-1 scans
            exchange = false;                   // no exchange yet in this scan
            for (int j = n - 1; j >= i; j--) {  // scan the unordered area R[i..n] bottom-up
                if (R[j + 1] < R[j]) {          // lighter bubble below a heavier one
                    swap(R[j + 1], R[j]);
                    exchange = true;
                }
            }
            if (!exchange) return;              // no exchange: the file is ordered
        }
    }

The step-by-step process of bubble sort is not traced here; readers can easily simulate it on paper. Analysis of bubble sort:

Space complexity: The auxiliary space required is O(1), so bubble sort is an in-place sort (a sort is called in-place if the auxiliary space it needs does not depend on the problem size n, that is, if it is O(1)).

Worst-case time complexity: If the initial file is in reverse order, n-1 scans are needed, and the i-th scan (1 ≤ i ≤ n-1) makes n-i key comparisons. The number of comparisons then reaches its maximum: n(n-1)/2 = O(n^2).
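The comparison count in the worst-case analysis can be checked with a small instrumented version of the algorithm. The following is only a sketch: the function name `BubbleSortCountingComparisons` is hypothetical, and it assumes a 0-indexed `std::vector<int>` rather than the article's R[1..n].

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the article): bubble-sorts v in ascending
// order with bottom-up scans and the early-exit flag, and returns the
// number of key comparisons that were made.
std::size_t BubbleSortCountingComparisons(std::vector<int>& v) {
    std::size_t comparisons = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {      // at most n-1 scans
        bool exchange = false;
        for (std::size_t j = n - 1; j > i; --j) {  // scan v[i..n-1] bottom-up
            ++comparisons;
            if (v[j] < v[j - 1]) {                 // lighter bubble below a heavier one
                std::swap(v[j], v[j - 1]);
                exchange = true;
            }
        }
        if (!exchange) break;                      // no exchange: already ordered
    }
    return comparisons;
}
```

For the reversed input {6, 5, 4, 3, 2, 1} the function returns 15, which equals n(n-1)/2 for n = 6. For an input that is already sorted, the first scan makes n-1 comparisons, finds no exchange, and the sort stops.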
Stability: Bubble sort is stable, because only adjacent records are exchanged, and only when the lower record has a strictly smaller key.

Next, quick sort. The algorithm is as follows (let the unordered area be R[low..high]):

Step 1: Divide. Select a record in R[low..high] as the pivot and use it to split the current unordered area into two smaller subintervals, R[low..pivotpos-1] on the left and R[pivotpos+1..high] on the right, such that the keys of all records in the left subinterval are less than or equal to the pivot's key pivot.key, and the keys of all records in the right subinterval are greater than or equal to pivot.key. The pivot record then sits at its final position pivotpos and takes no part in later sorting.

Step 2: Conquer. Quick-sort the left and right subintervals R[low..pivotpos-1] and R[pivotpos+1..high] by recursive calls.

Step 3: Combine. When the two recursive calls of the conquer step return, the left and right subintervals are already ordered, so quick sort needs no work in the combine step; it can be regarded as a null operation.

Algorithm features:

The algorithm adopts a divide-and-conquer strategy: it breaks the original problem into several subproblems that are smaller but have the same structure as the original, solves these subproblems recursively, and then combines their solutions into a solution of the original problem (this approach is also called the divide-and-conquer method).
The whole execution of quick sort can be described by a recursion tree; running quick sort is equivalent to a preorder traversal of that tree.
Median-of-three rule: in the current range, compare the keys at the beginning, end, and middle of the range, and take the record with the median of the three keys as the pivot. Before partitioning starts, the pivot record is exchanged with the first record of the range; the partitioning that follows is identical to the Partition algorithm.
Random pivot: the recommended way to select the pivot is to use a random function to generate a random number k (low ≤ k ≤ high) and take R[k] as the pivot. This is equivalent to forcing the records in R[low..high] to be randomly distributed. Quick sort that uses this method is generally called randomized quick sort.
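The median-of-three rule is described in the text without code. A minimal sketch follows, assuming the same `KeyType` as the article's code; the function name `MedianOfThreeToFront` is hypothetical. It only moves the chosen pivot to the front of the range, so that a partition routine that takes the first record as the pivot can be used unchanged.

```cpp
#include <utility>

typedef int KeyType;

// Hypothetical helper (not from the article): finds which of R[low],
// R[mid], R[high] holds the median key and exchanges that record with
// R[low], the first record of the range.
void MedianOfThreeToFront(KeyType R[], int low, int high) {
    int mid = low + (high - low) / 2;
    int m;  // index of the record holding the median key
    if ((R[low] <= R[mid] && R[mid] <= R[high]) ||
        (R[high] <= R[mid] && R[mid] <= R[low])) {
        m = mid;
    } else if ((R[mid] <= R[low] && R[low] <= R[high]) ||
               (R[high] <= R[low] && R[low] <= R[mid])) {
        m = low;
    } else {
        m = high;
    }
    std::swap(R[low], R[m]);  // the pivot is now the first record of the range
}
```

Calling this helper before partitioning R[low..high] around its first record means the pivot is neither the smallest nor the largest of the three sampled keys, which avoids the most uneven splits on input that is already ordered.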
The code for the Quick Sort Algorithm is as follows:

    #include <cstdlib>
    #include <ctime>
    #include <utility>
    using namespace std;

    #define MaxSize 100
    typedef int KeyType;

    int Partition(KeyType R[], int i, int j)
    {   // Partition R[i..j] and return the position of the pivot record
        KeyType pivot = R[i];                 // use the first record of the range as the pivot
        while (i < j) {                       // scan alternately from both ends toward the middle
            while (i < j && R[j] >= pivot)    // the pivot is logically at position i
                j--;                          // scan right to left for the first key less than pivot
            if (i < j)
                R[i++] = R[j];                // equivalent to exchanging R[i] and R[j]; then i advances by 1
            while (i < j && R[i] <= pivot)    // the pivot is logically at position j
                i++;                          // scan left to right for the first key greater than pivot
            if (i < j)
                R[j--] = R[i];                // equivalent to exchanging R[i] and R[j]; then j decreases by 1
        }
        R[i] = pivot;                         // the pivot record is now in its final position
        return i;                             // the position of the pivot
    }

    int RandomPartition(KeyType R[], int i, int j)
    {   // Randomly partition R[i..j] and return the position of the pivot record
        static bool seeded = false;           // seed the generator only once
        if (!seeded) { srand((unsigned)time(NULL)); seeded = true; }
        int k = i + rand() % (j - i + 1);     // random number k with i <= k <= j
        swap(R[k], R[i]);                     // use R[k] as the pivot
        return Partition(R, i, j);            // the remaining steps are the same as Partition
    }

    void QuickSort(KeyType R[], int low, int high)
    {   // Quick-sort R[low..high]; Partition may be used instead of RandomPartition
        if (low < high) {
            int pivotpos = RandomPartition(R, low, high);
            QuickSort(R, low, pivotpos - 1);  // sort the left subinterval
            QuickSort(R, pivotpos + 1, high); // sort the right subinterval
        }
    }

Quick sort algorithm analysis:
Best-case time complexity: In the best case, the pivot chosen in each partition is the median record of the current unordered area, so the left and right subintervals have roughly equal length. The total number of key comparisons is O(n lg n). This is easiest to see with the recursion tree: because the two subintervals are roughly the same size after each partition, the height of the tree is O(lg n), and the partitions of all nodes on one level of the tree together need no more than n key comparisons. The total number of key comparisons for the whole sort is therefore C(n) = O(n lg n).
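The recursion-tree argument can also be written as a recurrence. The derivation below is a sketch under simplifying assumptions that are not in the original text: n is a power of 2, every partition splits the range into exact halves, and C(1) = 0. It uses the article's bound of at most n key comparisons per level.

```latex
\begin{align*}
C(n) &\le 2\,C(n/2) + n \\
     &\le 4\,C(n/4) + 2n \\
     &\;\;\vdots \\
     &\le 2^{k}\,C(n/2^{k}) + kn \\
     &= n\,C(1) + n \lg n \qquad (k = \lg n) \\
     &= O(n \lg n).
\end{align*}
```

Each line corresponds to one more level of the recursion tree: the tree has lg n levels, and each level contributes at most n comparisons.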
             

Space complexity: Quick sort needs a system stack to implement recursion. If every partition is even, the height of the recursion tree is O(lg n), so the stack space required is O(lg n). In the worst case, the height of the recursion tree is O(n), and the required stack space is O(n).
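The dependence of stack depth on the input can be observed directly. The sketch below assumes the first-record partition scheme used in this article; the function `QuickSortDepth` is a hypothetical instrumented variant that returns the deepest level of nested calls instead of being a plain sort.

```cpp
#include <algorithm>

typedef int KeyType;

// Partition R[i..j] around its first record (same scheme as the article's
// Partition) and return the final position of the pivot.
int Partition(KeyType R[], int i, int j) {
    KeyType pivot = R[i];
    while (i < j) {
        while (i < j && R[j] >= pivot) j--;
        if (i < j) R[i++] = R[j];
        while (i < j && R[i] <= pivot) i++;
        if (i < j) R[j--] = R[i];
    }
    R[i] = pivot;
    return i;
}

// Hypothetical instrumented quick sort: sorts R[low..high] and returns the
// maximum nesting depth of calls that actually partitioned a range.
int QuickSortDepth(KeyType R[], int low, int high) {
    if (low >= high) return 0;  // ranges of 0 or 1 records are not partitioned
    int pivotpos = Partition(R, low, high);
    int left = QuickSortDepth(R, low, pivotpos - 1);
    int right = QuickSortDepth(R, pivotpos + 1, high);
    return 1 + std::max(left, right);
}
```

On the already sorted array {1, 2, 3, 4, 5, 6, 7, 8} the function returns 7: every partition leaves an empty left subinterval, so the depth grows linearly with n. This matches the O(n) worst case described above.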
             

Stability: Quick sort is unstable. For example, sorting [2, 2, 1] reverses the relative order of the two records with key 2.
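The [2, 2, 1] example can be made visible by attaching a tag to each record. The following is a sketch only: the `Record` type, the tags, and all function names are hypothetical, and the partition uses the first record as the pivot, as in this article.

```cpp
#include <string>

// Hypothetical record type: the sort compares only key; tag marks the
// original position so that equal keys can be told apart.
struct Record {
    int key;
    char tag;
};

// Partition R[i..j] around its first record, comparing keys only.
int PartitionRecords(Record R[], int i, int j) {
    Record pivot = R[i];
    while (i < j) {
        while (i < j && R[j].key >= pivot.key) j--;
        if (i < j) R[i++] = R[j];
        while (i < j && R[i].key <= pivot.key) i++;
        if (i < j) R[j--] = R[i];
    }
    R[i] = pivot;
    return i;
}

void QuickSortRecords(Record R[], int low, int high) {
    if (low < high) {
        int pivotpos = PartitionRecords(R, low, high);
        QuickSortRecords(R, low, pivotpos - 1);
        QuickSortRecords(R, pivotpos + 1, high);
    }
}

// Sorts the records (2,'a'), (2,'b'), (1,'c') and returns their tags in
// sorted order.
std::string SortedTagsDemo() {
    Record R[] = {{2, 'a'}, {2, 'b'}, {1, 'c'}};
    QuickSortRecords(R, 0, 2);
    std::string tags;
    for (const Record& r : R) tags += r.tag;
    return tags;
}
```

`SortedTagsDemo()` returns "cba": the record with key 1 moves to the front, and the two records with key 2 end up in the order b, a, the reverse of their original order.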

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
