Basic Idea of Bucket Sort
Bucket sort uses a mapping function to divide the array to be sorted into a number of blocks (buckets). In fact, computing the value f(k) in bucket sort plays the same role as partitioning in quicksort: it splits a large amount of data into blocks (buckets) that are already ordered relative to one another. After that, only the small amount of data inside each bucket needs to be sorted with an advanced comparison sort (a comparison sort orders elements by comparing them and exchanging them accordingly).
Suppose that the sequence to be sorted is K = {49, 38, 35, 97, 76, 73, 27, 49}. All the values are positive and no larger than the maximum, 97. Therefore, we can set up about 10 buckets and define the mapping function f(k) = (k * 10) / max(K), truncated to an integer. This yields bucket indices 0 through 10, and only the maximum itself maps to 10. The first key, 49, then goes to bucket 5 (49 * 10 / 97 = 5 after truncation). Place all the keys into their buckets in sequence, and then sort each non-empty bucket, for example with quicksort.
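After distribution, the array to be sorted is split into these buckets: bucket 2 holds {27}, bucket 3 holds {38, 35}, bucket 5 holds {49, 49}, bucket 7 holds {76, 73}, and bucket 10 holds {97}.
Then, you only need to sort these sub-arrays.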
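The example above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names bucket_index, distribute and bucket_sort are hypothetical, the keys are assumed to be small positive integers, and std::sort stands in for whatever comparison sort is used inside each bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: f(k) = k * 10 / max_key with integer truncation.
// Assumes positive integer keys that are no larger than max_key.
int bucket_index(int k, int max_key) {
    assert(k > 0 && k <= max_key);
    return k * 10 / max_key;
}

// Distribute the keys into 11 buckets (indices 0..10).
std::vector<std::vector<int>> distribute(const std::vector<int>& keys) {
    assert(!keys.empty());
    const int max_key = *std::max_element(keys.begin(), keys.end());
    std::vector<std::vector<int>> buckets(11);
    for (int k : keys)
        buckets[bucket_index(k, max_key)].push_back(k);
    return buckets;
}

// Sort each bucket on its own, then concatenate the buckets in index order.
std::vector<int> bucket_sort(const std::vector<int>& keys) {
    std::vector<std::vector<int>> buckets = distribute(keys);
    std::vector<int> sorted;
    sorted.reserve(keys.size());
    for (std::vector<int>& b : buckets) {
        std::sort(b.begin(), b.end());  // stand-in for the per-bucket sort
        sorted.insert(sorted.end(), b.begin(), b.end());
    }
    return sorted;
}
```

Calling bucket_sort on the sequence K from the example distributes the keys exactly as listed above and returns them in ascending order. Because the buckets are already ordered relative to one another, concatenating them after the per-bucket sort is enough.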
Applicability:
The numbers in the array must be positive. If they are not, we can add a value a to every number to make them all positive, and then subtract a again after sorting.
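The offset trick might look like the following sketch. The function name sort_with_offset and the callback parameter sort_positive are hypothetical, and the choice of offset (shifting the minimum to 1) is just one possible assumption; any value a that makes every element positive would do.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical wrapper: shift all values by an offset a so that they become
// positive, run a sort that requires positive input, then shift back.
void sort_with_offset(std::vector<float>& v,
                      void (*sort_positive)(std::vector<float>&)) {
    assert(sort_positive != nullptr);
    if (v.empty()) return;
    const float lo = *std::min_element(v.begin(), v.end());
    // Assumed choice of offset: move the minimum to 1 when it is not positive.
    const float a = (lo <= 0.0f) ? (1.0f - lo) : 0.0f;
    for (float& x : v) x += a;
    sort_positive(v);
    for (float& x : v) x -= a;
}
```

With floating-point data, adding and then subtracting a may not restore every value exactly, so this sketch is best suited to integers or to values where such rounding is acceptable.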
Time Complexity:
(1) Loop over the keys and compute the bucket mapping function for each one. The time complexity of this step is O(N).
(2) Use an advanced comparison sort to sort the data in each bucket. For N items to be sorted and M buckets, each bucket holds about N/M items on average, so the overall time complexity of bucket sort is: O(N) + O(M * (N/M) * log(N/M)) = O(N + N * (log N - log M)) = O(N + N * log N - N * log M).
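To get a feel for the formula, the small sketch below evaluates N + N * (log N - log M) for given N and M. The function name estimated_cost is hypothetical, the logarithm base 2 is an assumption, and constant factors are ignored, so the result is only a rough operation count rather than a running time.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical estimate of the work done by bucket sort with n items and
// m evenly filled buckets: n + n * (log2(n) - log2(m)).
double estimated_cost(double n, double m) {
    assert(m >= 1.0 && n >= m);
    return n + n * (std::log2(n) - std::log2(m));
}
```

For N = 1024, the estimate is 11264 with a single bucket (M = 1), 6144 with M = 32, and 1024 when M = N, which matches the observation that more evenly filled buckets reduce the comparison-sort term.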
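Step (2) dominates the cost, so for the same N the efficiency is improved by keeping the amount of data in each bucket small. There are two ways to do this: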
(A) The mapping function f(k) should distribute the N items evenly over the M buckets, so that each bucket holds about N/M items. In the limit N = M, each bucket holds only one item, and bucket sort reaches its best efficiency, O(N). The worst case is when all elements land in a single bucket; the cost is then that of the comparison sort used inside the bucket.
(B) Increase the number of buckets as much as possible. In the extreme case each bucket receives only one item, which avoids the comparison sort inside the buckets altogether. Of course, this is not easy to achieve: when the data volume is huge, such an f(k) produces a very large set of buckets and wastes a lot of space. This is a trade-off between time cost and space cost.
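The extreme case in (B) can be illustrated with one bucket per possible key value. The sketch below assumes non-negative integer keys with a known upper bound max_key; the function name sort_one_value_per_bucket is hypothetical. No comparison between keys is needed, but the bucket array has max_key + 1 entries regardless of how few keys there are.

```cpp
#include <cassert>
#include <vector>

// Limit case: one bucket per possible key value in [0, max_key].
// Each bucket only needs a counter, so no comparison sort is required.
std::vector<int> sort_one_value_per_bucket(const std::vector<int>& keys,
                                           int max_key) {
    assert(max_key >= 0);
    std::vector<int> count(max_key + 1, 0);  // space grows with the key range
    for (int k : keys) {
        assert(k >= 0 && k <= max_key);
        ++count[k];
    }
    std::vector<int> sorted;
    sorted.reserve(keys.size());
    for (int v = 0; v <= max_key; ++v)
        for (int c = 0; c < count[v]; ++c)
            sorted.push_back(v);
    return sorted;
}
```

For the eight keys of the earlier example with max_key = 97, this uses 98 buckets, most of them empty, which shows the space cost that (B) warns about.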
C++ code:
#include <cstring>
#include <iostream>
using namespace std;

namespace mysort {

// Return the largest value in array[begin..end]; values are assumed positive.
float max(float* array, int begin, int end) {
    float ret = -1;
    for (int i = begin; i <= end; ++i)
        ret = (ret < array[i]) ? array[i] : ret;
    return ret;
}

// Obtain the bucket index (0..10) for value a.
int getindex(float a, float max) {
    return (int)((a * 10) / max);
}

// Insertion sort on array[begin..end].
void insertsort(float* array, int begin, int end) {
    for (int i = begin; i < end; ++i) {
        int j = i + 1;
        float tmp = array[j];
        for (; j > begin; j--) {  // stop at begin, not at 0
            if (tmp < array[j - 1])
                array[j] = array[j - 1];
            else
                break;
        }
        array[j] = tmp;
    }
}

// Bucket sort on array[begin..end] (the original name radixsorting is kept).
void radixsorting(float array[], int begin, int end) {
    int arraysize = end - begin + 1;
    float maxval = mysort::max(array, begin, end);
    const int countsize = 11;  // a * 10 / max with a <= max gives 11 possible indices
    int count[countsize];
    float* temp = new float[arraysize];
    memset(count, 0, sizeof(count));  // initialize to 0
    for (int i = begin; i <= end; ++i)  // record the number of elements in each bucket
        count[getindex(array[i], maxval)] += 1;
    for (int i = 0; i < countsize - 1; ++i)  // prefix sums: end position of each bucket
        count[i + 1] += count[i];
    for (int i = end; i >= begin; --i) {  // afterwards count[j] is the start of bucket j
        int j = getindex(array[i], maxval);
        temp[count[j] - 1] = array[i];
        --count[j];
    }
    for (int i = 0; i < countsize; ++i) {
        // The last bucket ends at arraysize.
        int next = (i + 1 < countsize) ? count[i + 1] : arraysize;
        if (count[i] < next)
            mysort::insertsort(temp, count[i], next - 1);
    }
    memcpy(array + begin, temp, arraysize * sizeof(float));
    delete[] temp;
}

}  // namespace mysort

int main() {
    float a[] = {49, 38, 35, 97, 76, 73, 27, 49};
    int length = sizeof(a) / sizeof(float);
    mysort::radixsorting(a, 0, length - 1);
    for (int i = 0; i < length; ++i)  // print the sorted array
        cout << a[i] << " ";
    cout << endl;
    return 0;
}
Summary: Bucket sort and counting sort are strikingly similar. Counting sort maps each element to its own slot in an array, but the elements must be integers and it consumes a large amount of space, which depends on the specific values to be sorted. Bucket sort uses less space (O(N), independent of the specific values to be sorted): it only records the index range of each bucket. The data in each bucket still needs to be sorted, but in exchange bucket sort has a wider scope of application. The space-for-time approach of counting sort is very similar to hashing.