Classic sorting algorithms: 5. Linear-time sorting (counting, radix, and bucket sort)

Last Update:2018-12-05 Source: Internet

Author: User


1. Preface

This is the fifth article in the classic sorting algorithms series. It covers three linear-time sorts: counting sort, radix sort, and bucket sort.

For the other articles in this series, see:

Classic sorting algorithms: 1. Deep analysis from simple sorting to heap sort

Classic sorting algorithms: 2. Analysis of insertion sort and performance comparison

Classic sorting algorithms: 3. Bubble sort and quick sort

Classic sorting algorithms: 4. Merge sort

Classic sorting algorithms: 5. Linear-time sorting (counting, radix, and bucket sort)

Classic sorting algorithms: 6. Summary of the sorting algorithms

Note: For ease of description, a[0] is not used in this article or in the source code. Array subscripts start from 1.

2. Counting sort

2.1 Introduction

All the sorting algorithms in the previous four articles work by comparing elements, so they can be called comparison sorts. Any comparison sort needs Ω(n log n) comparisons in the worst case. Is there a sorting algorithm that runs in linear time, O(n)? Counting sort is the basic linear-time sort, and it is the building block of radix sort. The basic idea: for each element x, determine the number of elements smaller than x; x can then be placed directly at its position in the ordered sequence.

Process: assume that the values in sequence A lie in the range [0, k], where k is the maximum value in the sequence to be sorted. First, use an auxiliary array count to record how many times each value appears in A, so that count[i] is the number of occurrences of i in A. Next, accumulate the entries of count in order, so that count[i] becomes the number of elements in A not greater than i. Then scan A from back to front, placing each element directly into an auxiliary array B at the position given by count and decrementing that count entry after each placement. Finally, copy B back into A.
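
The two passes over count described above can be looked at in isolation. The following is a minimal sketch, not the article's implementation: it assumes 0-based indexing (unlike the 1-based convention used elsewhere in this article), and the function name build_count is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills count[0..k] for the n values in a[0..n-1].
 * After the call, count[v] is the number of elements <= v.
 * Assumption: 0-based indexing, every value lies in [0, k]. */
void build_count(const int *a, size_t n, int k, int *count)
{
    int v;
    size_t i;

    for (v = 0; v <= k; v++)
        count[v] = 0;

    /* Pass 1: count[v] = number of occurrences of v. */
    for (i = 0; i < n; i++) {
        assert(a[i] >= 0 && a[i] <= k);   /* keys must lie in [0, k] */
        count[a[i]]++;
    }

    /* Pass 2: running totals, so count[v] = number of elements <= v. */
    for (v = 1; v <= k; v++)
        count[v] += count[v - 1];
}
```

For the input {2, 5, 3, 0, 2, 3} with k = 5, pass 1 leaves count as {1, 0, 2, 2, 0, 1} and pass 2 turns it into {1, 1, 3, 5, 5, 6}. The entry count[3] = 5 says that five elements are not greater than 3, so the last 3 belongs in the fifth position of the sorted sequence.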

2.2 Code

    #include <stdio.h>
    #include <stdlib.h>

    /* Counting sort. n is the number of records in a (stored in a[1..n]);
       k is the maximum value among the records. */
    void countingsort(int *a, int n, int k)
    {
        int i;
        int *count = (int *)malloc(sizeof(int) * (k + 1));
        int *b = (int *)malloc(sizeof(int) * (n + 1));

        if (count == NULL || b == NULL) {
            free(count);
            free(b);
            return;
        }

        /* initialize the count array */
        for (i = 0; i <= k; i++)
            count[i] = 0;
        /* count the records equal to each value */
        for (i = 1; i <= n; i++)
            count[a[i]]++;
        /* count the records less than or equal to each value */
        for (i = 1; i <= k; i++)
            count[i] += count[i - 1];
        /* scan a from back to front and place each element
           at its position in the ordered sequence */
        for (i = n; i >= 1; i--) {
            b[count[a[i]]] = a[i];
            count[a[i]]--;
        }
        for (i = 1; i <= n; i++)
            a[i] = b[i];
        free(count);
        free(b);
    }

    int main(void)
    {
        int i;
        /* a[0] is unused; the first three sample values are illustrative */
        int a[7] = {0, 5, 3, 7, 9, 1, 2};

        countingsort(a, 6, 9);
        for (i = 1; i <= 6; i++)
            printf("%-4d", a[i]);
        printf("\n");
        return 0;
    }

2.3 Efficiency Analysis

Looking at the code, counting sort has five for loops: three run n times and two run k times. The total time is therefore T(3n + 2k), and the time complexity is O(n + k), the same in the best and worst cases. Counting sort is stable, but it needs n + k auxiliary space, which is relatively large. It also places constraints on the sequence to be sorted: the values must be non-negative integers in a known range [0, k], where k is the maximum value in the sequence, and if k is much larger than n, the efficiency drops sharply.

Note the step "scan array a and place each element in its position in the ordered sequence". Why scan a from back to front? Walk through the process: count[v] points to the last slot reserved for value v and is decremented after each placement, so of several equal elements, the one scanned first lands furthest to the right. Scanning from the front would therefore reverse the order of equal elements and make counting sort unstable. As mentioned above, counting sort is the basis of radix sort, so its stability directly affects the stability, and the correctness, of radix sort.

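The effect of the scan direction on stability can be checked with records that carry a tag next to the key. This is a sketch under stated assumptions, not the article's code: the record type, the function name counting_sort_records, and the back_to_front flag are all hypothetical, and the arrays are 0-based.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type: an integer key plus a tag that identifies
 * the element among others with the same key. */
typedef struct {
    int key;
    char tag;
} record;

/* Counting sort of in[0..n-1] (0-based) into out[0..n-1], keys in [0, k].
 * back_to_front != 0 scans the input from the last element to the first;
 * otherwise the input is scanned from the first element to the last.
 * Returns 0 on success, -1 if the count array cannot be allocated. */
int counting_sort_records(const record *in, record *out, size_t n,
                          int k, int back_to_front)
{
    int *count = calloc((size_t)k + 1, sizeof *count);
    size_t i;
    int v;

    if (count == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        assert(in[i].key >= 0 && in[i].key <= k);
        count[in[i].key]++;
    }
    for (v = 1; v <= k; v++)
        count[v] += count[v - 1];

    /* count[key] - 1 is the 0-based slot of the LAST free position
     * for that key; it moves left after each placement. */
    for (i = 0; i < n; i++) {
        size_t src = back_to_front ? n - 1 - i : i;
        out[--count[in[src].key]] = in[src];
    }

    free(count);
    return 0;
}
```

With the input (2,a), (1,x), (2,b), the back-to-front scan yields x, a, b, so the two records with key 2 keep their original order. The front-to-back scan yields x, b, a, which is sorted by key but no longer stable.
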
3. Radix sort

3.1 Introduction

In counting sort, the time and space overhead grows as k becomes large. Consider sorting the sequence {8888, 1234, 9999} by counting sort: it wastes a great deal of space, and it takes more time than a comparison sort would. Instead, each record can be split into its ones digit, tens digit, and so on, and the whole sequence is counting-sorted on the ones digit first, then on the tens digit, and so on up to the most significant digit. In each pass the maximum key value is only 9.
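
Splitting a record into digits comes down to one division and one remainder. A minimal sketch, assuming non-negative values and a small digit position d so that the divisor fits in an int; the helper name digit_at is hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: returns the d-th decimal digit of value,
 * where d = 1 is the ones digit, d = 2 the tens digit, and so on.
 * Integer arithmetic only, so no floating-point rounding is involved.
 * Assumption: d is small enough that 10^(d-1) fits in an int. */
int digit_at(int value, int d)
{
    int divisor = 1;
    int i;

    assert(value >= 0 && d >= 1);
    for (i = 1; i < d; i++)
        divisor *= 10;          /* divisor = 10^(d-1) */
    return value / divisor % 10;
}
```

For the record 1234, digit_at(1234, 1) is 4 and digit_at(1234, 4) is 1. Each counting pass keyed on such a digit needs only 10 counters, whereas counting sort applied directly to {8888, 1234, 9999} would need 10000.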

3.2 Code

    #include <stdio.h>
    #include <stdlib.h>

    /* Counting sort on the d-th decimal digit (d = 1 is the ones digit).
       n is the number of records in a (stored in a[1..n]);
       k is the maximum value of a digit. */
    void countingsort(int *a, int n, int k, int d)
    {
        int i;
        int div = 1;   /* 10^(d-1), computed with integers to avoid pow() rounding */
        int *count = (int *)malloc(sizeof(int) * (k + 1));
        int *b = (int *)malloc(sizeof(int) * (n + 1));

        if (count == NULL || b == NULL) {
            free(count);
            free(b);
            return;
        }
        for (i = 1; i < d; i++)
            div *= 10;

        /* initialize the count array */
        for (i = 0; i <= k; i++)
            count[i] = 0;
        /* count the records whose d-th digit (a[i] / div % 10) equals each value */
        for (i = 1; i <= n; i++)
            count[a[i] / div % 10]++;
        /* count the records whose d-th digit is less than or equal to each value */
        for (i = 1; i <= k; i++)
            count[i] += count[i - 1];
        /* scan a from back to front and place each element
           at its position in the ordered sequence */
        for (i = n; i >= 1; i--) {
            b[count[a[i] / div % 10]] = a[i];
            count[a[i] / div % 10]--;
        }
        for (i = 1; i <= n; i++)
            a[i] = b[i];
        free(count);
        free(b);
    }

    /* Radix sort. n is the number of records in a;
       each record has at most d decimal digits. */
    void radixsort(int *a, int n, int d)
    {
        int i;

        for (i = 1; i <= d; i++)
            countingsort(a, n, 9, i);   /* pass n, not a hard-coded length */
    }

    int main(void)
    {
        int i;
        int a[7] = {0, 114, 118, 152, 114, 111, 132};   /* a[0] is unused */

        radixsort(a, 6, 3);
        for (i = 1; i <= 6; i++)
            printf("%-4d", a[i]);
        printf("\n");
        return 0;
    }

3.3 Efficiency Analysis

Radix sort takes time T(n) = d * (2k + 3n), where d is the number of digits in the record values and (2k + 3n) is the time of each counting sort pass, as analyzed above. Here k does not exceed 9 and d is usually small, so k and d can be treated as small constants and the time complexity is O(n). The worst case has the same time complexity. Radix sort is stable. Its auxiliary space is that of counting sort, k + n.

4. Bucket sort

4.1 Introduction

Like counting sort, bucket sort makes an assumption about its input: it assumes that the sequence is generated by a random process that distributes elements uniformly and independently over the interval [0, 1). The basic idea is to divide the interval [0, 1) into n subintervals of equal size, called buckets, and to distribute the n records into the buckets. If more than one record falls into the same bucket, sort the records within that bucket. Finally, list the records of each bucket in order to obtain an ordered sequence.
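
Choosing the bucket for a record is a single multiplication followed by truncation. A minimal sketch, assuming every key lies in [0, 1); the helper name bucket_index and the parameter nbuckets are hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: index of the bucket that holds x when [0, 1)
 * is split into nbuckets subintervals of equal width. */
int bucket_index(double x, int nbuckets)
{
    assert(nbuckets > 0);
    assert(x >= 0.0 && x < 1.0);    /* the assumed input range */
    return (int)(x * nbuckets);      /* truncation equals floor for x >= 0 */
}
```

With 10 buckets, 0.13 goes to bucket 1, 0.25 to bucket 2, and 0.81 to bucket 8. Because bucket i covers smaller values than bucket i + 1, concatenating the sorted buckets in index order gives the sorted sequence.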

4.2 Code

    #include <stdio.h>
    #include <stdlib.h>

    /* linked list node */
    typedef struct node {
        double key;
        struct node *next;
    } node;

    /* auxiliary array element: the head of one bucket's list */
    typedef struct {
        node *next;
    } Head;

    /* Bucket sort of a[1..n], every key in [0, 1), using 10 buckets. */
    void bucketsort(double *a, int n)
    {
        int i, j;
        Head head[10] = {{NULL}};
        node *p;
        node *q;
        node *nd;

        for (i = 1; i <= n; i++) {
            nd = (node *)malloc(sizeof(node));
            if (nd == NULL)
                exit(EXIT_FAILURE);
            nd->key = a[i];

            /* find the insertion point in the bucket's sorted list;
               q trails p, and q == NULL means "insert at the head" */
            q = NULL;
            p = head[(int)(a[i] * 10)].next;
            while (p && !(nd->key < p->key)) {
                q = p;
                p = p->next;
            }
            nd->next = p;
            if (q == NULL)
                head[(int)(a[i] * 10)].next = nd;
            else
                q->next = nd;
        }

        /* copy the buckets back into a in order, freeing each node */
        j = 1;
        for (i = 0; i < 10; i++) {
            p = head[i].next;
            while (p) {
                a[j++] = p->key;
                q = p;
                p = p->next;
                free(q);
            }
        }
    }

    int main(void)
    {
        int i;
        /* a[0] is unused; elements not listed are zero-initialized */
        double a[13] = {0, 0.13, 0.25, 0.18, 0.29, 0.81, 0.52, 0.52, 0.83};

        bucketsort(a, 12);
        for (i = 1; i <= 12; i++)
            printf("%-6.2f", a[i]);
        printf("\n");
        return 0;
    }

4.3 Efficiency Analysis

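When the records are evenly distributed over the buckets, ideally one element per bucket, the time complexity is O(n). Bucket sort is therefore suited to records that are spread evenly and contain few duplicates; if many records fall into the same bucket, the time is dominated by sorting within that bucket. Auxiliary space is 2n. Bucket sort is stable as long as records within a bucket are inserted stably, but it is comparatively complicated to implement.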
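
How evenly the records spread can be measured by the size of the fullest bucket. A minimal sketch, assuming 0-based input with every key in [0, 1); the helper name largest_bucket is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: distributes a[0..n-1] (values in [0, 1)) over
 * nbuckets equal-width buckets and returns the size of the fullest one.
 * Returns -1 if the counter array cannot be allocated. */
long largest_bucket(const double *a, size_t n, int nbuckets)
{
    long *load, max = 0;
    size_t i;
    int b;

    assert(nbuckets > 0);
    load = calloc((size_t)nbuckets, sizeof *load);
    if (load == NULL)
        return -1;

    /* count how many values fall into each bucket */
    for (i = 0; i < n; i++) {
        assert(a[i] >= 0.0 && a[i] < 1.0);
        load[(int)(a[i] * nbuckets)]++;
    }
    /* find the fullest bucket */
    for (b = 0; b < nbuckets; b++)
        if (load[b] > max)
            max = load[b];

    free(load);
    return max;
}
```

With 10 buckets, the spread-out input {0.05, 0.15, 0.25, 0.35, 0.45} gives a largest bucket of 1, the ideal case. The clustered input {0.11, 0.12, 0.13, 0.14, 0.15} puts all five records into bucket 1, so the whole cost falls on sorting within that single bucket.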

5. Appendix

Reference books: Introduction to Algorithms

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
