Introduction to Algorithms - Lecture 08: Linear-Time Sorting


We have already introduced several O(n lg n) sorting algorithms: quicksort, merge sort, and heapsort. This section describes the lower bound for comparison-based sorting algorithms, and then several linear-time sorting algorithms: counting sort, radix sort, and bucket sort.

I. The lower bound for comparison sorting

1. Decision tree model: comparison sorting can be abstracted as a decision tree. A decision tree is a full binary tree that represents all the comparisons a sorting algorithm performs when it operates on a given input.


Executing the sorting algorithm corresponds to tracing a path from the root of the tree down to a leaf node; one comparison is made at each internal node.

A necessary condition for the sorting algorithm to be correct is that each of the n! permutations of the n elements appears as a leaf of the decision tree.

2. In a decision tree, the length of the longest path from the root to any reachable leaf gives the number of comparisons the corresponding sorting algorithm performs in the worst case. Thus the worst-case number of comparisons of a comparison sorting algorithm equals the height of its decision tree.

3. Theorem: in the worst case, any comparison sorting algorithm requires Ω(n lg n) comparisons.

Proof: by the previous discussion, it suffices to determine the height of a decision tree in which every permutation appears as a reachable leaf. Consider a decision tree of height h with l reachable leaves, corresponding to a comparison sort on n elements. Because the n input elements have n! permutations, each of which appears as a leaf of the tree, we have n! ≤ l. And because a binary tree of height h has at most 2^h leaves,

n! ≤ l ≤ 2^h

Taking logarithms of this inequality gives
h ≥ lg(n!) = Ω(n lg n)   (since lg(n!) = Θ(n lg n) by Stirling's approximation).
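For a rough numerical sense of that last step, here is a small Python check (purely illustrative) comparing lg(n!) with n lg n:

import math

for n in (10, 100, 1000, 10000):
    lg_fact = math.lgamma(n + 1) / math.log(2)   # lg(n!) via the log-gamma function
    print(n, round(lg_fact), round(n * math.log2(n)))
    # the ratio of the two columns increases toward 1, i.e. lg(n!) = Theta(n lg n)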

4. Corollary: heapsort and merge sort are asymptotically optimal comparison sorts.

Proof: the O(n lg n) upper bounds on the running times of heapsort and merge sort match the Ω(n lg n) worst-case lower bound given in Theorem 8.1.

Note: quicksort is not an asymptotically optimal comparison sort, because its worst-case running time is O(n^2).


II. Counting sort

Counting sort assumes that each of the n input elements is an integer between 0 and k, for some integer k. When k = O(n), counting sort runs in O(n) time.

1. Basic idea: for each input element x, determine the number of elements less than x. With this information, x can be placed directly into its correct position in the final output array.

Count-Sort(A, B, k)
1  for i <-- 0 to k
2      do C[i] <-- 0
3  for i <-- 1 to length[A]
4      do C[A[i]] <-- C[A[i]] + 1
5  for i <-- 1 to k
6      do C[i] <-- C[i-1] + C[i]
7  for j <-- length[A] downto 1
8      do B[C[A[j]]] <-- A[j]
9         C[A[j]] <-- C[A[j]] - 1
Since the input elements are not necessarily distinct, C[A[j]] is decremented each time A[j] is placed into array B. This way, the next input element with the same value as A[j] (if any) goes into the position immediately before A[j] in the output array B.
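A minimal Python sketch of the same procedure (the function name and the 0-based indexing are adaptations of the pseudocode, not part of it):

def counting_sort(A, k):
    # Return a stably sorted copy of A; every element of A must be an integer in 0..k.
    n = len(A)
    C = [0] * (k + 1)
    for x in A:                   # C[i] = number of elements equal to i
        C[x] += 1
    for i in range(1, k + 1):     # prefix sums: C[i] = number of elements <= i
        C[i] += C[i - 1]
    B = [0] * n
    for x in reversed(A):         # scan right to left so equal keys keep their order
        C[x] -= 1                 # convert the 1-based count to a 0-based index
        B[C[x]] = x
    return B

For example, counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5) returns [0, 0, 2, 2, 3, 3, 3, 5].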

2. Running time of counting sort: Θ(k + n). In practice, counting sort is usually used when k = O(n), in which case the running time is O(n).

3. Properties of counting sort:

A. Counting sort does not compare elements; it uses the actual values of the elements to determine their positions in the output array. Since it is not a comparison-based sorting algorithm, the Ω(n lg n) lower bound does not apply to it.

B. Because line 7 of the algorithm scans A from the last element down to the first (downto), elements with equal values appear in the output array in the same relative order as in the input array. Counting sort is therefore a stable sorting algorithm. Stability matters when satellite data accompanies the keys, and when counting sort is used as a subroutine of radix sort.

III. Radix sort

Radix sort is the algorithm used by the old punched-card sorting machines.

1. Intuitively, you might expect to sort on the most significant digit first, recursively sort the numbers within each bin, and then combine the results. Contrary to this intuition, radix sort solves the card-sorting problem by sorting on the least significant digit first.

2. The radix sort algorithm

Radix-Sort(A, d)
1  for i <-- 1 to d
2      do use a stable sort to sort array A on digit i
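As an illustration, here is a minimal Python sketch of least-significant-digit radix sort on decimal digits; it uses a stable bucket-by-digit pass in place of the generic "stable sort" of line 2 (the base-10 choice is just for the example):

def radix_sort(A, d):
    # Sort non-negative integers with at most d decimal digits,
    # processing digits from least significant to most significant.
    for i in range(d):
        buckets = [[] for _ in range(10)]          # one bucket per digit value 0..9
        for x in A:
            buckets[(x // 10**i) % 10].append(x)   # appending preserves input order: stable
        A = [x for bucket in buckets for x in bucket]
    return A

For example, radix_sort([329, 457, 657, 839, 436, 720, 355], 3) returns [329, 355, 436, 457, 657, 720, 839].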

3. Theorem: given n d-digit numbers in which each digit can take k possible values, if the stable sort it uses runs in Θ(n + k) time, then radix sort correctly sorts these numbers in Θ(d(n + k)) time.

4. Theorem: given n b-bit numbers and any positive integer r ≤ b, Radix-Sort correctly sorts these numbers in Θ((b/r)(n + 2^r)) time.

For given values of n and b, we want to choose r (with r ≤ b) so as to minimize the expression (b/r)(n + 2^r).

A. If b < floor(lg n), then every r ≤ b satisfies (n + 2^r) = Θ(n). Choosing r = b gives running time (b/b)(n + 2^b) = Θ(n), which is asymptotically optimal.

B. If b ≥ floor(lg n), choosing r = floor(lg n) gives the best running time to within a constant factor, namely Θ(bn / lg n).

If r is increased above lg n, the 2^r term in the numerator grows faster than the r term in the denominator, so the running time increases.

If r is decreased below lg n, the b/r factor grows while the (n + 2^r) factor stays Θ(n), so again the running time increases. A quick numerical check of this trade-off is sketched below.
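The expression (b/r)(n + 2^r) can be evaluated for a hypothetical n and b (the values below are made up for illustration); the minimizing r lands near floor(lg n), as the theorem predicts:

import math

n, b = 1_000_000, 64                     # hypothetical input size and word length

def cost(r):
    return (b / r) * (n + 2**r)          # the (b/r)(n + 2^r) bound from the theorem

best_r = min(range(1, b + 1), key=cost)
print(best_r, math.floor(math.log2(n)))  # the minimizer sits within a small constant of floor(lg n) = 19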

5. Radix sort compared with comparison sorts:

A. If b = O(lg n) and we choose r ≈ lg n, the running time of radix sort is Θ(n), which looks better than quicksort's average running time of O(n lg n).

B. However, the constant factors hidden in the asymptotic notation differ in the two cases. For the n keys to be processed, radix sort may make fewer passes than quicksort does, but each pass of radix sort may take significantly longer.

C. Moreover, the version of radix sort that uses counting sort as its intermediate stable sort does not sort in place, whereas many of the O(n lg n)-time comparison sorts do. Therefore, when main memory is at a premium, an in-place algorithm such as quicksort may be preferable.

IV. Bucket sort

1. Basic idea: bucket sort divides the interval [0, 1) into n equal-sized subintervals, or buckets, and then distributes the n input numbers into the buckets. Because the inputs are independently and uniformly distributed over [0, 1), we do not expect many numbers to fall into any single bucket. To produce the output, we sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.

2. The bucket sort code assumes the input is an array A of n elements, each satisfying 0 ≤ A[i] < 1. It also requires an auxiliary array B[0..n-1] of linked lists (the buckets), and assumes there is some mechanism for maintaining these lists.

Bucket-Sort(A)
1  n <-- length[A]
2  for i <-- 1 to n
3      do insert A[i] into list B[floor(n*A[i])]
4  for i <-- 0 to n-1
5      do sort list B[i] with insertion sort
6  concatenate the lists B[0], B[1], ..., B[n-1] together in order
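A minimal Python sketch of the same procedure, using Python lists as the buckets and an explicit insertion sort for line 5 (the names are illustrative):

def bucket_sort(A):
    # Sort values assumed to lie in [0, 1); uniform inputs give linear expected time.
    n = len(A)
    B = [[] for _ in range(n)]                   # n empty buckets
    for x in A:
        B[int(n * x)].append(x)                  # bucket index floor(n * A[i])
    for bucket in B:                             # insertion-sort each bucket (line 5)
        for j in range(1, len(bucket)):
            key, i = bucket[j], j - 1
            while i >= 0 and bucket[i] > key:
                bucket[i + 1] = bucket[i]
                i -= 1
            bucket[i + 1] = key
    return [x for bucket in B for x in bucket]   # concatenate the buckets in order (line 6)

For example, bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]) returns the ten values in increasing order.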
3. Running time: every line except line 5 takes O(n) time in the worst case. The only part that needs further analysis is the time taken by the insertion sorts in line 5.

The running time of bucket sort is

T(n) = Θ(n) + Σ_{i=0}^{n-1} O(n_i^2)

where n_i denotes the number of elements that land in bucket B[i]. Taking expectations of both sides and using linearity of expectation:

E[T(n)] = Θ(n) + Σ_{i=0}^{n-1} E[O(n_i^2)] = Θ(n) + Σ_{i=0}^{n-1} O(E[n_i^2])

The following holds for each i = 0, 1, ..., n-1, because each value in the input array A is equally likely to fall into any bucket:

E[n_i^2] = 2 - 1/n

To prove this formula, define the indicator random variable X_ij = I{A[j] falls into bucket i}, for i = 0, ..., n-1 and j = 1, ..., n. The probability that X_ij = 1 is 1/n.

So the number of elements in bucket i is

n_i = Σ_{j=1}^{n} X_ij

Expanding the square and regrouping the terms:

E[n_i^2] = E[(Σ_{j=1}^{n} X_ij)^2] = Σ_{j=1}^{n} E[X_ij^2] + Σ_{1≤j≤n} Σ_{k≠j} E[X_ij X_ik]

Because the probability that X_ij = 1 is 1/n, we have E[X_ij^2] = 1/n; and for j ≠ k the variables X_ij and X_ik are independent, so E[X_ij X_ik] = E[X_ij] E[X_ik] = (1/n)(1/n) = 1/n^2.

So,

E[n_i^2] = n · (1/n) + n(n-1) · (1/n^2) = 1 + (n-1)/n = 2 - 1/n

Therefore, the expected running time of bucket sort is Θ(n) + n · O(2 - 1/n) = Θ(n).
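As an informal sanity check of the E[n_i^2] = 2 - 1/n step (a rough simulation assuming uniform inputs, not part of the analysis itself), the expectation can be estimated empirically:

import random

def mean_sq_bucket_size(n, trials=10000):
    # Estimate E[n_i^2] for one fixed bucket with n uniform values in [0, 1).
    total = 0
    for _ in range(trials):
        n0 = sum(1 for _ in range(n) if int(n * random.random()) == 0)
        total += n0 * n0
    return total / trials

print(mean_sq_bucket_size(50), 2 - 1/50)   # both numbers should be roughly 1.98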

