Introduction to Algorithms, Chapter 8: Sorting in Linear Time

Source: Internet
Author: User

The sorting algorithms covered so far share a common property: the sorted order they produce is determined only by comparisons between the input elements. We call such algorithms comparison sorts.

 

8.1 Lower bounds for comparison sorting

Decision Tree Model

The execution of a comparison sort can be abstracted as a decision tree. A decision tree is a full binary tree that represents all the comparisons a particular sorting algorithm makes on inputs of a given size. One execution of the algorithm corresponds to tracing a path from the root of the tree down to a leaf. Each internal node corresponds to a comparison between $a_i$ and $a_j$: the left subtree dictates the subsequent comparisons when $a_i \le a_j$, and the right subtree dictates those when $a_i > a_j$. When a leaf is reached, the algorithm has determined the sorted order. A necessary condition for a sorting algorithm to be correct is that each of the $n!$ permutations of the $n$ elements appears as a leaf of the decision tree. Let the decision tree have height $h$ and $l$ leaves. Then $2^h \ge l \ge n!$, so $h \ge \lg(n!) = \Omega(n \lg n)$. This shows that the worst-case running time of any comparison sort is $\Omega(n \lg n)$. It also shows that merge sort and heapsort are asymptotically optimal.
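To make the bound concrete, the following minimal Python sketch computes the smallest height $h$ with $2^h \ge n!$ and prints it next to $n \lg n$. The function name `min_comparisons_lower_bound` is a hypothetical helper introduced only for this illustration.

```python
import math

def min_comparisons_lower_bound(n: int) -> int:
    """Smallest h with 2**h >= n!, i.e. ceil(lg(n!))."""
    f = math.factorial(n)
    h = f.bit_length() - 1          # floor(lg(n!))
    # Round up unless n! is an exact power of two.
    return h if (1 << h) == f else h + 1

# Compare the exact bound with n lg n for a few sizes.
for n in (3, 4, 5, 10, 100):
    print(n, min_comparisons_lower_bound(n), round(n * math.log2(n), 1))
```

For example, any comparison sort needs at least 3 comparisons in the worst case for $n = 3$ and at least 7 for $n = 5$.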

 

 

Exercise:

8.1-1 What is the smallest possible depth of a leaf in a decision tree for a comparison sort?

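Analysis: $n-1$. At least $n-1$ comparisons are always needed: if each comparison is viewed as an edge between two elements, fewer than $n-1$ edges leave the graph disconnected, so the relative order of elements in different components is still undetermined. The bound is attained, for example, by insertion sort on an already sorted input.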

 

 

8.1-3 Show that there is no comparison sort whose running time is linear for at least half of the $n!$ inputs of length $n$. What about a fraction $1/n$ of the inputs of length $n$? What about a fraction $1/2^n$?

 

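Analysis: suppose that $m$ leaves of the decision tree have depth at most $h = O(n)$. A binary tree has at most $2^h$ leaves of depth at most $h$, so $2^h \ge m$ and therefore $h = \Omega(\lg m)$. Setting $m = n!/2$ gives $h \ge \lg(n!) - 1 = \Omega(n \lg n)$, which contradicts $h = O(n)$. The same holds for $m = n!/n$, where $h \ge \lg(n!) - \lg n$, and for $m = n!/2^n$, where $h \ge \lg(n!) - n$; both are still $\Omega(n \lg n)$.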

 

8.1-4 You are given a sequence of $n$ elements to sort. The input sequence consists of $n/k$ subsequences, each containing $k$ elements. The elements in a given subsequence are all smaller than the elements in the succeeding subsequence and larger than the elements in the preceding subsequence. Thus, to sort the whole sequence it is enough to sort the $k$ elements within each of the $n/k$ subsequences. Show an $\Omega(n \lg k)$ lower bound on the number of comparisons needed to solve this variant of the sorting problem.

 

Analysis: each $k$-element subsequence has $k!$ possible orderings, so the number of possible orderings of the whole sequence under the conditions above is $(k!)^{n/k}$. By the decision tree argument, the height of the decision tree satisfies $h \ge \lg\left((k!)^{n/k}\right) = (n/k)\lg(k!) \ge (n/k)\lg\left((k/2)^{k/2}\right) = (n/2)\lg(k/2)$. Therefore $h = \Omega(n \lg k)$.

 

 

8.2 Counting sort

Counting sort assumes that each of the $n$ input elements is an integer in the range 0 to $k$, for some integer $k$. When $k = O(n)$, counting sort runs in $\Theta(n)$ time.

 

The idea of counting sort is to determine, for each input element $x$, the number of elements less than $x$. With this information, $x$ can be placed directly into its position in the output array.

 

The following is the pseudocode for counting sort. Assume that the input is an array A[1..n], that the sorted output is stored in B[1..n], and that C[0..k] provides temporary working storage.

COUNTING-SORT(A, B, k)
    for i <-- 0 to k
        do C[i] <-- 0
    for j <-- 1 to length[A]
        do C[A[j]] <-- C[A[j]] + 1
    // C[i] now holds the number of elements equal to i
    for i <-- 1 to k
        do C[i] <-- C[i] + C[i-1]
    // C[i] now holds the number of elements less than or equal to i
    for j <-- length[A] downto 1
        do B[C[A[j]]] <-- A[j]
           C[A[j]] <-- C[A[j]] - 1
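The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering, not the book's code: it uses 0-based indexing, returns a new list instead of filling a caller-supplied B, and adds a `key` parameter (an assumption made here) so that stability can be observed on records.

```python
def counting_sort(a, k, key=lambda x: x):
    """Return a stably sorted copy of a; key(x) must be an integer in 0..k."""
    count = [0] * (k + 1)
    for x in a:                      # count occurrences of each key
        count[key(x)] += 1
    for i in range(1, k + 1):        # prefix sums: count[i] = #items with key <= i
        count[i] += count[i - 1]
    out = [None] * len(a)
    for x in reversed(a):            # right-to-left scan keeps equal keys in order
        c = key(x)
        count[c] -= 1                # decrement first because out is 0-based
        out[count[c]] = x
    return out

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))
```

Scanning the input from right to left is what makes the sort stable: among equal keys, the last one in the input is placed last in the output.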

 

The running time of counting sort is $\Theta(n + k)$, and counting sort is stable. It beats the $\Omega(n \lg n)$ lower bound derived earlier because it is not a comparison sort: it relies on prior knowledge of the element values and uses them as array indices. Although the asymptotic complexity of counting sort is better than that of comparison sorts, its constant factors are fairly large and it does not sort in place. In practice, the choice therefore depends on factors such as the characteristics of the input, the input size, and memory limits.

 

Exercise:

8.2-4 Describe an algorithm that, given $n$ integers in the range 0 to $k$, preprocesses its input and then answers any query about how many of the $n$ integers fall into a range [a..b] in $O(1)$ time. Your algorithm should use $\Theta(n + k)$ preprocessing time.

Analysis: use the preprocessing steps of counting sort to build the array C[0..k], so that C[i] is the number of elements not greater than i. The number of elements that fall into the range [a..b] is then C[b] - C[a-1], where C[-1] is taken to be 0.
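A minimal Python sketch of this idea follows. The names `preprocess` and `count_in_range` are hypothetical, and clamping the query range to [0..k] is an extra convenience assumed here, not part of the exercise.

```python
def preprocess(a, k):
    """Build c where c[i] = number of values in a that are <= i. Theta(n + k)."""
    c = [0] * (k + 1)
    for x in a:
        c[x] += 1
    for i in range(1, k + 1):
        c[i] += c[i - 1]
    return c

def count_in_range(c, lo, hi):
    """Number of input values v with lo <= v <= hi, answered in O(1)."""
    k = len(c) - 1
    lo, hi = max(lo, 0), min(hi, k)   # clamp the query to the valid range
    if lo > hi:
        return 0
    return c[hi] - (c[lo - 1] if lo > 0 else 0)

c = preprocess([2, 5, 3, 0, 2, 3, 0, 3], 5)
print(count_in_range(c, 2, 3))
```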

 

 

8.3 Radix sort

Assume that each element of an array A of length $n$ has $d$ digits, where digit 1 is the lowest-order digit and digit $d$ is the highest-order digit. The radix sort algorithm is as follows:

RADIX-SORT(A, d)
    for i <-- 1 to d
        do use a stable sort to sort array A on digit i
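As an illustration, here is a small Python sketch of radix sort on non-negative integers that uses an inlined counting sort as the stable per-digit sort. The `base` parameter and the 0-based digit numbering are assumptions of this sketch; the pseudocode itself leaves the choice of stable sort open.

```python
def radix_sort(a, d, base=10):
    """Sort non-negative integers that have at most d digits in the given base."""
    for i in range(d):                       # digit 0 is the lowest-order digit
        place = base ** i
        count = [0] * base
        for x in a:                          # count occurrences of each digit value
            count[(x // place) % base] += 1
        for v in range(1, base):             # prefix sums
            count[v] += count[v - 1]
        out = [0] * len(a)
        for x in reversed(a):                # stable placement, right to left
            digit = (x // place) % base
            count[digit] -= 1
            out[count[digit]] = x
        a = out
    return a

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
```

Sorting on the lowest-order digit first only works because each pass is stable: ties on the current digit keep the order established by the earlier passes.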

 

Lemma 8.3: Given $n$ $d$-digit numbers in which each digit can take on up to $k$ possible values, RADIX-SORT correctly sorts these numbers in $\Theta(d(n + k))$ time if the stable sort it uses takes $\Theta(n + k)$ time.

Proof: if counting sort is used as the stable sort, each pass over the $n$ numbers takes $\Theta(n + k)$ time. There are $d$ passes in total, so the running time of radix sort is $\Theta(d(n + k))$.

 

We have some flexibility in how to break each key into digits.

Lemma 8.4: Given $n$ $b$-bit numbers and any positive integer $r \le b$, RADIX-SORT correctly sorts these numbers in $\Theta((b/r)(n + 2^r))$ time.

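Proof: for a value $r \le b$, view each key as a number with $d = \lceil b/r \rceil$ digits of $r$ bits each. Each digit is an integer in the range 0 to $2^r - 1$, so counting sort can be used with $k = 2^r - 1$. Each pass takes $\Theta(n + 2^r)$ time and there are $d$ passes, for a total running time of $\Theta((b/r)(n + 2^r))$.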

 

For given values of $n$ and $b$, how should $r$ be chosen to minimize the expression $(b/r)(n + 2^r)$? If $b < \lg n$, then $(n + 2^r) = \Theta(n)$ for any $r \le b$. Choosing $r = b$ gives a running time of $(b/b)(n + 2^b) = \Theta(n)$ for radix sort. If $b \ge \lg n$, choosing $r = \lg n$ gives the best time to within a constant factor. With $r = \lg n$ the running time is $\Theta(bn/\lg n)$. As $r$ increases above $\lg n$, the $2^r$ term in the numerator grows faster than the $r$ term in the denominator, so the running time is $\Omega(bn/\lg n)$. If $r$ decreases below $\lg n$, the $b/r$ term increases while $n + 2^r$ remains $\Theta(n)$.
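The trade-off can be explored numerically. The sketch below evaluates the expression $\lceil b/r \rceil (n + 2^r)$ with constant factors ignored and searches for the minimizing $r$ by brute force; `radix_cost` and `best_r` are hypothetical helper names. The exact minimizer need not equal $\lg n$, but the cost at $r = \lg n$ stays within a small constant factor of the minimum.

```python
def radix_cost(n, b, r):
    """ceil(b / r) * (n + 2**r): the Lemma 8.4 expression without constants."""
    passes = -(-b // r)              # integer ceiling of b / r
    return passes * (n + 2 ** r)

def best_r(n, b):
    """Brute-force search for the r in 1..b that minimizes radix_cost."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))

n, b = 2 ** 16, 32                   # 65536 keys of 32 bits, so lg n = 16
r = best_r(n, b)
print(r, radix_cost(n, b, r), radix_cost(n, b, 16))
```

With these values the brute-force optimum is $r = 11$ (three passes), while $r = \lg n = 16$ costs about 1.3 times as much. For $b = 8 < \lg n$ the search returns $r = b$.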

 

 

Exercise:

8.3-4 Show how to sort $n$ integers in the range 0 to $n^2 - 1$ in $O(n)$ time.

Analysis: apply Lemma 8.4 with $b = \lg(n^2) = 2 \lg n$ and $r = \lg n$. The running time of radix sort is then $\Theta((b/r)(n + 2^r)) = \Theta(2(n + n)) = \Theta(n)$.
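One way to realize this is to treat each value as a two-digit number in base $n$ and run two stable counting-sort passes. Using base $n$ directly, rather than $r = \lg n$ bits, is an assumption of this sketch that avoids requiring $n$ to be a power of two; the function name is hypothetical.

```python
def sort_below_n_squared(a):
    """Sort n integers from the range 0..n*n-1 with two counting-sort passes."""
    n = len(a)
    if n < 2:
        return list(a)
    # First pass: low digit (x mod n). Second pass: high digit (x div n).
    for digit in (lambda x: x % n, lambda x: x // n):
        count = [0] * n
        for x in a:
            count[digit(x)] += 1
        for v in range(1, n):
            count[v] += count[v - 1]
        out = [0] * n
        for x in reversed(a):        # right-to-left keeps each pass stable
            d = digit(x)
            count[d] -= 1
            out[count[d]] = x
        a = out
    return a

print(sort_below_n_squared([24, 3, 17, 0, 9]))
```

Each pass costs $\Theta(n + n)$, and there are exactly two passes, which matches the $\Theta(n)$ bound from the analysis.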

 

 

8.4 Bucket sort

Counting sort assumes that the input consists of integers in a small range, whereas bucket sort assumes that the input is generated by a random process that distributes elements uniformly and independently over the interval [0, 1).

 

The idea of bucket sort is to divide the interval [0, 1) into $n$ equal-sized subintervals, or buckets, and then distribute the $n$ input numbers into the buckets. Because the inputs are uniformly distributed, we do not expect many numbers to fall into any one bucket. To produce the output, sort the numbers within each bucket and then list the elements of the buckets in order.

 

The following is the pseudocode for bucket sort. Assume that the input is an array A of $n$ elements and that each element satisfies 0 <= A[i] < 1. The code also needs an auxiliary array B[0..n-1] of linked lists (the buckets), and it assumes that some mechanism exists for maintaining these lists.

BUCKET-SORT(A)
1  n <-- length[A]
2  for i <-- 1 to n
3      do insert A[i] into list B[⌊n A[i]⌋]
4  for i <-- 0 to n-1
5      do sort list B[i] with insertion sort
6  concatenate the lists B[0], B[1], ..., B[n-1] together in order
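A minimal Python sketch of the same procedure is given below. Python lists stand in for the linked lists, and the `min(..., n - 1)` guard against floating-point rounding is a defensive addition assumed here rather than part of the pseudocode.

```python
def insertion_sort(items):
    """Sort a list in place with insertion sort."""
    for j in range(1, len(items)):
        cur = items[j]
        i = j - 1
        while i >= 0 and items[i] > cur:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = cur

def bucket_sort(a):
    """Return a sorted copy of a, whose values are floats in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # Bucket i covers [i/n, (i+1)/n); clamp to guard against rounding.
        buckets[min(int(n * x), n - 1)].append(x)
    for b in buckets:
        insertion_sort(b)
    return [x for b in buckets for x in b]

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```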

 

Expected running time of bucket sort

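Let $n_i$ be the random variable denoting the number of elements placed in bucket B[i] by the insertion loop (lines 2-3). Since insertion sort runs in quadratic time, the running time of bucket sort is

$$T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2).$$

To analyze the expected value of $n_i^2$, define the indicator random variable $X_{ij} = I\{A[j] \text{ falls in bucket } i\}$, so that $\Pr\{X_{ij} = 1\} = 1/n$ and

$$n_i = \sum_{j=1}^{n} X_{ij}.$$

Then

$$E[n_i^2] = E\left[\left(\sum_{j=1}^{n} X_{ij}\right)^2\right] = E\left[\sum_{j=1}^{n} X_{ij}^2 + \sum_{j=1}^{n} \sum_{\substack{k=1 \\ k \ne j}}^{n} X_{ij} X_{ik}\right] = \sum_{j=1}^{n} E[X_{ij}^2] + \sum_{j=1}^{n} \sum_{\substack{k=1 \\ k \ne j}}^{n} E[X_{ij} X_{ik}].$$

$$E[X_{ij}^2] = \frac{1}{n}$$

$$E[X_{ij} X_{ik}] = \frac{1}{n} \cdot \frac{1}{n} = \frac{1}{n^2} \quad (k \ne j)$$

$$E[n_i^2] = n \cdot \frac{1}{n} + n(n-1) \cdot \frac{1}{n^2} = 2 - \frac{1}{n}.$$

It follows that $E[T(n)] = \Theta(n) + n \cdot O(2 - 1/n) = \Theta(n)$.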
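The value $E[n_i^2] = 2 - 1/n$ can be double-checked exactly. Under the uniform, independent input assumption, $n_i$ follows a binomial distribution with parameters $n$ and $1/n$, so its second moment can be summed directly with rational arithmetic. The sketch below does this; the function name is hypothetical and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

def expected_square_bucket_size(n):
    """Exact E[n_i^2] when n_i ~ Binomial(n, 1/n)."""
    p = Fraction(1, n)
    return sum(
        m * m * comb(n, m) * p ** m * (1 - p) ** (n - m)
        for m in range(n + 1)
    )

# Print the exact second moment next to the closed form 2 - 1/n.
for n in (1, 2, 5, 10):
    print(n, expected_square_bucket_size(n), 2 - Fraction(1, n))
```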

 

Even if the input is not drawn from a uniform distribution, bucket sort may still run in linear time, as long as the sum of the squares of the bucket sizes is linear in the total number of elements.

 

Exercise:

8.4-4 We are given $n$ points in the unit circle, $p_i = (x_i, y_i)$, such that $0 < x_i^2 + y_i^2 \le 1$ for $i = 1, 2, \ldots, n$. Suppose that the points are uniformly distributed; that is, the probability of finding a point in any region of the circle is proportional to the area of that region. Design an algorithm with $\Theta(n)$ expected running time to sort the $n$ points by their distances from the origin.

Analysis: let $S$ be the area of the circle. We divide the circle into $n$ regions, each of area $S/n$. Because the goal is to sort by distance from the center, the regions are concentric rings around the center. Choose a sequence of distances $d_0, d_1, \ldots, d_n$ satisfying $d_0 = 0$, $\pi d_1^2 = S/n$, $\pi d_2^2 = 2S/n$, ..., $\pi d_n^2 = S$. The region whose distance from the center lies between $d_{i-1}$ and $d_i$ then forms bucket $i$.
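For the unit circle $S = \pi$, so the boundaries satisfy $d_i^2 = i/n$ and a point with squared distance $r^2$ belongs to the ring with 0-based index $\lfloor n r^2 \rfloor$. The following Python sketch applies bucket sort with these rings. The function name, the 0-based bucket numbering, and the clamp that puts points with $r^2 = 1$ into the last ring are assumptions of this illustration.

```python
import math

def sort_points_by_distance(points):
    """Sort (x, y) points in the unit circle by distance from the origin."""
    n = len(points)
    buckets = [[] for _ in range(n)]
    for x, y in points:
        r2 = x * x + y * y
        # Ring i holds points with i/n <= r^2 < (i+1)/n; r^2 = 1 goes last.
        buckets[min(int(n * r2), n - 1)].append((r2, (x, y)))
    out = []
    for b in buckets:
        # Insertion sort within the ring, ordered by squared distance.
        for j in range(1, len(b)):
            cur = b[j]
            i = j - 1
            while i >= 0 and b[i][0] > cur[0]:
                b[i + 1] = b[i]
                i -= 1
            b[i + 1] = cur
        out.extend(p for _, p in b)
    return out

pts = [(0.5, 0.5), (0.1, 0.0), (0.0, -0.9), (-0.3, 0.4), (1.0, 0.0)]
print(sort_points_by_distance(pts))
```

Comparing squared distances gives the same order as comparing distances and avoids computing square roots.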

 

