Comprehensible Data Structures, C Language Version (22): Decision Trees for Sorting and Bucket Sort

Source: Internet
Author: User

In (17) we made a simple analysis of sorting algorithms and came to two conclusions:

1. A sorting algorithm that only exchanges adjacent elements takes Ω(N^2) time on average.

2. To do better than O(N^2), an algorithm must exchange elements that are far apart.

  

Today we analyze sorting algorithms further. This time the analysis covers algorithms that "sort by comparison", which includes every sorting algorithm we have discussed so far. "Sorting by comparison" means the algorithm orders the elements by comparing them with one another. Insertion sort, for example, compares an element with the one before it and swaps the two if they are out of order. Quicksort compares each element with the pivot, putting the elements smaller than the pivot in one group and the elements larger than the pivot in another. Both rely on comparisons to do the sorting.

To analyze the algorithms for sorting using comparisons, we first introduce a concept: decision trees.

A decision tree is a binary tree of the following kind: the root node represents "all possible orderings of the elements", each edge represents "one possible outcome of a comparison", and the child node at the end of an edge represents "the orderings of the parent node that are still possible after the comparison on that edge". Explanations like this are hard to follow, but many things become clearer with a picture:

  

This is the decision tree for sorting three elements a, b, and c. The root node represents all possible orderings. The two edges leaving the root are two "decisions", that is, the two outcomes of one comparison, and each child holds the orderings that remain possible after that decision. For example, the left child of the root holds the orderings that are still possible after the decision "a<b". A leaf represents exactly one possible ordering.
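As a rough illustration of walking such a tree, the sketch below sorts three values with nested `if` statements, where each `if` corresponds to one edge. The function name `sort3` and the comparison counter are assumptions made for this example; they do not come from the original post.

```c
/* Sort three values by walking a decision tree: each `if` is one edge.
   Returns the number of comparisons made, i.e. the depth of the leaf reached.
   (sort3 is a hypothetical helper written for this illustration.) */
int sort3(int *a, int *b, int *c)
{
    int x = *a, y = *b, z = *c;
    int comparisons = 0;

    ++comparisons;
    if (x < y) {                 /* still possible: x<y<z, x<z<y, z<x<y */
        ++comparisons;
        if (y < z) {
            *a = x; *b = y; *c = z;          /* leaf: x < y < z */
        } else {
            ++comparisons;
            if (x < z) { *a = x; *b = z; *c = y; }   /* leaf: x < z <= y */
            else       { *a = z; *b = x; *c = y; }   /* leaf: z <= x < y */
        }
    } else {                     /* still possible: y<=x<z, y<z<=x, z<=y<=x */
        ++comparisons;
        if (x < z) {
            *a = y; *b = x; *c = z;          /* leaf: y <= x < z */
        } else {
            ++comparisons;
            if (y < z) { *a = y; *b = z; *c = x; }   /* leaf: y < z <= x */
            else       { *a = z; *b = y; *c = x; }   /* leaf: z <= y <= x */
        }
    }
    return comparisons;
}
```

Calling `sort3(&a, &b, &c)` leaves the three variables in ascending order. The returned count is 2 or 3 depending on which leaf is reached, so no input needs more than 3 comparisons.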

Note that a decision tree is not itself a sorting algorithm; no sorting algorithm is implemented by building this tree. But the decision tree does tell us one thing: running a comparison-based sorting algorithm on an input amounts to following a path of the decision tree from the root to a leaf.

Therefore, analyzing the number of edges on this path is the same as analyzing how many comparisons a comparison-based sorting algorithm needs, and the longest path gives the worst case. This is also the difference between this analysis and the one in (17). In (17) we counted the number of exchanges; this time we count the number of comparisons. The comparison count is clearly the more fundamental one, because comparisons are always needed whether or not the algorithm exchanges distant elements.

To bound the number of comparisons needed by comparison-based sorting algorithms, we need the following theorems.

Theorem 1: A binary tree of depth d has at most 2^d leaves.

Proof: The depth of a binary tree is the depth of its deepest leaf. If a tree of depth d has a leaf whose depth is less than d, we can add two children under that leaf without changing the depth of the tree, and the number of leaves grows by one. So the binary tree of depth d with the most leaves is the full binary tree, in which every leaf is at depth d. The level at depth d holds at most 2^d nodes, so there are at most 2^d leaves.

Theorem 2: A binary tree with y leaves has depth at least ⌈log y⌉ (logarithms are base 2 by default, and ⌈x⌉ means x rounded up).

Proof: This follows directly from Theorem 1.

If the proof is hard to see, an analogy helps. If 1 yuan buys at most 5 candies, what is the least amount of money needed to buy 5 candies? The answer is 1 yuan; the second question is just the inverse of the first relationship. In the same way, a binary tree of depth x has at most y = 2^x leaves, so a binary tree with y leaves has depth at least x = log y. Since the depth is an integer, it is at least ⌈log y⌉.
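Theorem 2 can be checked numerically with a few lines of C. The sketch below finds the smallest depth d with 2^d >= y by repeated doubling, which is ⌈log y⌉ without any floating-point arithmetic. The name `min_depth_for_leaves` is made up for this illustration, and the function assumes 1 <= y <= 2^63 so that the doubling cannot overflow.

```c
/* Smallest depth d such that a binary tree of depth d can hold y leaves,
   i.e. the smallest d with 2^d >= y (Theorem 1 read in reverse).
   Assumption: 1 <= y <= 2^63, so `capacity` never overflows. */
unsigned int min_depth_for_leaves(unsigned long long y)
{
    unsigned int d = 0;
    unsigned long long capacity = 1;   /* 2^d: the most leaves depth d allows */

    while (capacity < y) {
        capacity *= 2;
        ++d;
    }
    return d;
}
```

For instance, 5 leaves need a depth of at least 3, because a tree of depth 2 has room for only 4 leaves.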

Theorem 3: The decision tree for sorting n elements has n! leaves.

Proof: n elements can be arranged in n! possible orders, and each leaf of the decision tree is a "single remaining possibility", that is, one possible order. So the decision tree for sorting n elements has n! leaves.

Theorem 4: A sorting algorithm based on element comparisons needs at least ⌈log(n!)⌉ comparisons in the worst case.

Proof: By Theorem 2, a decision tree with y leaves has depth at least ⌈log y⌉. By Theorem 3, the decision tree for sorting n elements has n! leaves, so its depth is at least ⌈log(n!)⌉. In other words, at least one leaf has depth ⌈log(n!)⌉ or more. The depth of a leaf is the number of edges on the path from the root to that leaf, which is the number of comparisons made, so Theorem 4 holds.
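To get a feel for the size of this bound, the sketch below computes ⌈log(n!)⌉ for small n using integer arithmetic only. The function name `min_comparisons` is an assumption for this example, and n is assumed to be at most 20 so that n! fits in a 64-bit unsigned integer.

```c
/* Lower bound from Theorem 4: ceil(log2(n!)) comparisons in the worst case.
   Assumption: n <= 20, so that n! fits in an unsigned 64-bit integer. */
unsigned int min_comparisons(unsigned int n)
{
    unsigned long long leaves = 1;       /* n! leaves, as in Theorem 3 */
    for (unsigned int k = 2; k <= n; ++k)
        leaves *= k;

    unsigned int depth = 0;              /* smallest depth with 2^depth >= n! */
    unsigned long long capacity = 1;
    while (capacity < leaves) {
        capacity *= 2;
        ++depth;
    }
    return depth;
}
```

For n = 3 the result is 3, since 3! = 6 leaves do not fit in a tree of depth 2. For n = 10 the result is 22.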

Theorem 5: A sorting algorithm based on element comparisons needs Ω(n log n) comparisons.

Proof: Continue the calculation from Theorem 4:

log(n!) = log(n * (n-1) * (n-2) * ... * 2 * 1)
        = log n + log(n-1) + log(n-2) + ... + log 2 + log 1
       >= log n + log(n-1) + ... + log(n/2)
       >= (n/2) * log(n/2)
        = (n/2) * log n + (n/2) * log(1/2)
        = (n/2) * log n - n/2
        = Ω(n log n)

  

Theorem 5 is the final result of our analysis, and the argument behind it generalizes: if there are X possibilities and the only way to find out which one holds is to keep asking yes-or-no questions, then in the worst case at least ⌈log X⌉ questions are needed.
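A number-guessing game makes this generalization concrete. The sketch below finds a hidden number among x candidates by asking only "is it less than mid?" and counts the questions. The function name `questions_needed` and its parameters are invented for this illustration. Halving the candidates each time is one strategy that reaches the ⌈log X⌉ bound in the worst case.

```c
/* Find a hidden number in [0, x) using only yes/no questions of the form
   "is the number less than mid?". Returns how many questions were asked.
   Assumptions: x >= 1 and secret < x. */
unsigned int questions_needed(unsigned int x, unsigned int secret)
{
    unsigned int lo = 0, hi = x;     /* remaining candidates: lo .. hi-1 */
    unsigned int asked = 0;

    while (hi - lo > 1) {            /* more than one possibility left */
        unsigned int mid = lo + (hi - lo) / 2;
        ++asked;
        if (secret < mid)
            hi = mid;                /* answer "yes": keep the lower half */
        else
            lo = mid;                /* answer "no": keep the upper half */
    }
    return asked;
}
```

With 8 candidates every secret takes exactly 3 questions. With 6 candidates some secrets take 2 questions and others take 3, and 3 is ⌈log 6⌉.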

So, according to Theorem 5, do heap sort, merge sort, and quicksort already represent the fastest possible sorting? No, because Theorem 5 has a precondition: it applies only to algorithms that sort by comparison. An algorithm that sorts without comparing elements can break through this bound.

What does sorting without comparison look like? A simple example is bucket sort. For N elements whose values are all below a fixed bound, its time complexity is O(N).

The idea behind bucket sort is common in everyday life. For example:

Suppose we have a large pile of coins in six denominations (1 fen, 2 fen, 5 fen, 1 jiao, 5 jiao, and 1 yuan) and we want to order them from smallest to largest. What do we do? Simulating any sorting algorithm by hand would work, but nobody would be that foolish. Most people would prepare 6 "buckets", one for each denomination, throw each 1-fen coin into the 1-fen bucket, each 1-yuan coin into the 1-yuan bucket, and so on. Once all the coins are in the buckets, pouring the buckets out in order finishes the sort.

Translated to a computer, the idea is this: assume the elements are natural numbers and are all less than MAX. We only need to prepare MAX empty buckets, that is, define an array bucket[MAX] and initialize every entry to 0. Then we go through all the elements, and whenever an element has the value i we add 1 to bucket[i]. Finally, walking through the array bucket in order gives the sorted sequence:

    // size is the length of the array src, i.e. the number of elements
    void BucketSort(unsigned int *src, unsigned int size)
    {
        // MAX is a macro: every element of src must be less than MAX
        unsigned int bucket[MAX] = {0};

        // "throw" each element into its bucket
        for (unsigned int i = 0; i < size; ++i)
            ++bucket[src[i]];

        // "pour" the elements out of the buckets in order:
        // bucket i holds bucket[i] copies of the value i
        unsigned int j = 0;
        for (unsigned int i = 0; i < MAX; ++i)
            for (unsigned int x = 0; x < bucket[i]; ++x)
                src[j++] = i;
    }

The limitations of bucket sort are clear: the elements must be natural numbers, they must have an upper bound, and that bound must not be too large. The upper bound determines the number of buckets, and we cannot have as many buckets as we like. My computer, for example, cannot allocate an array of size INT_MAX.
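One way to soften the fixed-bound limitation is to size the buckets from the data and allocate them on the heap. The sketch below is a variation on the idea, not the author's code: the name `bucket_sort_heap`, the extra pass that finds the largest element, and the error return value are all assumptions made for this example. The memory limit itself does not go away, which is why the allocation is checked.

```c
#include <stdlib.h>

/* Bucket sort with heap-allocated buckets sized from the largest element.
   Returns 0 on success, -1 if the buckets cannot be allocated.
   (Hypothetical variant written for illustration.) */
int bucket_sort_heap(unsigned int *src, size_t size)
{
    if (size == 0)
        return 0;

    /* The largest element decides how many buckets are needed. */
    unsigned int max = src[0];
    for (size_t i = 1; i < size; ++i)
        if (src[i] > max)
            max = src[i];

    size_t buckets = (size_t)max + 1;
    if (buckets == 0)                /* max + 1 wrapped around in size_t */
        return -1;

    size_t *count = calloc(buckets, sizeof *count);   /* all zero */
    if (count == NULL)
        return -1;                   /* too many buckets for this machine */

    /* "Throw" each element into its bucket. */
    for (size_t i = 0; i < size; ++i)
        ++count[src[i]];

    /* "Pour" the buckets out in order: value v appears count[v] times. */
    size_t j = 0;
    for (size_t v = 0; v < buckets; ++v)
        for (size_t x = 0; x < count[v]; ++x)
            src[j++] = (unsigned int)v;

    free(count);
    return 0;
}
```

The running time is proportional to the number of elements plus the largest value, so the approach still only pays off when the values are small compared with the amount of data.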

Bucket sort also has a variant that needs only 10 buckets. If you are interested, search for "bucket sort" or "radix sort"; it is not covered here.

This is the last post on sorting. In the next post I will introduce graph algorithms. They are not difficult, or at least not difficult to understand.
