Redis Research-3.3 data structure tree and find, sort, etc. (Follow-up)

Source: Internet
Author: User

3. Conversion between trees, binary trees, and forests

As we said before, the nodes in the binary tree can be represented as a data structure with left child, right child, parent, and self data domains, so can this be said for the general tree or the nodes in the forest? The answer is yes, it is represented as a binary tree node, and we can use some of the features and algorithms of the two-fork tree very well.

In the binary tree, left indicates that the child of the node, right is the child of the node, then, for the general tree node, if there is a child, the first child is the corresponding to the Ieft area, if there is a second, third child, etc., use right to form a linked list, then, This tree is converted to a two-tree, but the two-pointer field here is not the same. In fact, we can improve on the node to get the following representation:

The modified generic tree node represents the typedef struct gtbitreenode{    struct gtbitreenode *left;    struct Gtbitreenode *right;    void *data;    struct Gtbitreenode *next;//sibling node}gtbitreenode,*gtbitreenode;



3.1 Tree converted to two-fork tree

The above has been transformed the representation of tree nodes, then the general tree how to convert to our common two-fork tree it? Only three steps are required:

1). Add line. Add a connecting line between all the sibling nodes;

2). Go to the line. Each node in the logarithm retains only his connection to the first child node and removes the connection between his and other child nodes.

3). Level adjustment. Take the root of the tree as the axis, the whole tree clockwise rotation of a certain angle, so that the level is clear. It is important to note that the first child is the left child of the two Fork tree node, and the child that the brother converts is the right child of the node.

Let's use a diagram to show

We focus on how to adjust to the final level, we should first understand the principle of the third step adjustment:

The first child is the left child of the node, then B is of course A's left child. According to the second principle, the boy converted by the brother is the right child of the node, because C is the brother of B, so, after conversion, it becomes the right child of B. Similarly, since E is a child of B, it is certainly B's left child after conversion. Similarly, according to the second principle, F is the brother of E, so after the conversion, F becomes the right child of E, G was formerly the brother of F, and now becomes F's right child. Similarly, the first child of C in our previous tree is H, so now H is of course C's left child, likewise, because D is the brother of C, so now it is the right child of C. I was the first child of D in the previous tree, so it is now the left child of D, and because J was previously the brother of I, I now become the right child of I.

Through the above description, we should pay special attention to the second principle, that is, " Brothers and children into the node of the right child this statement ."

3.2 Forest converted to two fork tree

What is a forest? Forests, of course, are made up of a lot of trees. Well, of course we can think of each tree as a brother, so we can get the following conversion steps.

1). Convert each tree into a binary tree;

2). The first binary tree remains motionless, starting with the second binary tree, which in turn takes the root node of the latter tree as the right child of the root node of the previous binary tree and connects it with a line.

Use the diagram to illustrate:

OK, it should be clear, then, how can the binary tree be converted into a tree?

3.3 Two fork trees converted into trees

Before we have converted from the tree to a binary tree, he has to go through three steps, namely add line, go to line, adjust the level, then, the binary tree into a tree, that is, the process of a reverse process, how to do?

Still is

1). Add line. If the left child of the node is present, then the right child node of the left child, right child's right child node 、。。。 As the child node of this node, connect the node to these right child nodes.

2). Go to the line. Removes the connection to the right child node of all nodes in the original binary tree.

3). Level adjustment.

There is a picture of the truth.

So easy, isn't it?

3.4 Two fork trees converted into forests?

Whether a tree can be converted into a forest, judging by the standard is very simple, is to see the root node of the binary tree has no right child node, if there is, it can be converted. The conversion steps are:

1). From the root node, if the right child exists, then the right child and the single connection line is deleted, after separation, continue to iterate.

2). Convert each separated two-fork tree to a tree.

Guess a fool can see it too??? O (∩_∩) o haha ~

3.5 Huffman Code

I do not know when everyone in college has studied operations research, operations research is an important branch of dynamic planning (haha, in the next undergraduate is the mathematics department of Kazakhstan, then the operations research test of 67 points, low points drifting, but the highest score is 72). In some algorithms to solve optimization problems, each step is faced with a variety of options, dynamic programming is the killer algorithm of this problem, but sometimes it seems a bit cumbersome, so, at this time, we need a simpler, more efficient algorithm, greedy algorithm is such an algorithm, The core of the greedy algorithm is to make the best choice at that time at each step, or the local optimal choice, which comes to the last global optimal solution. Of course, this is only a hope, so the greedy algorithm does not guarantee that an optimal solution can be obtained. We will first learn a greedy algorithm-Huffman coding.

Before you say this, look at our real-life examples (This example comes from the big story data structure, please refer to). In other words, the teacher in the evaluation of the students "failed", "Pass", "Medium", "good", "excellent", is based on the student's score section to carry out, usually, we use the following structure to judge:

int degree (int score) {  if (score<60) {    printf ("%s", "failed");  } else if (score<70) {    printf ("%s", "Pass");  } else if (score<80) {    printf ("%s", "Medium");  } else if (score<90) {    printf ("%s", "good");  } else{    printf ("%s", "excellent");}  }



The resulting graphical structure is:

When we see that in the actual learning life, the student's grade ratio is shown below, we will feel that this algorithm is very problematic


Scores 0-59 60-69 70-79 80-89 90-100
Proportion 5% 15% 60g 30% 10%
, to see more than 70 of the student data, at least 3 of this comparison to make judgments, then, how to improve it?
int degree (int score) {  if (score<80) {if (score<70) {if (        score<60) {           printf ("%s", "failed");        } else{          printf ("%s", "Pass");        }    } else {        printf ("%s", "Medium");    }  } else if (score<90) {      printf ("%s", "good");  } else {    printf ("%s", "excellent");}  }



After this improvement, the score between 70-79 can be judged by up to two times, is it more optimized? The binary tree is represented as follows:

If, now there are 1000 students, then no improvement, the number of judgements required is 3,150 times, and the improvement, the number of times required to use is 2,200 times, the effect is obvious, especially when the volume of data is large.

To clarify what follows, there are several concepts that need to be clarified:

1). A branch from one node in the tree to another node makes up a path between two nodes, and the branching data on the path is called the path length (the path that goes through).

2). The path length of a tree is the sum of the path length from the root to each node. The path to tree A is: 1+1+2+2+3+3+4+4=20.

3). The weighted path length of a node is the product of the length of the path from the node to the root of the tree and the right on the node. The pass-through path in tree A is 15*2=30;

4). The weighted path path of the tree is the sum of the weighted path lengths of all the leaf nodes in the tree. The belt-weighted path of tree A is: 5*1+15*2+40*3+30*4+10*4=315;

5). With weighted path length WPL the smallest two-tree is Huffman tree.

So, how to build Huffman tree? Follow these steps

1). First, the leaf nodes with full-time are arranged into an orderly sequence in order from small to large.

2). Select the smaller two from this ordered sequence to construct a new two-fork tree, the smaller weight of the node as the new binary tree of the left child, the larger as the right child, the weight of the root node of the new two fork tree is the sum of the weights of the two children.

3). Remove the nodes of the two smaller weights that have been selected from the sequence and bring the root node of the new binary tree constructed in step 2 to the sequence.

4). Repeat steps 2, 3 to get the final Huffman tree.

Let's see how to construct a Huffman tree from tree B:

New Sort : N=15,b,d,c

Re-order : N=30,d,c

Re-order : c,n=60

The belt-weighted path of the tree after construction is =40*1+30*2+15*3+10*4+5*4=205;

And the original tree with the right path is: 5*3+15*3+40*2+30*2+10*2=220;

Let's figure out how many steps to take 3,050 times, rather than the Huffman tree algorithm is inefficient? So does this show the Huffman tree useless? No.

Let's look at it again:

Suppose I want to give you remote send a "badcadfeed" of the content, network transmission, is generally used in binary to represent, in this text appeared in the a,b,c,d,e,f these 6 characters, assuming that we have three-bit binary instead of a letter, you can get the following table:


A B C D E F
000 001 010 011 100 101

So, the code for this piece of content is: 001000011010000011101100100011, altogether 30 bits. But we found that in this section, the frequency of each letter appears different, and they appear in the frequency of:


A B C D E F
0.2 0.1 0.1 0.3 0.2 0.1

In other words, we can use Haffman to build a

We will change the weights to 0 and the right branch to 1:

So, the corresponding letter encoding mapping table is obtained:


A B C D E F
110 11110 11111 0 10 1110

Then, after using this encoding, the sending content should be: 111101100111111100111010100, altogether 27 bits.

In other words, we have improved, saving 10 of storage or transmission costs.

In fact, we see from the above, this is actually a variable length coding, the core idea is to use the high-frequency characters with the shortest code word to express, low-frequency with long code word to express.

In fact we can also get, we construct this binary tree, is a full two fork tree.

The correctness of Huffman coding, please refer to the translation of Yingjianping, Xu and other people, "Introduction to the algorithm-the third edition of the original book," Page No. 248.

Binary search tree: is on the basis of a binary tree to meet a feature of the two-tree is called the two-fork search tree, which is the tree of any node x, the left subtree of the key word cannot exceed the X keyword, the right subtree of the keyword can not be less than the keyword x.

Red-black Tree: A color attribute is added to each node of the binary search tree, which has two black and red values. The following features are also satisfied:

1). Each node is either red or black.

2). The root node is black.

3). Each leaf node is black;

4). If a node is red, then his two child nodes are black;

5). For each node, the same number of black nodes are included on the simple path from the node to all of its descendant leaf nodes.

Encountered other trees, and then to write, basically are on the basis of a binary tree added some restrictions

In the next section, you will continue to say sort, find

Welcome to shoot Bricks, but also can add qq:359311095 discussion

Redis Research-3.3 data structure tree and find, sort, etc. (Follow-up)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.