Data Structures Made Easy, C Language Edition (12): From Binary Search to Binary Trees

Source: Internet
Author: User

Many books and articles on data structures and algorithms start by defining what a tree is, then what a binary tree is, and go straight into the topic. Today I will not follow that routine. I believe every technique exists for a reason: either to solve a problem or to solve it more efficiently. New material is easier to accept and understand if you first see the problem and the shortcomings of the existing solutions.

Fortunately, binary trees can be introduced in exactly that order. So before discussing binary trees, consider one scenario and one operation: suppose we have an array whose data is sorted by some key, and we want to determine whether a given value is already in the array. How can we do that quickly? (For convenience, assume the array is int a[n] and the data are integers.)

The simplest approach is to walk from a[0] to a[n-1], comparing each element with the target. In the worst case (the target is not in the array, or it is a[n-1]) we need n comparisons before the search ends.
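The sequential approach can be sketched as follows. This is a minimal illustration, not the author's code: the name LinearSearch and the optional comparisons counter are assumptions added here so that the worst case of n comparisons can be observed directly.

```c
#include <stddef.h>

/*
 * Sequential search over a[0..n-1].
 * Returns the index of target, or -1 if it is absent.
 * If comparisons is not NULL, it receives the number of elements examined.
 * (LinearSearch and the comparisons parameter are illustrative names.)
 */
int LinearSearch(const int *a, int n, int target, int *comparisons)
{
    int count = 0;
    int found = -1;

    for (int i = 0; i < n; i++) {
        count++;                 /* one comparison per element visited */
        if (a[i] == target) {
            found = i;
            break;
        }
    }
    if (comparisons != NULL)
        *comparisons = count;
    return found;
}
```

Searching a 5-element array for its last element, or for a value that is not present, examines all 5 elements in both cases.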

Is there a faster algorithm? Yes: binary search. In a sorted table, binary search compares the target with the middle element of the current table and, depending on the result, discards one of the two halves that the middle element separates.

That description may be hard to follow (I find it hard to follow myself ╮(╯_╰)╭), so here is an everyday example. Suppose the price of an item is an integer between 1 and 100 yuan, and you must guess it in no more than 10 tries, being told after each guess whether it was too high or too low. How should you guess? The clumsiest way mirrors sequential comparison: start at 1 and go up one at a time. In the worst case that takes 100 guesses, and there is only a 10% chance of succeeding within 10. But we all know a method that always succeeds within 10: guess 50 first; if that is too high, guess 25; if it is too low, guess 75. Each guess is the "middle element", and each answer discards half of the table. For example, if 50 is too high, we discard everything from 50 to 100. With this halving strategy the price is guaranteed to be found in at most 7 guesses.
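To check the "at most 7 guesses" claim, here is a small sketch of the guessing game. The function name CountGuesses is an assumption made for illustration; it simply plays the halving strategy against a known price and counts the guesses.

```c
/*
 * Plays the halving strategy for a price known to lie in [low, high]
 * and returns the number of guesses needed, or -1 if the price is
 * outside that range. (CountGuesses is an illustrative name.)
 */
int CountGuesses(int low, int high, int price)
{
    int guesses = 0;

    while (low <= high) {
        int guess = low + (high - low) / 2;   /* the "middle element" */
        guesses++;
        if (guess == price)
            return guesses;
        if (guess < price)
            low = guess + 1;    /* too low: discard guess and everything below */
        else
            high = guess - 1;   /* too high: discard guess and everything above */
    }
    return -1;
}
```

With the range 1 to 100 the first guess is 50, and no price in the range needs more than 7 guesses; a price of 100 is one of the cases that needs all 7.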

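Applied to a sorted array, the same idea looks like this in C:

    // Returns the index of target in the sorted array a[0..n-1], or -1 if absent
    int BinarySearch(int *a, int n, int target)
    {
        int left = 0;
        int right = n - 1;

        while (left <= right) {
            // Middle of the current sub-table (written this way to avoid overflow)
            int middle = left + (right - left) / 2;
            if (a[middle] == target)
                return middle;
            if (a[middle] < target)
                left = middle + 1;    // discard the left half and the middle
            else
                right = middle - 1;   // discard the right half and the middle
        }
        return -1;
    }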

As a digression, let's work out the time complexity of binary search.

For binary search we know two things:

1. Each step halves the size of the table, that is, divides it by 2.
2. In the worst case we keep dividing by 2 until only one element is left.

Putting these together, the cost of binary search is determined by the number of comparisons, and in the worst case that number is how many times N must be divided by 2 to reach 1. So the time complexity of binary search is O(log2 N), which can also be written O(log N) with the base omitted.
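The "divide by 2 until 1 is left" count can be computed directly. The helper below is a sketch with an assumed name, HalvingSteps; it returns the number of halvings, which is log2 N rounded down.

```c
/*
 * Number of times n can be halved (integer division) before reaching 1.
 * For n >= 1 this equals log2(n) rounded down; for n == 0 it returns 0.
 * (HalvingSteps is an illustrative name.)
 */
int HalvingSteps(unsigned int n)
{
    int steps = 0;

    while (n > 1) {
        n /= 2;
        steps++;
    }
    return steps;
}
```

For a table of 100 elements this gives 6 halvings (100, 50, 25, 12, 6, 3, 1), so together with the comparison on the final one-element table the worst case is 7 comparisons, matching the price-guessing example.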

That concludes the introduction to binary search. Now a new question: what if the table must grow and shrink dynamically? The obvious answer is a doubly linked list (doubly linked because the search sometimes has to move backward). In fact a linked list, doubly linked or not, is not suited to binary search at all. Why? In an array we can reach the middle element directly by its index when we want to compare it with the target. In a list, to compare the 50th node with the target we must first pass through the 49 nodes before it. The comparison itself is still a single operation, but the number of steps like middle = middle->next depends on where the "middle node" is. In the worst case (say the target is at the end of the table) we traverse the whole list, executing middle = middle->next about n times, so the algorithm is still O(n) overall.
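The cost of reaching a node by position can be made concrete with a short sketch. The ListNode type and the NodeAt function are assumptions for illustration (a singly linked list is used for brevity; a doubly linked list has the same cost per hop).

```c
#include <stddef.h>

/* A minimal list node; illustrative, not taken from the article. */
struct ListNode {
    int data;
    struct ListNode *next;
};

/*
 * Walks from head to the node at 0-based position index and stores the
 * number of p = p->next steps taken in *hops.
 * Returns NULL if the list is shorter than index + 1 nodes.
 */
struct ListNode *NodeAt(struct ListNode *head, int index, int *hops)
{
    struct ListNode *p = head;

    *hops = 0;
    while (p != NULL && *hops < index) {
        p = p->next;     /* one hop per node passed */
        (*hops)++;
    }
    return p;
}
```

Reaching position k always costs k hops, whereas a[k] in an array costs a single index operation. That difference is what makes binary search on a list O(n).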

So is there a data structure that supports both binary search and dynamic resizing? Fortunately there is: the binary tree.

When we studied general trees, we placed no limit on how many children a node may have. A binary tree is a tree that does limit it: each node has at most two children, which is where the name "binary" comes from. (Figure: an example binary tree.)

        

Since the maximum number of children per node is fixed, we can change the node definition to the following:

    typedef struct BinaryTreeNode {
        int data;
        int frequency;                  // +1 when inserted data equals this node's data, -1 when this node is deleted
        struct BinaryTreeNode *left;    // points to the left child
        struct BinaryTreeNode *right;   // points to the right child
    } *BinaryTree;

A binary tree obviously supports dynamic resizing (although deleting from it is not the same as in the file-system tree we simulated earlier). What we need to understand is why it can support binary search. As noted above, a linked list "does not support" binary search because reaching a middle node means passing through the nodes before (or after) it. How does a binary tree avoid this? It is quite simple; let's go step by step.

A little analysis shows that the basic requirement of binary search is that the data be ordered. For a binary tree to support binary search, its data must be "ordered" too. What does that mean in a binary tree? Simply that for every node, all nodes in its left subtree are smaller than it and all nodes in its right subtree are larger. A binary tree arranged this way is what we call a binary search tree. For the 100 values in the example above, suppose we have such a binary search tree. (Figure: a binary search tree holding the 100 values.)
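The ordering rule can be expressed as a checker. The following is a sketch under stated assumptions: the node layout repeats the fields used in this article, while IsSearchTree and its helper InRange are hypothetical names introduced here. Each subtree is checked against the open interval of values it is allowed to contain.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

struct BinaryTreeNode {
    int data;
    int frequency;
    struct BinaryTreeNode *left;
    struct BinaryTreeNode *right;
};

/*
 * True if every value in the subtree lies strictly between low and high.
 * The bounds are long long so that INT_MIN and INT_MAX are valid data.
 */
static bool InRange(const struct BinaryTreeNode *t, long long low, long long high)
{
    if (t == NULL)
        return true;
    if (t->data <= low || t->data >= high)
        return false;
    /* Left subtree must stay below t->data, right subtree above it. */
    return InRange(t->left, low, t->data) && InRange(t->right, t->data, high);
}

/* True if the whole tree satisfies the binary search tree ordering rule. */
bool IsSearchTree(const struct BinaryTreeNode *root)
{
    return InRange(root, (long long)INT_MIN - 1, (long long)INT_MAX + 1);
}
```

Note that comparing each node only with its own two children is not enough: a value of 60 placed as the right child of 25 inside the left subtree of 50 passes the local test but breaks the rule for the root.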

Looking at the tree from the figure, we find that binary search applies to it "perfectly". When we enter the tree with the target, the first node we visit is the middle element of the original table. If the target is larger than that node, we continue in its right subtree (whose root is exactly the middle element of the remaining sub-table); otherwise we continue in its left subtree. Doing this at every node, we eventually either find the target or conclude that it is absent, because we reach a node that is not the target and has no child in the required direction.

The search code for a binary search tree is therefore:

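    #include <stdbool.h>

    // Returns true if data is in the tree t with a positive frequency
    bool SearchByTree(BinaryTree t, int data)
    {
        if (t == NULL)
            return false;
        if (t->data == data && t->frequency > 0)
            return true;
        if (data < t->data)
            return SearchByTree(t->left, data);
        if (data > t->data)
            return SearchByTree(t->right, data);
        // The node exists but its frequency is 0 (lazily deleted)
        return false;
    }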
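Because each recursive call is the last thing the search does, the same lookup can also be written as a loop. This is an alternative sketch, not the author's version; the name SearchIterative is an assumption, and the node layout repeats the fields used in this article.

```c
#include <stdbool.h>
#include <stddef.h>

struct BinaryTreeNode {
    int data;
    int frequency;
    struct BinaryTreeNode *left;
    struct BinaryTreeNode *right;
};

/*
 * Loop-based lookup: at each node, move left or right, discarding the
 * other subtree, until the data is found or a missing child is reached.
 * A node whose frequency is 0 counts as absent.
 */
bool SearchIterative(const struct BinaryTreeNode *t, int data)
{
    while (t != NULL) {
        if (data < t->data)
            t = t->left;
        else if (data > t->data)
            t = t->right;
        else
            return t->frequency > 0;
    }
    return false;
}
```

The loop makes the parallel with binary search on an array visible: moving to t->left or t->right plays the role of updating right or left.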

To preserve the binary search tree property, insertion should be written as follows:

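    #include <stdlib.h>

    // Inserts data into the tree t and returns the (possibly new) root of t
    BinaryTree Insert(BinaryTree t, int data)
    {
        if (t == NULL) {
            // Empty position reached: create the new node here
            BinaryTree temp = (BinaryTree)malloc(sizeof(struct BinaryTreeNode));
            if (temp == NULL)
                return NULL;    // allocation failed, tree is left unchanged
            temp->data = data;
            temp->frequency = 1;
            temp->left = temp->right = NULL;
            return temp;
        }
        if (data < t->data)
            t->left = Insert(t->left, data);
        else if (data > t->data)
            t->right = Insert(t->right, data);
        else
            t->frequency++;     // data already present: just count it
        return t;
    }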

Now that we can insert, how do we build the perfect binary search tree assumed above? Unfortunately, to get that perfect tree (one that mirrors binary search exactly) you must first sort the data in an array, then insert the elements one by one in the order binary search would visit them, and after that you can never insert a new node, because doing so would break the perfect shape. But if we do all that, why do we need a binary search tree at all? So in practice we usually insert the data directly into the tree in whatever order it arrives, and the resulting binary search tree may well end up badly unbalanced between left and right. (Figure: an unbalanced binary search tree.)
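The "perfect" construction can be sketched as below. This is an illustration under assumptions, not the author's program: BuildBalanced, Height and FreeTree are hypothetical names, and instead of calling Insert() in binary-search order the sketch links the nodes directly, which produces the same shape.

```c
#include <stdlib.h>

struct BinaryTreeNode {
    int data;
    int frequency;
    struct BinaryTreeNode *left;
    struct BinaryTreeNode *right;
};

/*
 * Builds a search tree from the sorted range a[lo..hi]: the middle element
 * becomes the root, and each half is built the same way.
 * If malloc fails, the affected subtree is simply left out.
 */
struct BinaryTreeNode *BuildBalanced(const int *a, int lo, int hi)
{
    if (lo > hi)
        return NULL;

    int mid = lo + (hi - lo) / 2;
    struct BinaryTreeNode *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->data = a[mid];
    t->frequency = 1;
    t->left = BuildBalanced(a, lo, mid - 1);
    t->right = BuildBalanced(a, mid + 1, hi);
    return t;
}

/* Number of nodes on the longest path from t down to a leaf. */
int Height(const struct BinaryTreeNode *t)
{
    if (t == NULL)
        return 0;
    int l = Height(t->left);
    int r = Height(t->right);
    return 1 + (l > r ? l : r);
}

/* Releases every node of the tree. */
void FreeTree(struct BinaryTreeNode *t)
{
    if (t == NULL)
        return;
    FreeTree(t->left);
    FreeTree(t->right);
    free(t);
}
```

For the sorted values 1 to 100 the root holds 50 and the height is 7, the same as the worst-case number of guesses in the price example. Inserting the same values in ascending order with an ordinary insert would instead produce a chain of height 100.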

An unbalanced tree like that no longer matches binary search perfectly, but the binary search tree is still a very good structure: it offers fast lookup together with dynamic resizing. When the data arrives in random order, a lookup takes O(log n) steps on average; only an unlucky order, such as already sorted input, degrades it to O(n). So the binary search tree remains a very meaningful data structure. (Binary trees have other applications too, such as Huffman coding; look them up if you are interested.)

Next, let's see how to implement deletion in a binary search tree.

When we defined the node, we reserved a frequency field. When the data being inserted already exists in the tree, we increment the frequency of the corresponding node, so frequency records how many copies of that data the tree holds. We can use the same field for deletion: deleting simply decrements frequency, and the node stays in the tree even when frequency drops to 0. This is called "lazy deletion". Its advantage is that it is simple and fast; its drawback is that deleted nodes keep occupying space.

    // Lazy deletion: decrement the frequency, never free the node
    void LazyDelete(BinaryTree t, int data)
    {
        if (t == NULL)
            return;
        if (data < t->data)
            LazyDelete(t->left, data);
        else if (data > t->data)
            LazyDelete(t->right, data);
        else if (t->frequency > 0)
            t->frequency--;
    }
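The space cost of lazy deletion can be observed by counting nodes. The helper below is a sketch with an assumed name, CountNodes; it reports how many nodes are allocated and how many of them still hold live data. The node layout repeats the fields used in this article.

```c
#include <stddef.h>

struct BinaryTreeNode {
    int data;
    int frequency;
    struct BinaryTreeNode *left;
    struct BinaryTreeNode *right;
};

/*
 * Adds the number of nodes in t to *allocated and the number of nodes
 * with frequency > 0 to *live. The caller sets both counters to 0 first.
 */
void CountNodes(const struct BinaryTreeNode *t, int *allocated, int *live)
{
    if (t == NULL)
        return;
    (*allocated)++;
    if (t->frequency > 0)
        (*live)++;
    CountNodes(t->left, allocated, live);
    CountNodes(t->right, allocated, live);
}
```

The gap between the two counters is the number of lazily deleted nodes that still occupy memory and still have to be stepped over during a search.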

Actually removing a node is a bit more trouble. First, the node to delete may be in one of three situations: it has no children, one child, or two children.

With no children, we can simply free the node.

With only one child, we let the child take the node's place and then free the node. (Figure: deleting node 4.)

  

With two children, things are more troublesome, because we must still preserve the basic property of the binary search tree. We have two options:

Replace the deleted node's data with that of the largest node in its left subtree, then delete that largest node from the left subtree;

Replace the deleted node's data with that of the smallest node in its right subtree, then delete that smallest node from the right subtree.

Suppose we choose the second option and replace the node with the smallest node of its right subtree. (Figure: deleting node 2.)
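Both replacement candidates are found by walking down one side of a subtree. The sketch below shows the two walks. The article's code uses a function named GetMinData without showing its body, so the body here is an assumption; GetMaxData is a hypothetical counterpart for the first option. The node layout repeats the fields used in this article.

```c
#include <assert.h>
#include <stddef.h>

struct BinaryTreeNode {
    int data;
    int frequency;
    struct BinaryTreeNode *left;
    struct BinaryTreeNode *right;
};

/* Largest data in a non-empty subtree: keep going right. */
int GetMaxData(const struct BinaryTreeNode *t)
{
    assert(t != NULL);   /* precondition: the subtree is not empty */
    while (t->right != NULL)
        t = t->right;
    return t->data;
}

/* Smallest data in a non-empty subtree: keep going left. */
int GetMinData(const struct BinaryTreeNode *t)
{
    assert(t != NULL);   /* precondition: the subtree is not empty */
    while (t->left != NULL)
        t = t->left;
    return t->data;
}
```

Either value keeps the ordering intact when copied into the deleted node: the largest value of the left subtree is still smaller than everything in the right subtree, and the smallest value of the right subtree is still larger than everything in the left subtree.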

  

Now that we know how to deal with the above three scenarios, we can write the code to delete the node.

    // Removes data from the tree t and returns the (possibly new) root of t
    BinaryTree DeleteNode(BinaryTree t, int data)
    {
        if (t == NULL)
            return t;
        // The data to delete is smaller than the current node: go left
        if (data < t->data)
            t->left = DeleteNode(t->left, data);
        // The data to delete is larger than the current node: go right
        else if (data > t->data)
            t->right = DeleteNode(t->right, data);
        // t is the node to delete and it has two children
        else if (t->left && t->right) {
            t->data = GetMinData(t->right);
            t->right = DeleteNode(t->right, t->data);
        }
        // Otherwise t has one child or none; both cases are handled the same way
        else {
            BinaryTree temp = t->left ? t->left : t->right;
            free(t);
            return temp;
        }
        return t;
    }

We did not write out GetMinData() as part of the DeleteNode() code. It is not hard to implement, but it deserves a mention. Why? Because inside GetMinData() we have already walked down to the smallest node of the right subtree, yet we do not delete it there; instead we call DeleteNode(t->right, t->data) afterwards to remove it. That means we walk the same path twice! The improvement is to write a dedicated DeleteMin() function that deletes the smallest node of a subtree and hands back its data at the same time.

    // Deletes the smallest node of the non-empty tree t, stores its data in
    // *minData, and returns the new root of t
    BinaryTree DeleteMin(BinaryTree t, int *minData)
    {
        // If the current node has no left child, it is the smallest node
        if (t->left == NULL) {
            *minData = t->data;
            BinaryTree temp = t->right;
            free(t);
            return temp;
        }
        // Otherwise the smallest node is in the left subtree
        else
            t->left = DeleteMin(t->left, minData);
        return t;
    }

Then change the branch of DeleteNode() that handles a node with two children to the following:

    else if (t->left && t->right) {
        int minData;
        t->right = DeleteMin(t->right, &minData);
        t->data = minData;
    }

This way we walk the path only once.

To handle the two-children case when deleting, we used a small trick (the smallest node of the right subtree) to keep the binary search tree property intact. So can some technique also keep the left and right subtrees of the whole tree balanced (that is, of nearly equal depth) through a continual stream of Insert and DeleteNode calls? The answer is yes! The next post covers the balanced binary tree, which is very well suited to situations with frequent lookups and occasional insertions and deletions.

Finally, here is a simple program that compares the lookup efficiency of an unordered array with that of a binary search tree. The difference is already obvious with 1,000 items, but only because the program pauses for 1 millisecond at every non-target node it visits; without the pause, it takes about 10 million items on my computer before the difference shows.

Https://github.com/nchuXieWei/ForBlog------Binarysearchtree

  
