Algorithm-8. Binary Search Tree

Source: Internet
Author: User

1. Basic Ideas

This article covers a symbol-table implementation that combines the flexibility of insertion in a linked list with the efficiency of search in an ordered array. Specifically, using two links per node (instead of the single link per node in a linked list) gives a binary search tree, which supports an efficient symbol-table implementation and is one of the most fundamental algorithms in computer science.

Definition: A binary search tree (BST) is a binary tree in which each node contains a comparable key (and an associated value), and the key in each node is larger than the keys in all nodes of its left subtree and smaller than the keys in all nodes of its right subtree.

2. Specific algorithms

/**
 * Algorithm 3.3: symbol table based on a binary search tree
 * Created by Huazhou on 2015/12/1.
 */
public class BST<Key extends Comparable<Key>, Value> {
    private Node root;  // root node of the binary search tree

    private class Node {
        private Key key;           // key
        private Value val;         // associated value
        private Node left, right;  // links to subtrees
        private int N;             // number of nodes in the subtree rooted at this node

        public Node(Key key, Value val, int N) {
            this.key = key;
            this.val = val;
            this.N = N;
        }
    }

    public int size() {
        return size(root);
    }

    private int size(Node x) {
        if (x == null) {
            return 0;
        } else {
            return x.N;
        }
    }

    public Value get(Key key) {
        // See cont. 1
    }

    public void put(Key key, Value val) {
        // See cont. 1
    }
}

Each node's left link points to a BST containing all keys smaller than that node's key, and its right link points to a BST containing all keys larger than it. The variable N gives the total number of nodes in the subtree rooted at the node. The private size() method above treats a null link as having size 0, which guarantees that the following formula holds for every node x in the tree.

size(x) = size(x.left) + size(x.right) + 1
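The invariant can be checked mechanically. The following is a minimal sketch, not the article's own code: the class name SizeInvariantDemo, the int keys, and the consistent() helper are assumptions made for illustration. It maintains the count on every insertion and then verifies the formula at every node.

```java
// Hypothetical demo class; names are illustrative, not from the article.
class SizeInvariantDemo {
    static final class Node {
        final int key;
        Node left, right;
        int n = 1;                      // nodes in the subtree rooted here
        Node(int key) { this.key = key; }
    }

    // A null link counts as an empty subtree of size 0.
    static int size(Node x) {
        return x == null ? 0 : x.n;
    }

    // Recursive insert that refreshes the count on the way back up.
    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        // equal keys: nothing to insert, the count stays the same
        x.n = size(x.left) + size(x.right) + 1;
        return x;
    }

    // True if size(x) = size(x.left) + size(x.right) + 1 holds at every node.
    static boolean consistent(Node x) {
        if (x == null) return true;
        return x.n == size(x.left) + size(x.right) + 1
                && consistent(x.left)
                && consistent(x.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {5, 3, 8, 1, 4, 8}) root = put(root, k);
        System.out.println(size(root));        // 5 (the duplicate 8 is not counted twice)
        System.out.println(consistent(root));  // true
    }
}
```

Because the count is recomputed from the children after each recursive call, only the nodes on the insertion path are updated.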

A BST represents a set of keys (and their associated values), and the same set can be represented by many different BSTs. If we project the keys of a BST onto a horizontal line so that every key in a node's left subtree appears to its left and every key in its right subtree appears to its right, we always get the keys in sorted order. We exploit this inherent flexibility, the fact that many different BSTs can represent the same ordered set of keys, to develop efficient algorithms for building and using BSTs.
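The projection described above corresponds to an in-order traversal (left subtree, node, right subtree). As a rough sketch, assuming int keys and a hypothetical InorderDemo class that is not part of the article, the code below builds two differently shaped BSTs from the same seven keys and shows that both yield the same sorted sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative, not from the article.
class InorderDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard recursive BST insert; duplicate keys are ignored.
    static Node insert(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = insert(x.left, key);
        else if (key > x.key) x.right = insert(x.right, key);
        return x;
    }

    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    // In-order traversal: left subtree, then the node, then right subtree.
    static void inorder(Node x, List<Integer> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.key);
        inorder(x.right, out);
    }

    static List<Integer> keys(Node root) {
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    // Number of links on the longest root-to-leaf path (-1 for an empty tree).
    static int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }

    public static void main(String[] args) {
        Node a = build(4, 2, 6, 1, 3, 5, 7);  // balanced shape
        Node b = build(1, 2, 3, 4, 5, 6, 7);  // degenerate, list-like shape
        System.out.println(keys(a) + " height " + height(a));  // [1, 2, 3, 4, 5, 6, 7] height 2
        System.out.println(keys(b) + " height " + height(b));  // [1, 2, 3, 4, 5, 6, 7] height 6
    }
}
```

The two trees hold the same ordered set of keys but differ in height, which is what makes insertion order matter for performance.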

/**
 * Cont. 1
 */
public Value get(Key key) {
    return get(root, key);
}

private Value get(Node x, Key key) {
    // Return the value associated with key in the subtree rooted at x;
    // return null if the key is not present
    if (x == null) {
        return null;
    }
    int cmp = key.compareTo(x.key);
    if (cmp < 0) {
        return get(x.left, key);
    } else if (cmp > 0) {
        return get(x.right, key);
    } else {
        return x.val;
    }
}

public void put(Key key, Value val) {
    // Search for key; update its value if found, otherwise create a new node
    root = put(root, key, val);
}

private Node put(Node x, Key key, Value val) {
    // If key exists in the subtree rooted at x, update its value;
    // otherwise insert a new node with key and val into the subtree
    if (x == null) {
        return new Node(key, val, 1);
    }
    int cmp = key.compareTo(x.key);
    if (cmp < 0) {
        x.left = put(x.left, key, val);
    } else if (cmp > 0) {
        x.right = put(x.right, key, val);
    } else {
        x.val = val;
    }
    x.N = size(x.left) + size(x.right) + 1;
    return x;
}

In general, searching for a key in a symbol table has one of two outcomes. If a node containing the key is in the table, the search is a hit and we return the associated value. Otherwise, the search is a miss (and we return null). The recursive structure of the data leads directly to a recursive search algorithm: if the tree is empty, the search is a miss; if the search key equals the key at the root, the search is a hit; otherwise we search (recursively) in the appropriate subtree, moving left if the search key is smaller and right if it is larger. The recursive get() method in cont. 1 implements this algorithm directly. Its first argument is a node (the root of a subtree) and its second argument is the search key. The code maintains the invariant that no part of the tree other than the subtree rooted at the current node can contain a node whose key equals the search key.

Just as the size of the search interval halves on each iteration of binary search, the size of the subtree rooted at the current node shrinks as we move down the tree (ideally by about half, but always by at least one node). The process stops when it finds a node containing the search key (hit) or when the current subtree becomes empty (miss). Starting at the root, the search at each node continues recursively in one of that node's children, so a search traces a path through the tree. For a hit, the path ends at the node containing the key. For a miss, the path ends at a null link.
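To make the search path concrete, here is a small sketch that records every node visited during a search. It is an iterative variant that makes the same decisions as the recursive get(); the class name SearchPathDemo, the searchPath() helper, and the sample keys are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative, not from the article.
class SearchPathDemo {
    static final class Node {
        final String key;
        Node left, right;
        Node(String key) { this.key = key; }
    }

    static Node put(Node x, String key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if (cmp < 0) x.left = put(x.left, key);
        else if (cmp > 0) x.right = put(x.right, key);
        return x;
    }

    static Node build(String... keys) {
        Node root = null;
        for (String k : keys) root = put(root, k);
        return root;
    }

    // Walks down from the root, going left or right exactly as get() would,
    // and returns the keys of the nodes visited along the way.
    static List<String> searchPath(Node root, String key) {
        List<String> visited = new ArrayList<>();
        Node x = root;
        while (x != null) {
            visited.add(x.key);
            int cmp = key.compareTo(x.key);
            if (cmp == 0) break;                  // hit: path ends at this node
            x = (cmp < 0) ? x.left : x.right;     // otherwise descend
        }
        return visited;                           // miss: loop ended at a null link
    }

    // A search is a hit exactly when the last node on the path holds the key.
    static boolean contains(Node root, String key) {
        List<String> path = searchPath(root, key);
        return !path.isEmpty() && path.get(path.size() - 1).equals(key);
    }

    public static void main(String[] args) {
        Node root = build("S", "E", "A", "R", "C", "H", "X");
        System.out.println(searchPath(root, "H"));  // hit:  [S, E, R, H]
        System.out.println(searchPath(root, "G"));  // miss: [S, E, R, H]
        System.out.println(contains(root, "H"));    // true
        System.out.println(contains(root, "G"));    // false
    }
}
```

In this example the miss for G visits the same nodes as the hit for H and then falls off the tree at the null left link of H, which is where a new node for G would be attached.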

The search code in cont. 1 is almost as simple as binary search, and this simplicity is an essential feature of BSTs. A more important feature is that insertion is not much harder to implement than search. A search for a key that is not in the tree ends at a null link, and all we need to do is replace that link with a new node containing the key. The recursive put() method in cont. 1 follows logic similar to the recursive search: if the tree is empty, it returns a new node containing the key and value; if the search key is smaller than the key at the root, it inserts into the left subtree; if it is larger, it inserts into the right subtree; and if the keys are equal, it updates the value in place.

It is worth studying a trace of our standard indexing client to see how a BST grows. Each new node is attached to a null link at the bottom of the tree, and the rest of the tree is unchanged. For example, the first key inserted becomes the root, the second becomes one of the root's two children, and so on. Because each node has two links, the tree tends to grow outward rather than only downward. Moreover, only the nodes on the search or insertion path are visited, so as the tree grows, the visited nodes make up a smaller and smaller fraction of the total number of nodes in the tree.

3. Algorithm Analysis

The running time of algorithms on BSTs depends on the shape of the tree, which in turn depends on the order in which the keys are inserted. In the best case, a tree with N nodes is perfectly balanced, with ~lg N nodes between the root and each null link. In the worst case, there can be N nodes on the search path. In typical cases, however, the shape of the tree is much closer to the best case than to the worst.
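The effect of insertion order on shape is easy to observe experimentally. The sketch below is an illustration only: the ShapeDemo class, the choice of N = 1000, and the random seed are arbitrary assumptions. It inserts the same keys once in sorted order and once in shuffled order and compares the resulting heights.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; names and parameters are illustrative.
class ShapeDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain recursive insert; N is kept small so the recursion depth stays modest.
    static Node insert(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = insert(x.left, key);
        else if (key > x.key) x.right = insert(x.right, key);
        return x;
    }

    // Number of links on the longest root-to-leaf path (-1 for an empty tree).
    static int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }

    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) keys.add(i);

        // Sorted order: every key goes to the right, giving a chain of n nodes.
        System.out.println("sorted order: height = " + heightAfterInserting(keys));  // 999

        // Shuffled order: the height depends on the shuffle but is typically far smaller.
        Collections.shuffle(keys, new Random(42));
        System.out.println("random order: height = " + heightAfterInserting(keys));
    }
}
```

For comparison, a perfectly balanced tree with 1000 nodes has height 9, so the sorted-order tree is as bad as possible while the shuffled-order tree usually lands much nearer the best case.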

For many applications the following simple model is reasonable: we assume that the keys are (uniformly) random or, equivalently, that they are inserted in random order. Under this model, BSTs and quicksort are close to twins. The node at the root of the tree corresponds to the first partitioning item in quicksort (no key to its left is larger, and no key to its right is smaller), and the same holds recursively for every subtree, which corresponds exactly to quicksort's recursive sorting of subarrays. This correspondence lets us analyze several properties of BSTs.

Proposition: In a BST built from N random keys, a search hit requires ~2 ln N (about 1.39 lg N) compares on average.

Proof: The number of compares for a search hit ending at a given node is 1 plus the depth of that node. Adding the depths of all nodes gives the internal path length of the tree. The average number of compares for a search hit is therefore 1 plus the average depth of a node, that is, 1 + C_N/N, where C_N is the average internal path length of a BST built from N distinct keys inserted in random order. We have C_0 = C_1 = 0, and for N > 1 the recursive structure of the BST directly gives the recurrence:

C_N = N - 1 + (C_0 + C_{N-1})/N + (C_1 + C_{N-2})/N + ... + (C_{N-1} + C_0)/N

The N - 1 term accounts for the root adding 1 to the path length of each of the other N - 1 nodes in the tree. The remaining terms account for the subtrees: each of the N possible splits into a left and a right subtree is equally likely, and each subtree is itself a random BST analyzed in the same way. After rearranging terms, this recurrence is nearly identical to the one obtained earlier for quicksort, so we can likewise derive C_N ~ 2N ln N.
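The recurrence can also be evaluated numerically as a sanity check. Each C_k appears twice in the sum, so it can be written as C_N = N - 1 + (2/N)(C_0 + C_1 + ... + C_{N-1}). The sketch below is illustrative only (the class and method names are assumptions): it evaluates that form with a running sum and compares the average search-hit cost 1 + C_N/N with 2 ln N.

```java
// Hypothetical demo class; names are illustrative, not from the article.
class PathLengthRecurrence {
    // Evaluates C_n = n - 1 + (2/n) * (C_0 + ... + C_{n-1}) with C_0 = 0.
    static double internalPathLength(int n) {
        double sum = 0.0;  // running total C_0 + ... + C_{k-1}
        double c = 0.0;    // current C_k
        for (int k = 1; k <= n; k++) {
            c = (k - 1) + 2.0 * sum / k;
            sum += c;
        }
        return c;
    }

    // Average number of compares for a search hit: 1 + C_n / n.
    static double averageHitCompares(int n) {
        return 1.0 + internalPathLength(n) / n;
    }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            double avg = averageHitCompares(n);
            double approx = 2.0 * Math.log(n);
            System.out.printf("N=%d  1+C_N/N=%.3f  2lnN=%.3f  ratio=%.3f%n",
                    n, avg, approx, avg / approx);
        }
    }
}
```

The ratio stays below 1 and approaches it slowly as N grows, because the ~ notation keeps only the leading term and ignores lower-order terms.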

Proposition: In a BST built from N random keys, insertions and search misses require ~2 ln N (about 1.39 lg N) compares on average.

Proof: Insertions and search misses take one more compare, on average, than search hits. This is not difficult to establish by induction.

4. Summary

The first proposition shows that searching for a random key in a BST costs about 39% more compares than binary search. The second shows that the extra cost is well worth it: inserting a new key also takes a logarithmic number of compares, a flexibility that binary search in an ordered array cannot offer, since an insertion there requires a linear number of array accesses. As with quicksort, the standard deviation of the number of compares is small, so these formulas become increasingly accurate as N grows.

