Trie: dictionary tree (word search tree)

Source: Internet
Author: User

The trie, also known as a dictionary tree or word search tree, takes its name from the middle four letters of the word "retrieval". It stores a large number of strings in a way that supports fast pattern matching, and it is mainly used in information retrieval.

There are three kinds of trie: the standard trie, the compressed trie, and the suffix trie.

1. Standard Trie

Structure of the standard trie: all strings that share a common prefix hang under the same node in the tree. In effect, a trie stores each common prefix in the string collection only once. Suppose we have the string set X = {bear, bell, bid, bull, buy, substring, stock, stop}. Its standard trie is as follows:

In the figure, the blue circular nodes are internal nodes and the red square nodes are external nodes. The trie structure built from the string set X is clearly visible: the characters along each path from the root node to a red square leaf node spell out one of the strings in X.

Note: what if one string in X is a prefix of another string? For example, suppose we add the string bi to X. The internal node i (marked by the green arrow in the figure) would then also have to be marked as a red square node, so two external nodes would appear one after another on the same branch of the tree. This is not allowed.

In other words, the string set X must not contain a string that is a prefix of another string. How can we meet this requirement? We can append a special character # to every string in X (a character that does not occur in the alphabet). The resulting set X = {bear#, bell#, ..., bi#, bid#} is then guaranteed to meet the requirement.
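The effect of the terminator can be checked directly. The following is a minimal sketch, not code from the original article: the class name PrefixFreeDemo and both helper methods are illustrative, and it assumes that # never occurs inside a word.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixFreeDemo {
    // Returns true if no string in the list is a prefix of another entry
    // (two equal entries also count as a violation).
    static boolean isPrefixFree(List<String> words) {
        for (int i = 0; i < words.size(); i++) {
            for (int j = 0; j < words.size(); j++) {
                if (i != j && words.get(j).startsWith(words.get(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    // Appends the terminator '#' (assumed to be outside the alphabet) to every word.
    static List<String> terminate(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            out.add(w + "#");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> x = List.of("bear", "bell", "bi", "bid");
        System.out.println(isPrefixFree(x));            // false: "bi" is a prefix of "bid"
        System.out.println(isPrefixFree(terminate(x))); // true: "bi#" is not a prefix of "bid#"
    }
}
```

The quadratic check is only for illustration; a trie never needs to run it, because the terminator guarantees the property by construction.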

In short, a standard trie that stores a set X of s strings with total length n, drawn from an alphabet of size d, has the following properties:

1. Each internal node in the tree has at most d child nodes.

2. The tree has s external nodes.

3. The height of the tree is equal to the length of the longest string in X.

4. The number of nodes in the tree is O(n).
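These properties can be checked on the example set X from earlier. The sketch below is an assumption-laden illustration rather than the article's code: the class TrieStats and its methods are made-up names, words are assumed to be lowercase a-z, and no # terminator is used because this particular X is already prefix-free.

```java
class TrieStats {
    static final int D = 26; // alphabet size d, assuming lowercase a-z only

    static class Node {
        Node[] children = new Node[D];
    }

    final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.children[i] == null) {
                cur.children[i] = new Node();
            }
            cur = cur.children[i];
        }
    }

    static TrieStats build(String... words) {
        TrieStats t = new TrieStats();
        for (String w : words) {
            t.insert(w);
        }
        return t;
    }

    // Counts external nodes (nodes without any child).
    static int countExternal(Node n) {
        int total = 0;
        boolean hasChild = false;
        for (Node c : n.children) {
            if (c != null) {
                hasChild = true;
                total += countExternal(c);
            }
        }
        return hasChild ? total : 1;
    }

    // Height measured in edges on the longest root-to-leaf path.
    static int height(Node n) {
        int h = 0;
        for (Node c : n.children) {
            if (c != null) {
                h = Math.max(h, 1 + height(c));
            }
        }
        return h;
    }

    // Counts all nodes, including the root.
    static int countNodes(Node n) {
        int total = 1;
        for (Node c : n.children) {
            if (c != null) {
                total += countNodes(c);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        TrieStats t = build("bear", "bell", "bid", "bull", "buy", "substring", "stock", "stop");
        System.out.println(countExternal(t.root)); // 8, the number of strings s
        System.out.println(height(t.root));        // 9, the length of "substring"
        System.out.println(countNodes(t.root));    // 27, below the total length n = 36
    }
}
```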

Searching a standard trie

For English word lookup, each internal node can hold a pointer array of 26 elements. To find a, we only need to look at pointer 0 in the internal node's pointer array (b is pointer 1, and so on). Because an array supports random access, the time complexity of this step is O(1).
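The index computation described above is just character arithmetic. A minimal sketch, assuming lowercase a-z input (the class and method names are illustrative, not from the original article):

```java
class ChildIndex {
    // Maps 'a'..'z' to slots 0..25 of a node's pointer array.
    static int childIndex(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("expected a lowercase letter, got: " + c);
        }
        return c - 'a';
    }

    public static void main(String[] args) {
        System.out.println(childIndex('a')); // 0
        System.out.println(childIndex('b')); // 1
        System.out.println(childIndex('u')); // 20
    }
}
```

The range check is a design choice of this sketch: without it, an uppercase letter or a digit would produce an index outside the 26-element array.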

Search process: suppose we want to find the string bull (b-u-l-l) in the trie above.

1. In the root node, look up the child pointer 'b' - 'a' = 1. The pointer is not null, so move to node b, the child at index 1.

2. In node b, look up the child pointer 'u' - 'a' = 20. The pointer is not null, so move to node u, the child at index 20.

3. Continue in the same way for the two l characters ('l' - 'a' = 11), then find the special character # at the leaf node, which indicates that the string bull has been found.

If the search stops at an internal node, the string being searched for is not in the trie.
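The search steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the article's implementation: the class TerminatedTrie is a made-up name, words are assumed to be lowercase a-z, and slot 26 of each pointer array is reserved for the # terminator.

```java
class TerminatedTrie {
    static final int END = 26; // slot reserved for the '#' terminator

    static class Node {
        Node[] next = new Node[27]; // 'a'..'z' in slots 0..25, '#' in slot 26
    }

    final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (cur.next[i] == null) {
                cur.next[i] = new Node();
            }
            cur = cur.next[i];
        }
        if (cur.next[END] == null) {
            cur.next[END] = new Node(); // external node standing for '#'
        }
    }

    boolean contains(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            if (c < 'a' || c > 'z') {
                return false; // outside the assumed alphabet
            }
            cur = cur.next[c - 'a'];
            if (cur == null) {
                return false; // the path ends before the word does
            }
        }
        // The word is stored only if a '#' child follows its last letter.
        return cur.next[END] != null;
    }

    public static void main(String[] args) {
        TerminatedTrie t = new TerminatedTrie();
        for (String w : new String[] {"bear", "bell", "bid", "bull", "buy", "stock", "stop"}) {
            t.insert(w);
        }
        System.out.println(t.contains("bull")); // true
        System.out.println(t.contains("bu"));   // false: the search stops at an internal node
    }
}
```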

Efficiency: for a search string of n English letters, locating the child pointer inside an internal node takes O(d) time in general, where d is the size of the alphabet (26 for English). The algorithm above stores the child pointers in an array with random access, so this step is reduced to O(1). For Chinese text, however, we still assume O(d) here. A successful search follows one path from the root node to a leaf node, so the overall time complexity is O(d * n).

However, the worst case for the trie occurs when no two strings in set X share a prefix. Then every internal node other than the root has only one child node, and the search time complexity degrades to O(d * n^2).
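For a large alphabet such as Chinese text, a 26-slot array no longer fits. One common alternative, which the article itself does not use, is to keep each node's children in a hash map, so that the per-character lookup takes expected O(1) time instead of O(d). The sketch below rests on two assumptions: the class name MapTrie is illustrative, and a boolean end-of-word flag replaces the # terminator. The sample strings are written as Unicode escapes to keep the source file ASCII-only.

```java
import java.util.HashMap;
import java.util.Map;

class MapTrie {
    static class Node {
        Map<Character, Node> next = new HashMap<>();
        boolean isWord; // plays the role of the '#' terminator
    }

    final Node root = new Node();

    void insert(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        }
        cur.isWord = true;
    }

    boolean contains(String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            cur = cur.next.get(c);
            if (cur == null) {
                return false;
            }
        }
        return cur.isWord;
    }

    public static void main(String[] args) {
        MapTrie t = new MapTrie();
        t.insert("\u4e2d\u6587"); // two-character word: U+4E2D U+6587
        t.insert("\u4e2d\u56fd"); // two-character word: U+4E2D U+56FD
        System.out.println(t.contains("\u4e2d\u56fd")); // true
        System.out.println(t.contains("\u4e2d"));       // false: only a shared prefix
    }
}
```

The trade-off is memory and constant factors: a hash map per node costs more than a small array, so the array layout is still a reasonable choice for a 26-letter alphabet.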

 

package trie;

enum NodeKind { LN, BN }

/**
 * Trie node.
 * @author Ning
 */
class TrieNode {
    char key;
    TrieNode[] points = null;
    NodeKind kind = null;
}

/** Branch node: slots 0-25 hold 'a'-'z', slot 26 holds the end symbol '#'. */
class BranchNode extends TrieNode {
    public BranchNode(char k) {
        super.key = k;
        super.kind = NodeKind.BN;
        super.points = new TrieNode[27];
    }
}

/** Leaf node. */
class LeafNode extends TrieNode {
    public LeafNode(char key) {
        super.key = key;
        super.kind = NodeKind.LN;
    }
}

public class StandardTrie {
    // Create the root node
    TrieNode root = new BranchNode(' ');

    // Insert a word (lowercase a-z only) into the dictionary tree
    public void insert(String words) {
        TrieNode curNode = root;
        // Add '#' as an end symbol
        words = words + "#";
        char[] chars = words.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] == '#') {
                curNode.points[26] = new LeafNode('#');
            } else {
                int pSize = chars[i] - 'a';
                // If the child does not exist, create a new branch node
                if (curNode.points[pSize] == null) {
                    curNode.points[pSize] = new BranchNode(chars[i]);
                }
                curNode = curNode.points[pSize];
            }
        }
    }

    // Check whether a word is in the dictionary tree
    public boolean fullMatch(String words) {
        TrieNode curNode = root;
        char[] chars = words.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            int pSize = chars[i] - 'a';
            System.out.println(chars[i] + "->");
            // Characters outside a-z cannot be in the tree
            if (pSize < 0 || pSize > 25) return false;
            if (curNode.points[pSize] == null) return false;
            curNode = curNode.points[pSize];
        }
        // The word is present only if the end symbol follows its last letter
        return curNode.points[26] != null && curNode.points[26].key == '#';
    }

    // Preorder traversal
    public void preOrderTraverse(TrieNode curNode) {
        if (curNode != null) {
            System.out.println(curNode.key);
            if (curNode.kind == NodeKind.BN) {
                for (TrieNode node : curNode.points) {
                    preOrderTraverse(node);
                }
            } else {
                System.out.println();
            }
        }
    }

    // Get the root node
    public TrieNode getRoot() {
        return this.root;
    }
}

The other two kinds of trie will be covered in a later update!
