Trie graph and Fail tree

Source: Internet
Author: User

The difference between a trie graph and an AC automaton

The trie graph is the deterministic form of an AC automaton: for every node, it fills in the next pointers for the characters that the node has no child for. The advantage is that, when constructing fail pointers, you no longer have to keep backtracking whenever a next pointer is empty.

For example, suppose we are constructing the fail pointer of next[cur][i], where cur is the parent node and next[cur][i] is its child. In an AC automaton, let tmp be a copy of cur. If next[fail[tmp]][i] does not exist, keep moving tmp back (that is, tmp = fail[tmp]) until next[fail[tmp]][i] is not empty, then set fail[next[cur][i]] = next[fail[tmp]][i]. If tmp reaches the root without finding such an edge, the fail pointer is the root.
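The backtracking construction can be sketched in Python as follows. This is a minimal sketch, not the author's code: the function name build_ac_automaton, the dictionary-based table nxt (named so to avoid shadowing Python's built-in next), and the example words are assumptions made for illustration. The loop starts tmp at fail[cur] instead of cur, which is equivalent to the description above.

```python
from collections import deque

def build_ac_automaton(words):
    """Build a trie plus fail pointers, backtracking over missing edges."""
    nxt = [{}]    # nxt[node][ch] -> child; only real trie edges are stored
    fail = [0]    # fail[node] -> node of the longest proper suffix in the trie
    for word in words:
        cur = 0
        for ch in word:
            if ch not in nxt[cur]:
                nxt.append({})
                fail.append(0)
                nxt[cur][ch] = len(nxt) - 1
            cur = nxt[cur][ch]

    queue = deque(nxt[0].values())    # depth-1 nodes keep fail = root
    while queue:
        cur = queue.popleft()
        for ch, child in nxt[cur].items():
            tmp = fail[cur]
            # Back off along fail pointers until some node has a ch edge.
            while tmp != 0 and ch not in nxt[tmp]:
                tmp = fail[tmp]
            fail[child] = nxt[tmp].get(ch, 0)
            queue.append(child)
    return nxt, fail

nxt, fail = build_ac_automaton(["he", "she", "his", "hers"])
print(fail)  # [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]
```

Nodes are numbered in insertion order, so node 5 is "she" and its fail pointer is node 2, "he".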

In a trie graph, simply set fail[next[cur][i]] = next[fail[cur]][i], because the trie graph has already completed the next pointers.
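A trie graph version might look like the sketch below. It assumes lowercase a to z as the alphabet and uses the hypothetical name build_trie_graph; a breadth-first pass both assigns fail pointers and fills every missing next entry, so no backtracking loop is needed.

```python
from collections import deque

SIGMA = 26  # assumed alphabet: lowercase 'a'..'z'

def build_trie_graph(words):
    """Build a trie graph: every node ends up with all SIGMA transitions."""
    nxt = [[-1] * SIGMA]
    fail = [0]
    for word in words:
        cur = 0
        for ch in word:
            c = ord(ch) - ord("a")
            if nxt[cur][c] == -1:
                nxt.append([-1] * SIGMA)
                fail.append(0)
                nxt[cur][c] = len(nxt) - 1
            cur = nxt[cur][c]

    queue = deque()
    for c in range(SIGMA):
        if nxt[0][c] == -1:
            nxt[0][c] = 0              # missing edge at the root loops to the root
        else:
            queue.append(nxt[0][c])    # depth-1 nodes keep fail = root
    while queue:
        cur = queue.popleft()
        for c in range(SIGMA):
            child = nxt[cur][c]
            if child == -1:
                # Complete the missing pointer from the fail node's row.
                nxt[cur][c] = nxt[fail[cur]][c]
            else:
                # One lookup, no loop: the fail node's row is already complete.
                fail[child] = nxt[fail[cur]][c]
                queue.append(child)
    return nxt, fail

nxt, fail = build_trie_graph(["he", "she", "his", "hers"])
print(fail)  # [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]
```

The breadth-first order matters here: fail[cur] is always shallower than cur, so its row has been completed before cur is processed.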

Whether you use a trie graph or an AC automaton, the fail pointers point to exactly the same nodes, so the fail tree can be built from either one. The trie graph is much simpler to write than the AC automaton, so I always write the trie graph.

Properties of the fail pointer

To use the fail tree, you first need to understand the properties of the fail pointer.

The fail pointer of each node points to the node for its longest proper suffix that exists in the trie. The important consequence is this: if you follow the fail pointers upward from a node cur until you reach the root, the string represented by every node you pass is a suffix of the string represented by cur.
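As a small illustration of this property, the sketch below follows fail pointers from a node to the root and collects the strings on the way. The label and fail arrays are hand-computed for an assumed word set (he, she, his, hers, with nodes numbered in insertion order), and the name suffixes_via_fail is hypothetical.

```python
# Strings and fail pointers of the trie nodes for the assumed words
# he, she, his, hers; node 0 is the root (empty string).
label = ["", "h", "he", "s", "sh", "she", "hi", "his", "her", "hers"]
fail = [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]

def suffixes_via_fail(node):
    """Return the strings met while following fail pointers up to the root."""
    chain = []
    while node != 0:
        node = fail[node]
        chain.append(label[node])
    return chain

print(suffixes_via_fail(5))  # ['he', '']
print(suffixes_via_fail(9))  # ['s', '']
```

Every string in the returned chain is a suffix of the starting node's string, and the chain gets strictly shorter at each step.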

What is the Fail tree

The first picture below is an AC automaton, and the second picture is its fail tree. The first picture shows an AC automaton rather than a trie graph only because a trie graph is hard to draw; the principle is exactly the same.

You can see that the fail tree is the AC automaton with its next pointers removed and every fail pointer reversed. Each non-root node has exactly one fail pointer, and it points to a strictly shallower node, so the result must be a tree rooted at the root; hence the name fail tree.

One property of the fail tree is that the string corresponding to a node is a suffix of the strings corresponding to its children, its grandchildren, and all of its other descendants.
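Building the fail tree from the fail pointers is only a few lines. The sketch below is illustrative: it reuses hand-computed fail pointers for an assumed word set (he, she, his, hers), and the names build_fail_tree and subtree are hypothetical.

```python
# Strings and fail pointers for the assumed words he, she, his, hers;
# node 0 is the root, other nodes are numbered in insertion order.
label = ["", "h", "he", "s", "sh", "she", "hi", "his", "her", "hers"]
fail = [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]

def build_fail_tree(fail):
    """Reverse every fail pointer: children[v] lists nodes whose fail is v."""
    children = [[] for _ in fail]
    for node in range(1, len(fail)):    # the root has no parent
        children[fail[node]].append(node)
    return children

def subtree(children, root):
    """Collect root and all its descendants in the fail tree."""
    found, stack = [], [root]
    while stack:
        v = stack.pop()
        found.append(v)
        stack.extend(children[v])
    return found

children = build_fail_tree(fail)
# Node 3 is "s"; its subtree holds every trie node whose string ends in "s".
print(sorted(label[v] for v in subtree(children, 3)))  # ['hers', 'his', 's']
```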

Application of the Fail tree

Suppose there are n strings whose total length is at most $10^6$, and m queries, each asking how many times the x-th string occurs in the y-th string.

If you answer the queries with the AC automaton directly, build the automaton over all n strings, then run y through it and add 1 to the weight of every node passed. To find how many times x occurs in y, push the weights upward along the fail pointers, starting from the deepest nodes. The weight of x's node is then the number of times x occurs in y. Each query costs O(tot + len[y]), where tot is the total number of nodes in the AC automaton.
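The direct approach might be sketched as follows. The names build and count_occurrences are hypothetical, words are assumed non-empty, and x and y are taken as 0-based indices into the word list. Because y is itself one of the inserted strings, running it through the automaton only follows real trie edges.

```python
from collections import deque

def build(words):
    """Return (nxt, fail, order, end) for an AC automaton over words."""
    nxt, fail, end = [{}], [0], []
    for word in words:
        cur = 0
        for ch in word:
            if ch not in nxt[cur]:
                nxt.append({})
                fail.append(0)
                nxt[cur][ch] = len(nxt) - 1
            cur = nxt[cur][ch]
        end.append(cur)               # end[k] is the node of the k-th word
    order = []                        # non-root nodes, shallowest first
    queue = deque(nxt[0].values())
    while queue:
        cur = queue.popleft()
        order.append(cur)
        for ch, child in nxt[cur].items():
            tmp = fail[cur]
            while tmp != 0 and ch not in nxt[tmp]:
                tmp = fail[tmp]
            fail[child] = nxt[tmp].get(ch, 0)
            queue.append(child)
    return nxt, fail, order, end

def count_occurrences(automaton, words, x, y):
    """Count how often words[x] occurs in words[y] by pushing weights up."""
    nxt, fail, order, end = automaton
    weight = [0] * len(nxt)
    cur = 0
    for ch in words[y]:
        cur = nxt[cur][ch]            # words[y] is in the trie, so the edge exists
        weight[cur] += 1
    for node in reversed(order):      # deepest nodes first
        weight[fail[node]] += weight[node]
    return weight[end[x]]

words = ["a", "aa", "ab", "aba", "abab", "b", "ba"]
automaton = build(words)
print(count_occurrences(automaton, words, 2, 4))  # "ab" in "abab" -> 2
```

The weight array is rebuilt for every query, which is where the O(tot) term in the cost comes from.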

If you answer the queries with the fail tree, the answer is the sum of the weights of x's node and all of its descendants in the fail tree. Subtree sums can be maintained with the DFS order of the fail tree and a binary indexed tree (tree array). As before, run y through the AC automaton and add 1 to the weight of every node passed, but now store the weights in the binary indexed tree, indexed by DFS number. To find how many times x occurs in y, sum the weights over the subtree of x's node, which is correct by the suffix property of the fail tree. Because the DFS numbers inside a subtree are contiguous, this is a single range query. Sort the queries by y and answer all queries with the same y together. Running y through the automaton costs O(len[y] log(tot)), since each point update on the binary indexed tree takes O(log(tot)); this cost is shared by all queries with the same y, and each query then costs O(log(tot)).
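The offline version could be sketched as below, under the same assumptions as the description: words are non-empty, queries are 0-based (x, y) index pairs, and the name answer_queries is hypothetical. It groups queries by y, marks the path of y in a binary indexed tree over DFS numbers, answers the group, then undoes the marks.

```python
from collections import deque, defaultdict

def build(words):
    """Return (nxt, fail, end) for an AC automaton over words."""
    nxt, fail, end = [{}], [0], []
    for word in words:
        cur = 0
        for ch in word:
            if ch not in nxt[cur]:
                nxt.append({})
                fail.append(0)
                nxt[cur][ch] = len(nxt) - 1
            cur = nxt[cur][ch]
        end.append(cur)
    queue = deque(nxt[0].values())
    while queue:
        cur = queue.popleft()
        for ch, child in nxt[cur].items():
            tmp = fail[cur]
            while tmp != 0 and ch not in nxt[tmp]:
                tmp = fail[tmp]
            fail[child] = nxt[tmp].get(ch, 0)
            queue.append(child)
    return nxt, fail, end

def answer_queries(words, queries):
    """For each (x, y), count occurrences of words[x] in words[y]."""
    nxt, fail, end = build(words)
    n = len(nxt)
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[fail[v]].append(v)
    # Iterative DFS on the fail tree: subtree of v is tin[v]..tout[v].
    tin, tout, timer = [0] * n, [0] * n, 0
    stack = [(0, False)]
    while stack:
        v, leaving = stack.pop()
        if leaving:
            tout[v] = timer
            continue
        timer += 1
        tin[v] = timer
        stack.append((v, True))
        stack.extend((c, False) for c in children[v])
    bit = [0] * (n + 1)               # binary indexed tree, 1-based

    def add(i, delta):
        while i <= n:
            bit[i] += delta
            i += i & -i

    def prefix(i):
        total = 0
        while i > 0:
            total += bit[i]
            i -= i & -i
        return total

    by_y = defaultdict(list)
    for idx, (x, y) in enumerate(queries):
        by_y[y].append((idx, x))
    answers = [0] * len(queries)
    for y, group in by_y.items():
        path, cur = [], 0
        for ch in words[y]:           # words[y] is in the trie
            cur = nxt[cur][ch]
            path.append(cur)
            add(tin[cur], 1)
        for idx, x in group:
            v = end[x]
            answers[idx] = prefix(tout[v]) - prefix(tin[v] - 1)
        for node in path:             # undo before the next y
            add(tin[node], -1)
    return answers
```

For example, answer_queries(["a", "ab", "abab"], [(0, 2), (1, 2)]) asks how often "a" and "ab" occur in "abab".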

This article is also on my personal blog: http://www.alphaway.org/post-440.html
