◇ Study session 10 & Template 3 ◇ AC automaton
I sat in on a senior-year class ... They were talking about extended uses of the AC automaton. But I can't even use KMP or Trie dictionary trees properly, so I was totally lost <(_ _)>
Spent the morning learning a bit about AC automata QwQ
? Trie Tree
A trie is a kind of dictionary tree (I have heard there are other kinds, but I'm not clear on them). Each node represents a letter, except the root node, which represents no letter and acts like a super source. The key property of the trie tree is that if you start from the root node and walk down the edges of the tree, the nodes you pass through spell out a string. Some nodes are the end of a word, and we usually put a mark (ovr) on such a node.
? Building the Trie tree (build)
Given these properties, the initial tree contains only the root node. To insert a word str, start from the root node: if the root has a str[0] son, move to that son; otherwise create a str[0] son first, and then move to it. In general, when we are about to insert str[k], the current node now is in layer k+1 (the root node is layer 1): if now has a str[k] son, move to it; otherwise create a son representing str[k] first, and then move. Repeat until the whole word has been traversed. Suppose we end up at node now; then ovr can hold two basic kinds of mark: ① how many words end at this node, ② which word ends at this node ... Of course, if the problem has some strange requirements, you can store stranger things in ovr, or even define several ovr fields.
void Build(string str, int id) {
    int len = str.length(), now = 0; // the current node is now; trie[0] is the root node
    for (int i = 0; i < len; i++) {
        if (!trie[now].son[str[i] - 'a'])
            trie[now].son[str[i] - 'a'] = ++cnt; // cnt works like a pointer for creating nodes: (cnt+1) is the nearest unused node
        now = trie[now].son[str[i] - 'a'];
    }
    trie[now].ovr = id; // make the mark: here we store which word ends at trie[now]
}
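Build above relies on a global trie array and cnt that the post never declares, so here is a minimal self-contained sketch of the same insertion idea. The names TrieNode, Trie, insert and find are my own for illustration (not from the original code), and it assumes lowercase words and word ids starting at 1, so that 0 can mean "no word ends here".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node layout: 26 child indices (0 = no child) plus an end-of-word mark.
struct TrieNode {
    int son[26] = {};  // son[c] is the index of the child for letter 'a' + c
    int ovr = 0;       // id of the word ending at this node, 0 if none
};

struct Trie {
    std::vector<TrieNode> nodes;  // nodes[0] is the root

    Trie() : nodes(1) {}

    // Insert a lowercase word and mark its last node with id (assumed >= 1).
    void insert(const std::string& word, int id) {
        assert(id >= 1);
        int now = 0;
        for (char ch : word) {
            assert(ch >= 'a' && ch <= 'z');
            int c = ch - 'a';
            if (!nodes[now].son[c]) {
                int fresh = static_cast<int>(nodes.size());
                nodes.emplace_back();        // create the missing son
                nodes[now].son[c] = fresh;   // index again after the vector may have grown
            }
            now = nodes[now].son[c];
        }
        nodes[now].ovr = id;
    }

    // Return the id stored for word, or 0 if the word was never inserted.
    int find(const std::string& word) const {
        int now = 0;
        for (char ch : word) {
            if (ch < 'a' || ch > 'z') return 0;
            now = nodes[now].son[ch - 'a'];
            if (!now) return 0;  // fell off the tree
        }
        return nodes[now].ovr;
    }
};
```

For example, after inserting "he" with id 1 and "hers" with id 2, find("her") returns 0: the node exists, but no word ends there.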
? Relationship with AC automata
The AC automaton is built on top of the trie tree, with some extra edges added that play the role of KMP's fail function.
? KMP
A string matching algorithm, obtained by optimizing the naive string matching algorithm. Suppose we want to find string B inside string A: A is called the "main string" and B the "pattern string". When we are matching and a comparison fails, we call that a mismatch.
Matching uses two pointers: i is the position in the main string where the current attempt starts, and j is the position the pattern string has been matched up to. When the naive algorithm hits a mismatch in the attempt starting at position i, j goes back to 0 and i becomes i + 1, so matching restarts from the first character of the pattern string at the next position of the main string. This is wasteful: the next attempt uses none of the information gained from the characters that matched before the mismatch.
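The naive procedure can be sketched like this. The function name naiveSearch is made up for illustration, and it returns every start position of the pattern in the main string (0-based), which is just one reasonable choice of output.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Naive matching: try every start i in the main string; after a mismatch,
// j goes back to 0 and the next attempt starts at i + 1.
std::vector<int> naiveSearch(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());
    std::vector<int> hits;
    for (std::size_t i = 0; i + pattern.size() <= text.size(); i++) {
        std::size_t j = 0;
        while (j < pattern.size() && text[i + j] == pattern[j]) j++;
        if (j == pattern.size()) hits.push_back(static_cast<int>(i));  // full match at i
    }
    return hits;
}
```

In the worst case the inner loop rescans almost the whole pattern for every i, which is the waste described above.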
The KMP algorithm removes this waste.
? The principle of KMP algorithm
The KMP algorithm says "there is no need to slide the pattern string right one position at a time". For example:
When the pattern string "ABCA" mismatches against the main string "ABCD", we do not need i++: the next positions of the main string are not 'A', so sliding one step at a time is certain not to match. After finding a mismatch, the KMP algorithm moves the pattern string as far to the right as it possibly can!
When a proper prefix of the pattern string is equal to a suffix of the part that has already been matched, we can slide the pattern string directly so that this prefix lines up with that suffix after the mismatch.
(I don't know how to explain this well; look at the 3 pictures above.)
? Fail function
We want the main-string pointer i never to backtrack on a mismatch, and to adjust only the pattern-string pointer j so that the pattern string slides as far to the right as possible. For this we define the mismatch function fail(j): when the j-th character of the pattern string mismatches the character Si of the main string, fail(j) is the position in the pattern string that should be compared with Si next.
The transfer formula is: fail[j] = ① -1 (j = 0); ② max{ k | 0 < k < j and P[0..k-1] = P[j-k..j-1] } (when such a k exists); ③ 0 (other cases).
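To make the fail function concrete, here is a rough sketch of computing it and using it to search, with 0-based indices, fail[0] = -1, and fail[j] equal to the length of the longest proper prefix of P[0..j-1] that is also its suffix. The names buildFail and kmpSearch are hypothetical, and the extra entry fail[m] (m = pattern length) is my own addition so that the search can continue after a full match.

```cpp
#include <cassert>
#include <string>
#include <vector>

// fail[0] = -1; for j >= 1, fail[j] is the largest k < j with
// P[0..k-1] == P[j-k..j-1], or 0 if there is no such k > 0.
std::vector<int> buildFail(const std::string& p) {
    int m = static_cast<int>(p.size());
    std::vector<int> fail(m + 1, 0);
    fail[0] = -1;
    int j = 0, k = -1;
    while (j < m) {
        if (k == -1 || p[j] == p[k]) {
            ++j; ++k;
            fail[j] = k;   // the prefix of length k also ends just before position j
        } else {
            k = fail[k];   // fall back to a shorter candidate prefix
        }
    }
    return fail;
}

// Return all 0-based start positions of p in text; i never moves backwards.
std::vector<int> kmpSearch(const std::string& text, const std::string& p) {
    assert(!p.empty());
    std::vector<int> fail = buildFail(p);
    std::vector<int> hits;
    int n = static_cast<int>(text.size()), m = static_cast<int>(p.size());
    int i = 0, j = 0;
    while (i < n) {
        if (j == -1 || text[i] == p[j]) {
            ++i; ++j;
            if (j == m) {            // whole pattern matched, ending at i - 1
                hits.push_back(i - m);
                j = fail[j];         // keep the longest border and continue
            }
        } else {
            j = fail[j];             // only the pattern pointer is adjusted
        }
    }
    return hits;
}
```

For the pattern "abab" this gives fail = {-1, 0, 0, 1, 2}: a mismatch at j = 3 resumes at position 1, because the "a" just matched is also the first character of the pattern.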
? AC automaton
? Inserting words
Same as for the trie tree ( ̄▽ ̄)"
? Marking the word that ends at a node
Same as for the trie tree.
? Getting the fail function
The fail pointers are obtained with a BFS. A second-layer node of the dictionary tree stands for a single character, whose only proper suffix is the empty string, so its fail must be 0: the fail of every second-layer node points to the root node. We push all second-layer nodes into the queue. Then, for each node u taken from the queue: if u has an "a"+i son v, the fail of v points to the "a"+i son of the fail of u; otherwise the "a"+i son of u is pointed directly at the "a"+i son of the fail of u.
void Getfail() {
    queue<int> que;
    for (int i = 0; i < 26; i++) // traverse the second layer
        if (trie[0].son[i])
            trie[trie[0].son[i]].fail = 0, que.push(trie[0].son[i]);
    while (!que.empty()) {
        int u = que.front();
        que.pop();
        for (int i = 0; i < 26; i++) // look at each son
            if (trie[u].son[i]) { // u has an "a"+i son
                trie[trie[u].son[i]].fail = trie[trie[u].fail].son[i]; // point to the "a"+i son of the father's fail
                que.push(trie[u].son[i]);
            } else
                trie[u].son[i] = trie[trie[u].fail].son[i]; // redirect the missing son to the "a"+i son of u's fail
    }
}
? Recursion on the main string
Let now be the node we are currently at. We start from the root node, so now is initially 0. Enumerate the characters str[i] of the main string from beginning to end, each time setting now to the str[i] son of now. Then follow the fail pointers from now back up to the root node; this visits every suffix of str[0~i] that is present in the trie. Doing this for every prefix of str covers all suffixes of all prefixes, which amounts to checking all substrings of str.
Tally the answer according to what the problem asks for.
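void Acquery(string str) {
    int len = str.length();
    int now = 0;
    for (int i = 0; i < len; i++) {
        now = trie[now].son[str[i] - 'a']; // move now
        for (int j = now; j; j = trie[j].fail) // go back along the fail pointers
            ans[trie[j].ovr].num++; // tally the answer; nodes that end no word have ovr == 0, so ans[0] absorbs them (word ids are assumed to start at 1)
    }
}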
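Putting the three steps together (insert, get fail, walk the main string), here is a self-contained sketch that counts how many times each word occurs in a text. The struct and member names (AcNode, AhoCorasick, insert, getFail, query) are my own for illustration. It assumes lowercase letters only and distinct words, since a repeated word would overwrite the ovr mark, and getFail must be called after all inserts and before query.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

struct AcNode {
    int son[26] = {};  // child indices; after getFail, missing sons are redirected
    int fail = 0;      // fail pointer, 0 = root
    int ovr = 0;       // 1-based id of the word ending here, 0 if none
};

struct AhoCorasick {
    std::vector<AcNode> trie;  // trie[0] is the root
    int words = 0;

    AhoCorasick() : trie(1) {}

    // Insert a lowercase word; returns its 1-based id.
    int insert(const std::string& w) {
        int now = 0;
        for (char ch : w) {
            assert(ch >= 'a' && ch <= 'z');
            int c = ch - 'a';
            if (!trie[now].son[c]) {
                int fresh = static_cast<int>(trie.size());
                trie.emplace_back();
                trie[now].son[c] = fresh;
            }
            now = trie[now].son[c];
        }
        trie[now].ovr = ++words;
        return words;
    }

    // BFS over the trie, layer by layer, filling fail pointers.
    void getFail() {
        std::queue<int> que;
        for (int c = 0; c < 26; c++)
            if (trie[0].son[c]) que.push(trie[0].son[c]);  // their fail is already 0
        while (!que.empty()) {
            int u = que.front();
            que.pop();
            for (int c = 0; c < 26; c++) {
                int v = trie[u].son[c];
                if (v) {
                    trie[v].fail = trie[trie[u].fail].son[c];
                    que.push(v);
                } else {
                    trie[u].son[c] = trie[trie[u].fail].son[c];
                }
            }
        }
    }

    // num[id] = number of occurrences of word id in text; num[0] is unused.
    std::vector<int> query(const std::string& text) const {
        std::vector<int> num(words + 1, 0);
        int now = 0;
        for (char ch : text) {
            assert(ch >= 'a' && ch <= 'z');
            now = trie[now].son[ch - 'a'];
            for (int j = now; j; j = trie[j].fail)
                if (trie[j].ovr) num[trie[j].ovr]++;
        }
        return num;
    }
};
```

With the words "he", "she", "his", "hers" (ids 1 to 4) and the text "ushers", query should report one occurrence each of "he", "she" and "hers", and none of "his". Unlike Acquery above, this sketch skips nodes with ovr == 0 instead of tallying them into slot 0.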
The End
Thanks for reading!
-Lucky_glass
(Tab: If anything I wrote is unclear, you can email me directly at [email protected]; on weekends I will try to answer and improve the blog ~)