[Aho-Corasick automaton explained + starter template] HDU 2222

Source: Internet
Author: User

This was my first time writing an Aho-Corasick (AC) automaton, and I found it not too hard to understand, probably because I had already studied KMP and tries thoroughly. The automaton builds on the same idea as KMP, so once you understand KMP it is not difficult to follow :)

The explanation below comes from the following blog post. It is easy to follow, although the code there is not pretty: Http://www.cppblog.com/mythit/archive/2011/10/21/80633.html (with modifications)

The complexity of the AC automaton is O(m + n + z), where n is the total length of the pattern strings, m is the length of the main string, and z is the total number of occurrences of pattern strings in the main string. It is well suited to multi-pattern string matching.
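For comparison, here is a minimal sketch of the naive approach that the automaton improves on: search the main string once per pattern string. The function name count_words_naive is chosen for this illustration only. With k patterns, this does k separate scans of the main string instead of one.

```cpp
#include <string>
#include <vector>

// Counts how many of the given words occur at least once in text.
// Brute-force baseline (illustrative helper, not part of the original post):
// one independent substring search per word.
int count_words_naive(const std::vector<std::string> &words,
                      const std::string &text) {
    int found = 0;
    for (const std::string &w : words) {
        if (text.find(w) != std::string::npos) ++found;
    }
    return found;
}
```

This is fine for a handful of short words, but it becomes slow when there are many patterns and a long main string, which is the case the AC automaton is designed for.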

The AC automaton algorithm in detail

First, a brief introduction. The Aho-Corasick automaton was created at Bell Labs in 1975 and is one of the best-known multi-pattern matching algorithms. A typical problem: given n words and an article of m characters, find how many of the words appear in the article. To understand the AC automaton you first need the basics of the trie (dictionary tree) and of the KMP pattern matching algorithm. The algorithm consists of three steps: building the trie, constructing the failure pointers, and running the pattern matching.
If you know the KMP algorithm, you know what its next function (also called the shift function or failure function) does. KMP uses two pointers, i and j, which indicate that A[i-j+1..i] is exactly the same as B[1..j]. In other words, i keeps increasing, and j changes along with it so that the substring of A of length j ending at A[i] exactly matches the first j characters of B. When A[i+1] is not equal to B[j+1], KMP adjusts the position of j (decreases j) so that A[i-j+1..i] still matches B[1..j] and the new B[j+1] matches A[i+1]; the next function records the position to which j should be adjusted. The failure pointer of the AC automaton serves the same purpose: when the main string is being matched on the trie and the current character cannot be matched from the current node, matching continues from the node that the current node's failure pointer points to.
Consider the following example: given the five words say, she, shr, he and her, and the string yasherhs, how many of the words appear in the string? First, we define the data structures the AC automaton needs, to make the later code easier to write.
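To make the role of the next function concrete, here is a minimal sketch of how it could be computed for a pattern B. It uses 0-based indices rather than the 1-based ones in the text, and the name build_next is chosen for this illustration, not taken from the original post.

```cpp
#include <string>
#include <vector>

// next_[i] = length of the longest proper prefix of b[0..i] that is also
// a suffix of b[0..i]. (0-based; illustrative helper with an assumed name.)
std::vector<int> build_next(const std::string &b) {
    std::vector<int> next_(b.size(), 0);
    int j = 0;  // length of the prefix matched so far
    for (std::size_t i = 1; i < b.size(); ++i) {
        // On a mismatch, reduce j to the next shorter candidate,
        // exactly the "adjust the position of j" step described above.
        while (j > 0 && b[i] != b[j]) j = next_[j - 1];
        if (b[i] == b[j]) ++j;
        next_[i] = j;
    }
    return next_;
}
```

The inner while loop is the part that carries over to the AC automaton: there, following next_ is replaced by following a node's failure pointer in the trie.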

const int kind = 26;
struct node {
    node *fail;        // failure pointer
    node *next[kind];  // children of this node in the trie (at most 26 letters)
    int cnt;           // number of words that end at this node
    node() {           // constructor: initialize all fields
        fail = NULL;
        cnt = 0;
        memset(next, 0, sizeof(next));
    }
};
char str[55];      // the input word
char ss[M];        // the input main string
queue<node *> qq;  // queue used to build the failure pointers

With these data structures, you can start programming:
First, build the five words into a trie, as shown in Figure 1.

 

// Build the trie: insert the word stored in str
void buildtree(node *root) {
    node *p = root;
    int i = 0, id;
    while (str[i]) {
        id = str[i] - 'a';
        if (p->next[id] == NULL) {
            p->next[id] = new node();
        }
        p = p->next[id];
        i++;
    }
    (p->cnt)++;  // one more word ends at this node
}

After the trie is built, the next step is to construct the failure pointers. The process can be summed up in one sentence: let the letter on the current node be C; start from its parent's failure pointer and keep following failure pointers until reaching a node that also has a child with the letter C, then point the current node's failure pointer to that child. If the walk goes past the root without finding such a node, point the failure pointer to the root. In practice this is done with a breadth-first search: first push the root into the queue (the root's failure pointer points to itself or to NULL); then, each time a node is taken from the queue, set the failure pointers of all its children and push them into the queue, until the queue is empty.

// Build the failure pointers with BFS
void bfs(node *root) {
    while (!qq.empty()) qq.pop();
    int i;
    root->fail = NULL;
    qq.push(root);
    node *tmp, *p;
    while (!qq.empty()) {
        tmp = qq.front();
        qq.pop();
        p = NULL;
        for (i = 0; i < 26; i++) {
            if (tmp->next[i] != NULL) {
                p = tmp->fail;
                while (p && !p->next[i]) {
                    p = p->fail;
                }
                if (!p)
                    tmp->next[i]->fail = root;
                else
                    tmp->next[i]->fail = p->next[i];
                qq.push(tmp->next[i]);
            }
        }
    }
}

Let us trace the construction of the failure pointers through the code (see Figure 2). First, the root's fail pointer is set to NULL, the root is pushed into the queue, and the loop begins. In the 1st iteration the root is popped and two nodes are processed: root->next['h'-'a'] (node h) and root->next['s'-'a'] (node s). Since the root's fail pointer is NULL, the failure pointers of both nodes are set to the root, and both nodes enter the queue; these failure pointers are the dotted lines (1) and (2) in Figure 2. In the 2nd iteration the queue pops h, and for its child e, p is set to the fail pointer of h, that is, the root. The root has no child e, so the inner while loop executes p = p->fail, giving p = NULL, and then exits; the fail pointer of node e is set to the root, corresponding to (3) in Figure 2, and node e enters the queue. In the 3rd iteration the queue pops s. Its child a is handled in the same way as node e in the previous step: the fail pointer of node a is set to the root, corresponding to (4) in Figure 2, and a enters the queue. Its other child h (the one on the left in the figure) is handled slightly differently: here p is the root and p->next[i] != NULL (the root has a child h, the one on the right in the figure), so the inner while loop is skipped and the failure pointer of the left h is set to the right h, corresponding to (5) in Figure 2; the left h then enters the queue. Continuing in the same way until the loop ends, all failure pointers end up as shown in Figure 2.
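The walkthrough above can be checked mechanically. The following is a minimal sketch that stores the trie in arrays instead of pointers, so that failure links can be compared as plain integers. The struct Automaton and its member names are assumptions made for this illustration; the root is node 0, and -1 plays the role of the NULL fail pointer of the root.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Array-based trie with failure links (illustrative; all names are assumed).
struct Automaton {
    std::vector<std::vector<int>> next;  // next[v][c]: child of v on letter c, or -1
    std::vector<int> fail;               // fail[v]: failure link of v (root is 0)

    Automaton() { new_node(); }

    int new_node() {
        next.push_back(std::vector<int>(26, -1));
        fail.push_back(0);
        return (int)next.size() - 1;
    }

    // Inserts a lowercase word and returns the node at which it ends.
    int insert(const std::string &word) {
        int v = 0;
        for (char ch : word) {
            assert(ch >= 'a' && ch <= 'z');
            int c = ch - 'a';
            if (next[v][c] == -1) {
                int u = new_node();  // create first: push_back may reallocate
                next[v][c] = u;
            }
            v = next[v][c];
        }
        return v;
    }

    // Breadth-first construction of the failure links.
    void build() {
        std::queue<int> q;
        q.push(0);
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int u = next[v][c];
                if (u == -1) continue;
                // Start from the parent's failure link and walk up until
                // some node has a child on letter c (-1 means "past the root").
                int p = (v == 0) ? -1 : fail[v];
                while (p != -1 && next[p][c] == -1) p = (p == 0) ? -1 : fail[p];
                fail[u] = (p == -1) ? 0 : next[p][c];
                q.push(u);
            }
        }
    }
};
```

For the five example words, inserting them and calling build() gives fail[she] == he, while the end nodes of say, shr, he and her all fail to the root.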

Finally, we can use the AC automaton to find the words that occur in the main string. Matching falls into two cases: (1) the current character matches, meaning there is a tree edge from the current node labelled with that character; in this case we simply follow the edge to the next node and move the main string's pointer to the next character. (2) The current character does not match; in this case we move to the node pointed to by the current node's failure pointer and try again, stopping once the pointer reaches the root. These two steps are repeated until the end of the main string is reached.

// Run the AC automaton over the main string ss
int ac_run(node *root) {
    int i = 0, ans = 0, id;
    node *p = root;
    while (ss[i]) {
        id = ss[i] - 'a';
        // No edge for this character: follow failure pointers
        while (!p->next[id] && p != root) {
            p = p->fail;
        }
        p = p->next[id];
        if (!p) p = root;
        // Count every word ending here; -1 marks nodes already counted
        node *tmp = p;
        while (tmp != root && tmp->cnt != -1) {
            ans += tmp->cnt;
            tmp->cnt = -1;
            tmp = tmp->fail;
        }
        i++;
    }
    return ans;
}

See Figure 2 for the matching process in detail. The main string is yasherhs. For i = 0, 1 there is no corresponding edge from the root, so nothing happens. For i = 2, 3, 4 the pointer p moves down to the lower-left node e. Because the cnt of this node e is 1, ans is increased by 1, and the node's cnt is set to -1 to mark the word as already counted and prevent double counting. Then tmp moves to the node pointed to by this node's failure pointer (the e on the right) and continues in the same way, until tmp points to the root and the inner while loop exits. In this step ans has increased by 2: the two words she and he have been found. When i = 5, the lower-left e has no child r, so the first inner while loop moves p to the node its failure pointer points to, that is, the e on the right; the statement p = p->next[id] then moves p to node r. The cnt of node r is 1, so ans is increased by 1 (the word her), and tmp follows failure pointers until it reaches the root. For i = 6, 7 no new match is found, and the matching process ends with ans = 3.
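The three steps can also be put together in one self-contained function that takes the words and the main string as arguments, which makes the trace above easy to reproduce. This is a sketch under assumptions: the name count_words and the array layout are chosen for this illustration, input is assumed to consist of lowercase letters a to z only, and, like ac_run, it counts each trie node at most once.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Returns how many of the words occur in text (a word listed twice counts twice).
int count_words(const std::vector<std::string> &words, const std::string &text) {
    std::vector<std::vector<int>> next(1, std::vector<int>(26, -1));
    std::vector<int> fail(1, 0), cnt(1, 0);  // node 0 is the root

    // Step 1: build the trie.
    for (const std::string &w : words) {
        int v = 0;
        for (char ch : w) {
            assert(ch >= 'a' && ch <= 'z');
            int c = ch - 'a';
            if (next[v][c] == -1) {
                int u = (int)next.size();
                next.push_back(std::vector<int>(26, -1));
                fail.push_back(0);
                cnt.push_back(0);
                next[v][c] = u;
            }
            v = next[v][c];
        }
        ++cnt[v];
    }

    // Step 2: build the failure links breadth-first.
    std::queue<int> q;
    for (int c = 0; c < 26; ++c)
        if (next[0][c] != -1) q.push(next[0][c]);  // depth-1 nodes fail to the root
    while (!q.empty()) {
        int v = q.front();
        q.pop();
        for (int c = 0; c < 26; ++c) {
            int u = next[v][c];
            if (u == -1) continue;
            int p = fail[v];
            while (p != 0 && next[p][c] == -1) p = fail[p];
            fail[u] = (next[p][c] != -1) ? next[p][c] : 0;
            q.push(u);
        }
    }

    // Step 3: scan the text once.
    int ans = 0, p = 0;
    for (char ch : text) {
        assert(ch >= 'a' && ch <= 'z');
        int c = ch - 'a';
        while (p != 0 && next[p][c] == -1) p = fail[p];
        if (next[p][c] != -1) p = next[p][c];
        // Collect the words ending here along the failure chain; a node
        // marked -1 was counted before, and so was everything behind it.
        for (int t = p; t != 0 && cnt[t] != -1; t = fail[t]) {
            ans += cnt[t];
            cnt[t] = -1;
        }
    }
    return ans;
}
```

Calling count_words({"say", "she", "shr", "he", "her"}, "yasherhs") returns 3, matching the words she, he and her found in the trace.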


HDU 2222 can be used as a template problem for the AC automaton.

The complete code:

#include <cstdio>
#include <cstring>
#include <queue>
using namespace std;

#define N 10005
#define M 1000005

const int kind = 26;
struct node {
    node *fail;        // failure pointer
    node *next[kind];  // children of this node in the trie (at most 26 letters)
    int cnt;           // number of words that end at this node
    node() {           // constructor: initialize all fields
        fail = NULL;
        cnt = 0;
        memset(next, 0, sizeof(next));
    }
};
char str[55];      // the input word
char ss[M];        // the input main string
queue<node *> qq;  // queue used to build the failure pointers

// Build the trie: insert the word stored in str
void buildtree(node *root) {
    node *p = root;
    int i = 0, id;
    while (str[i]) {
        id = str[i] - 'a';
        if (p->next[id] == NULL) {
            p->next[id] = new node();
        }
        p = p->next[id];
        i++;
    }
    (p->cnt)++;
}

// Build the failure pointers with BFS
void bfs(node *root) {
    while (!qq.empty()) qq.pop();
    int i;
    root->fail = NULL;
    qq.push(root);
    node *tmp, *p;
    while (!qq.empty()) {
        tmp = qq.front();
        qq.pop();
        p = NULL;
        for (i = 0; i < 26; i++) {
            if (tmp->next[i] != NULL) {
                p = tmp->fail;
                while (p && !p->next[i]) {
                    p = p->fail;
                }
                if (!p)
                    tmp->next[i]->fail = root;
                else
                    tmp->next[i]->fail = p->next[i];
                qq.push(tmp->next[i]);
            }
        }
    }
}

// Run the AC automaton over the main string ss
int ac_run(node *root) {
    int i = 0, ans = 0, id;
    node *p = root;
    while (ss[i]) {
        id = ss[i] - 'a';
        while (!p->next[id] && p != root) {
            p = p->fail;
        }
        p = p->next[id];
        if (!p) p = root;
        node *tmp = p;
        while (tmp != root && tmp->cnt != -1) {
            ans += tmp->cnt;
            tmp->cnt = -1;
            tmp = tmp->fail;
        }
        i++;
    }
    return ans;
}

int main() {
    int t;
    scanf("%d", &t);
    while (t--) {
        int n;
        scanf("%d", &n);
        node *root = new node();
        while (n--) {
            scanf("%s", str);
            buildtree(root);
        }
        bfs(root);
        scanf("%s", ss);
        printf("%d\n", ac_run(root));
    }
    return 0;
}
