AC Automaton (Aho-Corasick Automaton)

Source: Internet
Author: User

An automaton can be understood most simply as follows: given an initial state, it processes its input step by step, moving from state to state automatically, and ends in one of two outcomes, match or mismatch. The AC automaton (Aho-Corasick automaton), which was developed at Bell Labs in 1975, is one of the best-known multi-pattern matching algorithms. Multi-pattern matching means matching several pattern strings at the same time. The KMP string-matching algorithm that we usually use can be seen as a special case of the AC automaton that matches only one pattern string at a time. With this relationship in mind, the AC automaton is relatively easy to understand.

The procedure is as follows:

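1, Construct a trie of all the strings to be matched; it serves as the search data structure of the AC automaton. For example, take the 5 words say, she, shr, he, her. In the resulting trie, the root has two children, s and h. Under s there are two branches: a-y, which spells say, and h, which has the two children e (she) and r (shr). Under the root's h there is e (he), which in turn has the child r (her).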

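2, Construct the fail pointers. When the current character fails to match, the fail pointer of the current node tells the automaton where to jump in order to continue matching: it points to the node whose string is the longest proper suffix of the current node's string that is also a prefix of some pattern (for example, the fail pointer of the e node of she points to the e node of he). This plays the same role as the next array in the KMP algorithm.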

3, After constructing the trie and the fail pointers, scan the main string. This process is similar to the KMP algorithm, with one main difference: because the AC automaton matches several pattern strings at once, a word that ends at the current position as a suffix of the current path must not be missed. For this purpose a temp pointer is introduced, which walks along the fail pointers from the current node.
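The first two steps above (building the trie and the fail pointers) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original author's code: the names Node, build_trie, build_fail and fail_table, as well as the label field used for display, are introduced here for the example only. The fail pointers are filled in breadth-first, so that the fail pointer of a parent is always known before its children are processed.

```python
from collections import deque

class Node:
    def __init__(self):
        self.children = {}  # character -> child Node
        self.fail = None    # fail pointer (set by build_fail)
        self.count = 0      # number of pattern words ending at this node
        self.label = ""     # string spelled from root, for display only

def build_trie(words):
    """Step 1: insert every pattern word into a trie."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            if ch not in node.children:
                child = Node()
                child.label = node.label + ch
                node.children[ch] = child
            node = node.children[ch]
        node.count += 1
    return root

def build_fail(root):
    """Step 2: set fail pointers level by level (breadth-first)."""
    root.fail = root
    queue = deque()
    for child in root.children.values():
        child.fail = root           # depth-1 nodes always fall back to root
        queue.append(child)
    while queue:
        node = queue.popleft()
        for ch, child in node.children.items():
            f = node.fail
            # Climb the parent's fail chain until some node has a ch edge.
            while f is not root and ch not in f.children:
                f = f.fail
            child.fail = f.children.get(ch, root)
            queue.append(child)
    return root

def fail_table(root):
    """Map each node's string to the string of its fail target."""
    table = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is not root:
            table[node.label] = node.fail.label
        queue.extend(node.children.values())
    return table

root = build_fail(build_trie(["say", "she", "shr", "he", "her"]))
table = fail_table(root)
print(table)
```

In the printed table an empty string stands for root. For the five example words, only two nodes have a fail pointer that does not lead to root: sh falls back to h, and she falls back to he.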

The matching process distinguishes two cases. (1) The current character matches: there is an edge from the current node labeled with the current character of the main string, so we simply follow it to the next node, and the main-string pointer moves on to the next character. (2) The current character does not match: matching continues from the node that the fail pointer of the current node points to. If the pointer ends up at root and root has no matching edge either, no pattern string matches at this position, and matching restarts from root with the next character. These two cases are repeated until the end of the main string is reached.

To make this concrete, walk through the matching process in detail. Assume the main string is yasherhs and we need to count how many of the pattern words occur in it. For i=0,1 there is no corresponding path in the trie, so nothing happens. For i=2,3,4 the pointer p walks down s, h, e to the e node that ends she. The count of that node is 1, so the total count is increased by 1 and the node's count is set to -1, which marks the word as already counted and prevents it from being counted again. temp then moves to the node that this e node's fail pointer refers to, the e node that ends he, where the same check is repeated, and so on until temp points to root and the while loop exits. During this loop the count increases by 2, which means that the two words she and he were found. When i=5, the current node has no r child, so p moves to its fail pointer node, the e node of he, and from there to that node's r child. The r node has a count of 1, so the count is increased by 1 again (the word her), and the loop runs until temp points to root. For i=6,7 no further word is found, and the matching process ends with a total count of 3.
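The walkthrough can be reproduced with a short, self-contained Python sketch. It is an illustration only and not the listing the original text refers to: the function names build_automaton and count_words and the list-based node layout are assumptions made for this example, while p, temp and the -1 marker follow the description above. The pattern words and the main string are written in lowercase.

```python
from collections import deque

def build_automaton(words):
    """Build the trie and fail pointers; node 0 is root."""
    children = [{}]  # children[n]: character -> node id
    fail = [0]       # fail[n]: node id of the fail target
    count = [0]      # count[n]: pattern words ending at node n
    for word in words:
        node = 0
        for ch in word:
            if ch not in children[node]:
                children[node][ch] = len(children)
                children.append({})
                fail.append(0)
                count.append(0)
            node = children[node][ch]
        count[node] += 1
    # Breadth-first pass; depth-1 nodes keep fail = root (0).
    queue = deque(children[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in children[node].items():
            f = fail[node]
            while f and ch not in children[f]:
                f = fail[f]
            fail[child] = children[f].get(ch, 0)
            queue.append(child)
    return children, fail, count

def count_words(words, text):
    """Count how many pattern words occur in text (each counted once)."""
    children, fail, count = build_automaton(words)
    found = 0
    p = 0
    for ch in text:
        # Case (2): mismatch, follow fail pointers until an edge exists
        # or root is reached.
        while p and ch not in children[p]:
            p = fail[p]
        # Case (1): follow the edge if there is one, else stay at root.
        p = children[p].get(ch, 0)
        # temp walks the fail chain so that no shorter word is missed.
        temp = p
        while temp and count[temp] != -1:
            found += count[temp]
            count[temp] = -1  # mark as counted (also skips visited nodes)
            temp = fail[temp]
    return found

result = count_words(["say", "she", "shr", "he", "her"], "yasherhs")
print(result)  # 3 (she, he, her)
```

Because every visited node is marked with -1, each node on a fail chain is processed at most once, and a word that occurs several times in the main string is still counted only once.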
