[Pku1_1 + pku2778] AC Automaton + Dynamic Programming

Source: Internet
Author: User

Before getting to the problems, let's talk about the AC automaton.

Seeing something as auspicious as "AC" in the name naturally makes you want to learn what an AC automaton is.

The full name, Aho-Corasick automaton, is quite long, but experience tells us that the longer the name, the shorter the code, so don't be scared off by such a fearsome name.

Now, on to the topic.

An AC automaton is an algorithm for multi-pattern string matching (not some magic data structure). It is essentially trie + KMP.

The idea of the algorithm is very simple. Build a next pointer on the trie (equivalent to the next function in KMP); then, for each text, we can use this trie to find which words appear in it (the template below only records whether each word appears; counting how many times each one occurs needs extra bookkeeping).

The key to this algorithm is computing next[i]. The method is as follows:

For each node i, if it is the root, next[i] = 0; otherwise:

Let c be the character on the edge from i's parent to i. Walk along the next chain of the parent (that is, j := next[parent], then j := next[j], and so on; the parent itself obviously does not count) until you find the first node that has a child on character c.

next[i] = that child. If there is no such node, next[i] = root.

This is simple. To implement it, a breadth-first search is enough.
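The rule above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the author's template: the function name build_automaton and the arrays children, fail and is_word are made up for this sketch, and the root is node 0 here rather than node 1.

```python
from collections import deque

def build_automaton(words):
    """Return (children, fail, is_word) for the given words; node 0 is the root."""
    children = [{}]      # children[v][ch] -> child node reached from v on ch
    is_word = [False]    # is_word[v]: some word ends exactly at node v
    for w in words:
        v = 0
        for ch in w:
            if ch not in children[v]:
                children.append({})
                is_word.append(False)
                children[v][ch] = len(children) - 1
            v = children[v][ch]
        is_word[v] = True

    fail = [0] * len(children)            # plays the role of next[] in the text
    queue = deque(children[0].values())   # depth-1 nodes fall back to the root
    while queue:
        v = queue.popleft()
        for ch, child in children[v].items():
            j = fail[v]                   # start from the parent's link
            while j != 0 and ch not in children[j]:
                j = fail[j]
            fail[child] = children[j].get(ch, 0)
            queue.append(child)
    return children, fail, is_word
```

Because the search is breadth-first, every node on the parent's chain is shallower than the node being processed, so its link is already final when it is read.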

Then, given a text, how do we match it against the given words? We simply walk the trie using the next pointers we just built (just like KMP).

Pay special attention to one thing: every time a character is matched, you must walk back along the next chain to check whether any word ends exactly at this position. For words that have already been found, we can simply clear the mark (this is the place marked //!! in the code below).
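The matching loop and the chain walk can be sketched as follows. This is a minimal Python illustration, assuming root = 0 and a hypothetical function name count_present_words; it counts how many of the inserted words occur in the text, clearing each mark after its first hit as described above.

```python
from collections import deque

def count_present_words(words, text):
    """Count how many inserted words occur in text (each trie node counted once)."""
    children, word = [{}], [0]
    for w in words:
        v = 0
        for ch in w:
            if ch not in children[v]:
                children.append({})
                word.append(0)
                children[v][ch] = len(children) - 1
            v = children[v][ch]
        word[v] = 1

    fail = [0] * len(children)
    queue = deque(children[0].values())
    while queue:
        v = queue.popleft()
        for ch, child in children[v].items():
            j = fail[v]
            while j and ch not in children[j]:
                j = fail[j]
            fail[child] = children[j].get(ch, 0)
            queue.append(child)

    found, v = 0, 0
    for ch in text:
        while v and ch not in children[v]:
            v = fail[v]
        v = children[v].get(ch, 0)
        k = v
        while k:                  # the "!!" step: visit every suffix that is a trie node
            found += word[k]
            word[k] = 0           # clear the mark so a word is counted only once
            k = fail[k]
    return found
```

For example, with the words he, she, his, hers and the text ushers, the word he is only found through the chain walk started at the node for she; without that walk it would be missed.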

There is a small implementation trick that makes the code more concise: set C[0, c] = 1 for every character c in the trie (my root = 1). When the next chain runs off the root and reaches node 0, C[0, c] leads straight back to the root, so there is no need for a (k <> root) and ... test. It looks much cleaner.
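The sentinel trick can be sketched like this in Python. The names build_with_sentinel and step, and the fixed alphabet ACGT, are assumptions for this illustration; the layout (node 0 as sentinel, root = 1, arrays c and nxt) mirrors the description above.

```python
ALPHABET = "ACGT"   # assumed alphabet for this sketch

def build_with_sentinel(words):
    """Trie with root = 1 and a sentinel node 0 whose every edge leads to the root."""
    idx = {ch: k for k, ch in enumerate(ALPHABET)}
    c = [[1] * len(ALPHABET), [0] * len(ALPHABET)]   # c[0][*] = 1 is the trick
    for w in words:
        v = 1
        for ch in w:
            if c[v][idx[ch]] == 0:
                c.append([0] * len(ALPHABET))
                c[v][idx[ch]] = len(c) - 1
            v = c[v][idx[ch]]
    nxt = [0] * len(c)            # nxt[1] = 0: the root falls back to the sentinel
    queue = [1]
    for v in queue:               # appending while iterating yields BFS order
        for k in range(len(ALPHABET)):
            child = c[v][k]
            if child:
                j = nxt[v]
                while c[j][k] == 0:   # no root test: the sentinel always has an edge
                    j = nxt[j]
                nxt[child] = c[j][k]
                queue.append(child)
    return c, nxt, idx

def step(c, nxt, idx, v, ch):
    """Advance from node v on character ch, again without any root check."""
    while c[v][idx[ch]] == 0:
        v = nxt[v]
    return c[v][idx[ch]]
```

Both loops terminate because every chain eventually reaches node 0, and node 0 has an edge on every character.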

The following is my ac_auto template.

    program syj; {Aho-Corasick automaton, O(n) matching}
    const
      maxt = 500000 + 5;
    type
      stype = ansistring;
    var
      i, j, k, n, m, t, ans, task: longint;
      cc: char;
      st, a: stype;
      word, next, q: array[0..maxt] of longint;
      c: array[0..maxt, 'A'..'Z'] of longint; {adjust the range to the input alphabet}

    {compute next[] for every node by breadth-first search}
    procedure bfs;
    var st, ed, i, j: longint;
        cc: char;
    begin
      st := 0; ed := 1; q[1] := 1;
      while st < ed do begin
        inc(st); i := q[st];
        for cc := 'A' to 'Z' do if c[i, cc] > 0 then begin
          inc(ed); q[ed] := c[i, cc]; j := next[i];
          while (j > 0) and (c[j, cc] = 0) do j := next[j];
          next[c[i, cc]] := c[j, cc];
        end;
      end;
    end;

    {insert the word held in st into the trie}
    procedure ins(k: longint);
    var i, j: longint;
    begin
      i := 1;
      for j := 1 to length(st) do
        if c[i, st[j]] > 0 then i := c[i, st[j]]
        else begin
          inc(t); c[i, st[j]] := t; i := t;
        end;
      word[i] := 1;
    end;

    begin
      assign(input, 'input.txt'); reset(input);
      assign(output, 'output.txt'); rewrite(output);
      for cc := 'A' to 'Z' do c[0, cc] := 1; {sentinel: node 0 leads back to root}
      readln(n, m); t := 1;
      readln(a);
      for i := 1 to n do begin
        readln(st); ins(i);
      end;
      bfs; j := 1;
      for i := 1 to m do begin
        cc := a[i];
        while c[j, cc] = 0 do j := next[j];
        j := c[j, cc]; k := j;
        while k > 1 do begin //!!
          ans := ans + word[k]; word[k] := 0;
          k := next[k];
        end;
      end;
      writeln(ans);
      close(input);
      close(output);
    end.

The first time I wrote this template, it took several hours to get it accepted (the //!! part kept going wrong), and later I found that every problem that uses the AC automaton needs care at exactly this point!

 

With the algorithm covered, let's take a look at the problems.

Pku1_1

Given a number of words and a text, find the minimum number of characters of the text that must be modified so that it no longer contains any of the words. The alphabet is only ACGT.

Analysis:

DP is obviously the way to go, but designing the state is a little difficult (I could not have done it before learning the AC automaton).

Having learned it, I considered doing the DP on the automaton's trie.

Suppose the first i characters of the original string have been processed and we are now at node j of the trie. The pair (i, j) then describes the entire matching state, with no after-effects, and the algorithm follows directly.

Algorithm:

Let:

f[i, j] be the minimum number of changes needed so that the first i characters of the original string have been processed and we are currently at node j of the trie. For character i + 1, either change it to another letter or leave it unchanged, and use the matching step above to push to the next state. Note that the node we move to must not match any word.

Boundary: f[0, 1] = 0. Answer: min{f[len, i]} over nodes i with word[i] = false.

Note that every time you move to a node, you have to scan its next chain to see whether word[] is true anywhere along it; if it is, the transition is not allowed (as mentioned above). The output format is also fiddly, and the strings need ansistring (I got WA 3 times, once because of ansistring). Remember to clear the arrays between test cases.
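The DP can be sketched in Python as below. This is an illustration under assumptions rather than the author's code: the name min_changes is hypothetical, only two rows of f are kept, and the "does any word end on the next chain" flag is precomputed once per node during the BFS instead of being rescanned on every transition.

```python
def min_changes(words, text, alphabet="ACGT"):
    """Minimum number of characters of text to change so that no word occurs; -1 if impossible."""
    idx = {ch: k for k, ch in enumerate(alphabet)}
    a = len(alphabet)
    c = [[1] * a, [0] * a]          # node 0: sentinel, node 1: root
    bad = [False, False]            # bad[v]: some word ends at v or on its next chain
    for w in words:
        v = 1
        for ch in w:
            if c[v][idx[ch]] == 0:
                c.append([0] * a)
                bad.append(False)
                c[v][idx[ch]] = len(c) - 1
            v = c[v][idx[ch]]
        bad[v] = True
    nxt = [0] * len(c)
    order = [1]
    for v in order:                 # BFS; the list grows while we iterate
        for k in range(a):
            child = c[v][k]
            if child:
                j = nxt[v]
                while c[j][k] == 0:
                    j = nxt[j]
                nxt[child] = c[j][k]
                bad[child] = bad[child] or bad[nxt[child]]
                order.append(child)

    def step(v, k):
        while c[v][k] == 0:
            v = nxt[v]
        return c[v][k]

    INF = float("inf")
    f = [INF] * len(c)
    f[1] = 0                        # boundary: nothing read yet, standing at the root
    for ch in text:
        g = [INF] * len(c)
        for v in range(1, len(c)):
            if f[v] == INF:
                continue
            for k in range(a):      # choose the letter written at this position
                u = step(v, k)
                if not bad[u]:
                    cost = f[v] + (k != idx[ch])
                    if cost < g[u]:
                        g[u] = cost
        f = g
    best = min(f[1:])
    return -1 if best == INF else best
```

For instance, with the words AAA and AAG and the text AAAG, changing the third character to C is enough, so the result is 1.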

Code:

    program syj; {DP + ac_automation}
    const
      maxl = 1000 + 5;
      maxt = 20 * 50 + 5;
      oo = maxlongint shr 1;
    var
      n, i, j, k, kk, l, len, t, ans, task, r: longint;
      u: array['A'..'Z'] of longint;
      st: ansistring;
      ok: boolean;
      word: array[0..maxt] of boolean;
      next, q: array[0..maxt] of longint;
      f: array[0..maxl, 0..maxt] of longint;
      c: array[0..maxt, 1..4] of longint;

    {insert the word held in st into the trie}
    procedure new;
    var i, j: longint;
    begin
      j := 1;
      for i := 1 to length(st) do
        if c[j, u[st[i]]] > 0 then j := c[j, u[st[i]]]
        else begin
          inc(t); c[j, u[st[i]]] := t; j := t;
        end;
      word[j] := true;
    end;

    {compute next[] for every node by breadth-first search}
    procedure bfs;
    var st, ed, i, j, k: longint;
    begin
      st := 0; ed := 1; q[1] := 1;
      while st < ed do begin
        inc(st); i := q[st];
        for j := 1 to 4 do if c[i, j] > 0 then begin
          inc(ed); q[ed] := c[i, j]; k := next[i];
          while c[k, j] = 0 do k := next[k];
          next[c[i, j]] := c[k, j];
        end;
      end;
    end;

    begin
      assign(input, 'pku1_1.in'); reset(input);
      assign(output, 'pku1_1.out'); rewrite(output);
      u['A'] := 1; u['C'] := 2; u['G'] := 3; u['T'] := 4;
      readln(n); task := 0;
      while n > 0 do begin
        inc(task);
        {clear everything between test cases}
        fillchar(next, sizeof(next), 0);
        fillchar(word, sizeof(word), 0);
        fillchar(c, sizeof(c), 0);
        t := 1;
        for i := 1 to n do begin
          readln(st); new;
        end;
        {sentinel: node 0 leads back to root}
        c[0, 1] := 1; c[0, 2] := 1; c[0, 3] := 1; c[0, 4] := 1;
        bfs;
        readln(st); len := length(st);
        filldword(f, sizeof(f) div 4, oo);
        f[0, 1] := 0;
        for i := 0 to len - 1 do
          for j := 1 to t do if f[i, j] < oo then begin
            kk := u[st[i + 1]];
            for k := 1 to 4 do begin
              l := j;
              while c[l, k] = 0 do l := next[l];
              l := c[l, k]; ok := word[l]; r := l;
              {scan the next chain for a matched word}
              while r > 1 do begin
                r := next[r]; ok := ok or word[r];
                if ok then break;
              end;
              if not ok and (f[i, j] + ord(k <> kk) < f[i + 1, l]) then
                f[i + 1, l] := f[i, j] + ord(k <> kk);
            end;
          end;
        ans := oo;
        for i := 1 to t do if not word[i] and (f[len, i] < ans) then
          ans := f[len, i];
        if ans = oo then ans := -1;
        writeln('Case ', task, ': ', ans);
        readln(n);
      end;
      close(input);
      close(output);
    end.

 

Pku2778

Given several words, count the DNA strings of length n (containing only 'A', 'C', 'G', 'T') that do not contain any of the words, mod 100000 (n <= 200000000).

Analysis and algorithm:

As in the previous problem, DP is the right direction.

Here the astronomical range of n tells us that this is a matrix-accelerated DP problem, so I will only talk about how to construct the transition matrix. For more about matrix optimization, see other material online.

For each node i in the trie, enumerate the next character and let j be the node reached. As long as no word is matched there (the same point to watch out for: scan the next chain), the transition is allowed: inc(G[i, j]).

Then compute F = G^n by fast exponentiation. ans = sigma(F[1, i]) mod mo over nodes i with word[i] = false (my root = 1).
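The matrix construction and the power can be sketched in Python as follows. The name count_safe_strings is hypothetical, the unsafe flag is precomputed per node rather than rescanned, and the power uses iterative binary exponentiation instead of the recursive form; treat it as an illustration under those assumptions.

```python
def count_safe_strings(words, n, mod=100000, alphabet="ACGT"):
    """Number of length-n strings over alphabet containing none of the words, modulo mod."""
    idx = {ch: k for k, ch in enumerate(alphabet)}
    a = len(alphabet)
    c = [[1] * a, [0] * a]          # node 0: sentinel, node 1: root
    bad = [False, False]
    for w in words:
        v = 1
        for ch in w:
            if c[v][idx[ch]] == 0:
                c.append([0] * a)
                bad.append(False)
                c[v][idx[ch]] = len(c) - 1
            v = c[v][idx[ch]]
        bad[v] = True
    nxt = [0] * len(c)
    order = [1]
    for v in order:                 # BFS; the list grows while we iterate
        for k in range(a):
            child = c[v][k]
            if child:
                j = nxt[v]
                while c[j][k] == 0:
                    j = nxt[j]
                nxt[child] = c[j][k]
                bad[child] = bad[child] or bad[nxt[child]]
                order.append(child)

    size = len(c)
    G = [[0] * size for _ in range(size)]   # G[i][j]: letters moving safe node i to safe node j
    for v in range(1, size):
        if bad[v]:
            continue
        for k in range(a):
            u = v
            while c[u][k] == 0:
                u = nxt[u]
            u = c[u][k]
            if not bad[u]:
                G[v][u] += 1

    def mat_mul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(size)) % mod
                 for j in range(size)] for i in range(size)]

    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base, e = G, n
    while e:                        # binary exponentiation: result = G ** n
        if e & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        e >>= 1
    return sum(result[1]) % mod     # all walks of length n that start at the root
```

With the words AT, AC, AG, AA and n = 3, the letter A may only appear in the last position, giving 3 * 3 * 4 = 36 strings.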

Code:

    program syj; {DP + ac_automation}
    const
      maxt = 100;
      mo = 100000;
    type
      arr = array[1..maxt, 1..maxt] of int64;
    var
      m, t, i, j, k, l: longint;
      u: array['A'..'Z'] of longint;
      st: ansistring;
      ok: boolean;
      n, ans: int64;
      word: array[0..maxt] of boolean;
      next, q: array[0..maxt] of longint;
      f, g: arr;
      c: array[0..maxt, 1..4] of longint;

    {read one word and insert it into the trie}
    procedure init;
    var i, j: longint;
    begin
      readln(st); j := 1;
      for i := 1 to length(st) do
        if c[j, u[st[i]]] > 0 then j := c[j, u[st[i]]]
        else begin
          inc(t); c[j, u[st[i]]] := t; j := t;
        end;
      word[j] := true;
    end;

    {compute next[] for every node by breadth-first search}
    procedure bfs;
    var st, ed, i, j, k: longint;
    begin
      st := 0; ed := 1; q[1] := 1;
      while st < ed do begin
        inc(st); i := q[st];
        for j := 1 to 4 do if c[i, j] > 0 then begin
          inc(ed); q[ed] := c[i, j]; k := next[i];
          while c[k, j] = 0 do k := next[k];
          next[c[i, j]] := c[k, j];
        end;
      end;
    end;

    {matrix product c := a * b (mod mo); a and b are passed by value,
     so calling cheng(f, f, f) is safe}
    procedure cheng(a, b: arr; var c: arr);
    var i, j, k: longint;
        tmp: int64;
    begin
      for i := 1 to t do
        for j := 1 to t do begin
          tmp := 0;
          for k := 1 to t do tmp := tmp + a[i, k] * b[k, j];
          c[i, j] := tmp mod mo;
        end;
    end;

    {f := g^i by repeated squaring}
    procedure calc(i: longint);
    begin
      if i = 1 then f := g
      else begin
        calc(i div 2);
        cheng(f, f, f);
        if odd(i) then cheng(f, g, f);
      end;
    end;

    begin
      assign(input, 'pku2778.in'); reset(input);
      assign(output, 'pku2778.out'); rewrite(output);
      u['A'] := 1; u['C'] := 2; u['G'] := 3; u['T'] := 4;
      {sentinel: node 0 leads back to root}
      c[0, 1] := 1; c[0, 2] := 1; c[0, 3] := 1; c[0, 4] := 1;
      readln(m, n); t := 1;
      for i := 1 to m do init;
      bfs;
      {build the transition matrix g}
      for i := 1 to t do
        for j := 1 to 4 do begin
          k := i;
          while c[k, j] = 0 do k := next[k];
          k := c[k, j]; l := k; ok := false;
          while l > 1 do begin
            ok := ok or word[l];
            l := next[l];
          end;
          if not ok then inc(g[i, k]);
        end;
      calc(n);
      ans := 0;
      for i := 1 to t do
        if not word[i] then inc(ans, f[1, i]);
      writeln(ans mod mo);
      close(input);
      close(output);
    end.

This problem went quite smoothly: accepted on the first try. I feel like I am getting better and better.

 

That is about it for the AC automaton. It turns out that the automaton is mainly used to optimize the state representation in DP problems on multi-pattern matching; there is more to think about in the future. I hope the AC automaton brings us more AC joy!

 

 
