"POJ 2778" DNA Sequence

Source: Internet
Author: User

Description

It is well known that a DNA sequence is a sequence containing only A, C, T and G, and that analyzing segments of a DNA sequence is very useful. For example, if an animal's DNA sequence contains the segment ATC, the animal may have a genetic disease. Scientists have found several such segments so far; the problem is to count how many kinds of DNA sequences of a species do not contain any of those segments.

Suppose that the DNA sequences of a species consist of A, C, T and G, and that the length of each sequence is a given integer n.

Input

The first line contains two integers m (0 <= m <= 10) and n (1 <= n <= 2000000000). Here, m is the number of genetic disease segments, and n is the length of the sequences.

Each of the next m lines contains one DNA genetic disease segment; the length of each segment is at most 10.

Output

An integer, the number of DNA sequences, mod 100000.

Sample Input

4 3
AT
AC
AG
AA

Sample Output

36
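
As a sanity check on the sample, a brute-force counter over all 4^n strings is easy to write for tiny n. The sketch below is illustrative only: the function name bruteForceCount is an assumption, not part of the original solution, and the approach is far too slow for the real limits.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: counts the strings of length n over {A, C, G, T}
// that contain none of the forbidden segments, by trying all 4^n strings.
// Only usable for very small n.
long long bruteForceCount(int n, const std::vector<std::string>& forbidden) {
    const std::string alphabet = "ACGT";
    long long total = 1;
    for (int i = 0; i < n; ++i) total *= 4;

    long long count = 0;
    for (long long code = 0; code < total; ++code) {
        // Decode the number `code` into a string, one base-4 digit per letter.
        std::string s(n, 'A');
        long long c = code;
        for (int i = 0; i < n; ++i) {
            s[i] = alphabet[c % 4];
            c /= 4;
        }
        bool ok = true;
        for (const std::string& p : forbidden) {
            if (s.find(p) != std::string::npos) {
                ok = false;
                break;
            }
        }
        if (ok) ++count;
    }
    return count;
}
```

For the sample (n = 3 with segments AT, AC, AG, AA) the letter A can only appear in the last position, so the count is 3 * 3 * 4 = 36, which is what this function returns.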

The main idea: given m strings, count the strings of length n that contain none of them as a substring. (Strings consist of A, C, G and T; m <= 10, n <= 2000000000, and each forbidden string has length at most 10.)

Analysis: we need the number of strings that avoid a given set of substrings, so the natural approach is an AC automaton plus dynamic programming.

AC automaton + dynamic programming (O(4 * n * S), where S is the number of Trie nodes): first read the forbidden strings and build a Trie from them, then build the failure pointers on the Trie with a BFS, then run the dynamic programming on the Trie. f[i][j] is the number of strings of length i whose last character lands on Trie node j. Note that a node marking the end of a forbidden string must never be entered, and neither may a node whose failure chain reaches such a node. The answer is the sum of f[n][j] over all nodes j.

Matrix multiplication with fast exponentiation (O(S^3 * log n)): n is very large, so the plain dynamic programming will certainly exceed the time limit. But the data range shows that m and the string lengths are tiny: in the worst case the Trie has 10 * 10 = 100 nodes plus the root, so arrays of size about 100 are enough. Even a 100 x 100 array is small, and that is what we exploit.

About 100 nodes means about 100 states. To move between states we can enumerate the 4 nodes each node points to, so handling one node costs only O(4). If we instead store the connectivity between every pair of nodes in a matrix, handling one node costs O(100), which is clearly slower, so what is the point? The point is the form of the recurrence. The original transition is f[i + 1][trie[j].to[k]] += f[i][j] (k from 0 to 3); with the matrix it becomes f[i + 1][k] += f[i][j] * mat[j][k] (k ranges over all nodes, and mat[j][k] is the number of letters that lead from node j to node k, which is 0 when the move is impossible or forbidden).
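
The automaton-plus-DP approach described above can be sketched as follows. This is a minimal illustration for small n only, not the author's code: the names Automaton, letterIndex, buildAutomaton and countByDp are assumptions made for this example.

```cpp
#include <queue>
#include <string>
#include <vector>

// All names here are illustrative; they do not come from the original solution.
struct Automaton {
    std::vector<std::vector<int>> next;  // next[node][letter]: transition table
    std::vector<int> fail;               // failure pointers
    std::vector<bool> bad;               // true if entering the node completes a forbidden string
};

int letterIndex(char ch) {
    return ch == 'A' ? 0 : ch == 'C' ? 1 : ch == 'T' ? 2 : 3;
}

Automaton buildAutomaton(const std::vector<std::string>& patterns) {
    Automaton ac;
    ac.next.push_back(std::vector<int>(4, 0));  // node 0 is the root
    ac.fail.push_back(0);
    ac.bad.push_back(false);
    for (const std::string& p : patterns) {     // insert each forbidden string into the Trie
        int node = 0;
        for (char ch : p) {
            int c = letterIndex(ch);
            if (ac.next[node][c] == 0) {
                ac.next[node][c] = static_cast<int>(ac.next.size());
                ac.next.push_back(std::vector<int>(4, 0));
                ac.fail.push_back(0);
                ac.bad.push_back(false);
            }
            node = ac.next[node][c];
        }
        ac.bad[node] = true;
    }
    std::queue<int> q;                          // BFS over the Trie, level by level
    for (int c = 0; c < 4; ++c)
        if (ac.next[0][c] != 0) q.push(ac.next[0][c]);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        if (ac.bad[ac.fail[u]]) ac.bad[u] = true;  // a suffix is already forbidden
        for (int c = 0; c < 4; ++c) {
            int v = ac.next[u][c];
            if (v != 0) {
                ac.fail[v] = ac.next[ac.fail[u]][c];
                q.push(v);
            } else {
                ac.next[u][c] = ac.next[ac.fail[u]][c];  // fill in the missing edge
            }
        }
    }
    return ac;
}

// f[j] holds the number of valid strings of the current length ending at node j.
int countByDp(int n, const std::vector<std::string>& patterns) {
    const int MOD = 100000;
    Automaton ac = buildAutomaton(patterns);
    int states = static_cast<int>(ac.next.size());
    std::vector<int> f(states, 0), g;
    if (!ac.bad[0]) f[0] = 1;                   // the empty string sits at the root
    for (int i = 0; i < n; ++i) {
        g.assign(states, 0);
        for (int j = 0; j < states; ++j) {
            if (f[j] == 0) continue;
            for (int c = 0; c < 4; ++c) {
                int to = ac.next[j][c];
                if (!ac.bad[to]) g[to] = (g[to] + f[j]) % MOD;
            }
        }
        f.swap(g);
    }
    int sum = 0;
    for (int x : f) sum = (sum + x) % MOD;
    return sum;
}
```

Each step of the outer loop costs 4 operations per state, so the loop is linear in n. That is fine for checking small cases but hopeless for n near 2000000000, which is why the write-up moves on to matrices.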
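Rewritten from the receiving side, the matrix form of the transition is f[i][j] = SUM(f[i - 1][k] * mat[k][j]). This is a linear recurrence whose coefficients do not depend on i, so it can be evaluated quickly with matrix multiplication: the vector f[n] is f[0] multiplied by the n-th power of mat. The rest of the work is simple: construct the matrix mat, raise it to the n-th power with fast exponentiation, and sum row 0 of the result (row 0 corresponds to the root, the only state where f[0] is 1).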
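
The fast-exponentiation step can be sketched on its own, independent of how the matrix was built. The names Matrix, multiply, power and countFromRoot below are assumptions for this illustration, and the sketch assumes the input entries are small non-negative counts (at most 4 in this problem) so that products fit in a long long.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;
const long long MOD = 100000;

// Product of two square matrices of the same size, entries reduced mod MOD.
Matrix multiply(const Matrix& x, const Matrix& y) {
    std::size_t s = x.size();
    Matrix r(s, std::vector<long long>(s, 0));
    for (std::size_t i = 0; i < s; ++i)
        for (std::size_t k = 0; k < s; ++k)
            for (std::size_t j = 0; j < s; ++j)
                r[i][j] = (r[i][j] + x[i][k] * y[k][j]) % MOD;
    return r;
}

// base^exp by repeated squaring: O(log exp) matrix multiplications.
Matrix power(Matrix base, long long exp) {
    std::size_t s = base.size();
    Matrix result(s, std::vector<long long>(s, 0));
    for (std::size_t i = 0; i < s; ++i) result[i][i] = 1;  // identity matrix
    while (exp > 0) {
        if (exp & 1) result = multiply(result, base);
        base = multiply(base, base);
        exp >>= 1;
    }
    return result;
}

// Number of length-n walks that start at state 0 (the root), mod MOD.
long long countFromRoot(const Matrix& mat, long long n) {
    Matrix p = power(mat, n);
    long long sum = 0;
    for (long long x : p[0]) sum = (sum + x) % MOD;
    return sum;
}
```

For the sample, only two automaton states are allowed: the root and the node for "A". Three letters keep the root at the root, one letter moves it to "A", and every letter after "A" is forbidden, so mat = {{3, 1}, {0, 0}}. Its third power has first row (27, 9), and countFromRoot(mat, 3) returns 36.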

Code:

#include <cstdio>
#include <cstring>

const int MAXN = 101;   // at most 10 segments * 10 letters, plus the root
const int MOD = 100000;

struct Matrix {
    int a[MAXN][MAXN];
} mat, ti;

// n: number of segments, m: sequence length (the reverse of the statement's names)
// f: failure pointers, t: Trie transitions, v: 1 if the node must not be entered
int n, m, len, last, tn, ans, f[MAXN], t[MAXN][4], v[MAXN];
char str[11];

// Build the failure pointers by BFS and fill in the missing transitions.
void bfs()
{
    int q[MAXN], hd, tl;
    for (q[hd = tl = 0] = 0; hd <= tl; hd++)
        for (int i = 0; i < 4; i++) {
            int u = q[hd], c = t[u][i];
            if (c) {
                if (u) f[c] = t[f[u]][i];
                v[c] |= v[f[c]];   // inherit the forbidden mark along the failure chain
                q[++tl] = c;
            } else {
                t[u][i] = t[f[u]][i];
            }
        }
}

// Matrix product modulo MOD over the tn + 1 nodes in use.
inline Matrix times(const Matrix &m1, const Matrix &m2)
{
    Matrix ret;
    memset(ret.a, 0, sizeof(ret.a));
    for (int i = 0; i <= tn; i++)
        for (int j = 0; j <= tn; j++)
            for (int k = 0; k <= tn; ret.a[i][j] %= MOD, k++)
                ret.a[i][j] += (long long) m1.a[i][k] * m2.a[k][j] % MOD;
    return ret;
}

// Map a letter to an index in 0..3.
inline int gc(char ch)
{
    return ch == 'A' ? 0 : ch == 'C' ? 1 : ch == 'T' ? 2 : 3;
}

int main()
{
    scanf("%d%d", &n, &m);
    for (int i = 0; i < n; i++) {
        scanf("%s", str);
        len = strlen(str);
        last = 0;
        for (int j = 0; j < len; j++) {   // insert the segment into the Trie
            int ch = gc(str[j]);
            if (!t[last][ch]) t[last][ch] = ++tn;
            last = t[last][ch];
        }
        v[last] = 1;
    }
    bfs();
    // mat.a[i][k] counts the letters that lead from allowed node i to allowed node k.
    for (int i = 0; i <= tn; i++)
        for (int j = 0; j < 4; j++)
            if (!v[t[i][j]] && !v[i]) mat.a[i][t[i][j]]++;
    for (int i = 0; i <= tn; i++) ti.a[i][i] = 1;   // identity matrix
    for (; m; m >>= 1) {                            // fast exponentiation: ti = mat^m
        if (m & 1) ti = times(ti, mat);
        mat = times(mat, mat);
    }
    for (int i = 0; i <= tn; i++) ans += ti.a[0][i];
    printf("%d", ans % MOD);
}
