Known members of the suffix family: suffix tree, suffix array, suffix automaton, suffix cactus, suffix oracle, suffix splay?
The suffix tree is arguably the ancestor of both the suffix array and the suffix automaton. It is still fairly powerful and is useful for palindrome and lexicographic-order problems, and there is now a linear-time way to build it. (In practice, though, I never used the tree afterwards.) Below is a comparison of the suffix automaton and the suffix array.
- Single-string problems (">=" means "better than", "&&" means "about the same"; the ratings below are personal impressions)
  - 1 Repeated substrings
    - 1.1 Longest repeated substring, overlaps allowed: suffix automaton >= suffix array. Both are basic, but the automaton code is slightly shorter.
    - 1.2 Longest repeated substring, no overlaps: suffix array >= suffix automaton. The suffix array makes the overlap check easy; the automaton has to record the occurrence positions of every state.
    - 1.3 Longest substring repeated at least k times, overlaps allowed: suffix automaton >= suffix array. The suffix array additionally needs binary search; the automaton needs no check, because a topological pass gives the occurrence count of every state directly.
  - 2 Counting substrings
    - 2.1 Number of distinct substrings: suffix automaton && suffix array. A basic feature of both, and easy to implement.
  - 3 Cyclic substring problems
    - 3.1 Minimal period: suffix array. The suffix automaton probably cannot do this.
    - 3.2 Contiguous repeated substring with the most repetitions: suffix array.
- Two-string problems
  - 1 Common substrings
    - 1.1 Longest common substring: suffix automaton && suffix array. A basic feature of both.
  - 2 Counting substrings
    - 2.1 Common substrings of a specified length: suffix automaton && suffix array. A basic feature of both.
- Multi-string problems
  - 1 Common substrings
    - 1.1 Longest substring that appears in k of the strings: generalized suffix automaton >= suffix array. (KMP can also find the longest common substring of several strings; which approach is faster depends on the data.)
    - 1.2 Longest substring that appears at least k times in every string: generalized suffix automaton >= suffix array.
    - 1.3 Longest substring that appears in every string, either as-is or reversed: generalized suffix automaton ? suffix array.
- Minimal representation (lexicographically smallest rotation): suffix automaton
- Minimal period: suffix array
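To make items 1.1, 1.3 and 2.1 of the single-string list concrete, here is a minimal sketch of a suffix automaton that counts distinct substrings and finds the longest substring occurring at least k times. It assumes lowercase input; the names `SuffixAutomaton`, `distinctSubstrings` and `longestRepeated` are illustrative and do not come from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Minimal single-string suffix automaton over 'a'..'z'.
// link/len play the role of fa/maxlen in the post's own code.
struct SuffixAutomaton {
    struct State {
        int next[26];
        int link;       // suffix link
        int len;        // length of the longest substring in this state
        long long cnt;  // number of occurrences (size of the endpos set)
    };
    std::vector<State> st;
    int last = 0;

    explicit SuffixAutomaton(const std::string& s) {
        st.push_back(makeState(0, -1));
        for (char ch : s) extend(ch);
        countOccurrences();
    }

    static State makeState(int len, int link) {
        State t;
        std::fill(t.next, t.next + 26, -1);
        t.link = link;
        t.len = len;
        t.cnt = 0;
        return t;
    }

    void extend(char ch) {
        assert(ch >= 'a' && ch <= 'z');
        const int c = ch - 'a';
        const int cur = static_cast<int>(st.size());
        st.push_back(makeState(st[last].len + 1, -1));
        st[cur].cnt = 1;  // cur ends at the newly added position
        int p = last;
        while (p != -1 && st[p].next[c] == -1) {
            st[p].next[c] = cur;
            p = st[p].link;
        }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            const int q = st[p].next[c];
            if (st[p].len + 1 == st[q].len) {
                st[cur].link = q;
            } else {
                const int clone = static_cast<int>(st.size());
                State copy = st[q];  // same transitions and link as q
                copy.len = st[p].len + 1;
                copy.cnt = 0;        // a clone owns no end position itself
                st.push_back(copy);
                while (p != -1 && st[p].next[c] == q) {
                    st[p].next[c] = clone;
                    p = st[p].link;
                }
                st[q].link = clone;
                st[cur].link = clone;
            }
        }
        last = cur;
    }

    // Sum counts up the suffix links, longest states first (a topological order).
    void countOccurrences() {
        std::vector<int> order(st.size());
        std::iota(order.begin(), order.end(), 0);
        std::sort(order.begin(), order.end(),
                  [&](int a, int b) { return st[a].len > st[b].len; });
        for (int v : order)
            if (st[v].link != -1) st[st[v].link].cnt += st[v].cnt;
    }

    // Item 2.1: state v contributes len(v) - len(link(v)) distinct substrings.
    long long distinctSubstrings() const {
        long long total = 0;
        for (std::size_t v = 1; v < st.size(); ++v)
            total += st[v].len - st[st[v].link].len;
        return total;
    }

    // Items 1.1 and 1.3: longest substring occurring at least k times (overlaps allowed).
    int longestRepeated(long long k) const {
        int best = 0;
        for (std::size_t v = 1; v < st.size(); ++v)
            if (st[v].cnt >= k) best = std::max(best, st[v].len);
        return best;
    }
};
```

For the string `aabbabd`, `distinctSubstrings()` gives 23 and `longestRepeated(2)` gives 2 (the substring `ab`). No binary search is needed: one pass over the states after the topological accumulation is enough.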
Personal impressions:
Single-string and two-string problems can almost all be solved with either a suffix array or a suffix automaton. For multi-string problems the generalized suffix automaton is very strong. Somewhat amusingly, if you insist on a suffix array there, you need RMQ (binary indexed tree or sparse table) plus binary search, and sometimes even a splay tree. Of course, a suffix array combined flexibly with other tools can handle a wide range of hard problems; after all, the suffix automaton has its limits too. Personally I prefer writing suffix automata: they feel a bit easier to implement, and the code looks nicer.
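For contrast, the "suffix array plus binary search" style mentioned above can be sketched for the single-string case (longest substring occurring at least k times). This is only an illustration under simplifying assumptions: the suffix array is built by O(n log^2 n) prefix doubling with `std::sort` rather than a contest-grade construction, the height array uses Kasai's method, and the RMQ structure is left out because this particular check only scans consecutive heights. All function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Suffix array by prefix doubling with std::sort, O(n log^2 n).
std::vector<int> buildSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rank(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rank[i] = static_cast<unsigned char>(s[i]);
    for (int k = 1;; k <<= 1) {
        // Compare suffixes by their first 2k characters using the current ranks.
        auto cmp = [&](int a, int b) {
            if (rank[a] != rank[b]) return rank[a] < rank[b];
            const int ra = a + k < n ? rank[a + k] : -1;
            const int rb = b + k < n ? rank[b + k] : -1;
            return ra < rb;
        };
        std::sort(sa.begin(), sa.end(), cmp);
        if (n > 0) tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
        rank = tmp;
        if (n == 0 || rank[sa[n - 1]] == n - 1) break;  // all ranks distinct
    }
    return sa;
}

// height[i] = LCP of the suffixes sa[i-1] and sa[i]; height[0] = 0 (Kasai).
std::vector<int> buildHeight(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> rank(n), height(n, 0);
    for (int i = 0; i < n; ++i) rank[sa[i]] = i;
    for (int i = 0, h = 0; i < n; ++i) {
        if (rank[i] == 0) { h = 0; continue; }
        const int j = sa[rank[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        height[rank[i]] = h;
        if (h > 0) --h;
    }
    return height;
}

// Longest substring occurring at least k times (overlaps allowed), k >= 2.
// Binary search on the length L: L is feasible if some k-1 consecutive
// height values are all >= L, i.e. k adjacent suffixes share a prefix of length L.
int longestRepeatedAtLeastK(const std::string& s, int k) {
    assert(k >= 2);
    const std::vector<int> sa = buildSuffixArray(s);
    const std::vector<int> height = buildHeight(s, sa);
    const int n = static_cast<int>(s.size());
    auto feasible = [&](int len) {
        int run = 0;
        for (int i = 1; i < n; ++i) {
            run = height[i] >= len ? run + 1 : 0;
            if (run >= k - 1) return true;
        }
        return false;
    };
    int lo = 0, hi = n;  // length 0 is always treated as feasible
    while (lo < hi) {
        const int mid = (lo + hi + 1) / 2;
        if (feasible(mid)) lo = mid; else hi = mid - 1;
    }
    return lo;
}
```

Compared with the automaton, the extra moving part is the feasibility check inside the binary search. For multi-string problems that check typically also has to track which string each suffix belongs to, which is where the additional data structures come in.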
The following compares how multi-string problems are handled.
Problems solved with a generalized suffix automaton:
POJ3294. Problem: given some strings, find the longest substring that appears in more than half of them (print every such substring).
Comparison: with a suffix array this takes binary search + RMQ, while the generalized suffix automaton only has to record which strings each state occurs in and then make one final pass.
    #include <cstdio>
    #include <cstring>
    #define MAXN 350003
    using namespace std;

    int n, len, ans, now;  // now: index of the string currently being inserted
    char s[1010], cap[1010];

    struct SAM {
        int ch[MAXN][26], fa[MAXN], maxlen[MAXN], last, sz;
        // size[x]: number of strings containing state x
        // nxt[x]:  index of the last string that was counted for x
        int root, nxt[MAXN], size[MAXN];

        void init() {
            sz = 0; root = ++sz;
            memset(size, 0, sizeof(size));
            memset(ch[1], 0, sizeof(ch[1]));
            memset(nxt, 0, sizeof(nxt));
        }
        void add(int x) {
            int np = ++sz, p = last;
            last = np;
            memset(ch[np], 0, sizeof(ch[np]));
            maxlen[np] = maxlen[p] + 1;
            while (p && !ch[p][x]) ch[p][x] = np, p = fa[p];
            if (!p) fa[np] = 1;
            else {
                int q = ch[p][x];
                if (maxlen[p] + 1 == maxlen[q]) fa[np] = q;
                else {
                    int nq = ++sz;
                    memcpy(ch[nq], ch[q], sizeof(ch[q]));
                    size[nq] = size[q];
                    nxt[nq] = nxt[q];
                    maxlen[nq] = maxlen[p] + 1;
                    fa[nq] = fa[q];
                    fa[q] = fa[np] = nq;
                    while (p && ch[p][x] == q) ch[p][x] = nq, p = fa[p];
                }
            }
            // Walk up the suffix links and count string `now` once per state.
            for (; np; np = fa[np])
                if (nxt[np] != now) { size[np]++; nxt[np] = now; }
                else break;
        }
        void dfs(int x, int d) {  // output all answers in lexicographic order
            if (d != maxlen[x] || d > ans) return;
            if (maxlen[x] == ans && size[x] > n) { puts(cap); return; }
            for (int i = 0; i < 26; ++i)
                if (ch[x][i]) {
                    cap[d] = i + 'a';
                    dfs(ch[x][i], d + 1);
                    cap[d] = 0;
                }
        }
    };
    SAM sam;

    int main() {
        while (~scanf("%d", &n) && n) {
            sam.init();
            for (int i = 1; i <= n; i++) {
                scanf("%s", s + 1);
                sam.last = sam.root;  // restart from the root for every string
                len = strlen(s + 1);
                now = i;
                for (int j = 1; j <= len; j++) sam.add(s[j] - 'a');
            }
            ans = 0;
            n >>= 1;  // a state qualifies if it occurs in more than n/2 strings
            for (int i = 1; i <= sam.sz; i++)
                if (sam.size[i] > n && sam.maxlen[i] > ans) ans = sam.maxlen[i];
            if (ans) sam.dfs(1, 0);
            else puts("?");
            puts("");
        }
        return 0;
    }
SPOJ8093. Problem: given some template strings, report for each query string how many template strings it appears in.
Comparison: same as above. There are two ways to propagate the counts: either propagate upward after every added character, or use a bitset to record which strings each state occurs in and, once all strings have been added, topologically sort the states and OR the sets upward.
    #include <cstdio>
    #include <cstring>
    #define N 200003
    using namespace std;

    int ch[N][26], fa[N], l[N], n, m, len;
    // siz[x]: number of template strings containing state x
    // nxt[x]: index of the last template string that was counted for x
    int cnt, np, p, nq, q, last, root, nxt[N], now, siz[N];
    char s[N];

    void extend(int x) {
        int c = s[x] - 'a';
        p = last; np = ++cnt; last = np;
        l[np] = l[p] + 1;
        for (; p && !ch[p][c]; p = fa[p]) ch[p][c] = np;
        if (!p) fa[np] = root;
        else {
            q = ch[p][c];
            if (l[q] == l[p] + 1) fa[np] = q;
            else {
                nq = ++cnt; l[nq] = l[p] + 1;
                memcpy(ch[nq], ch[q], sizeof ch[nq]);
                siz[nq] = siz[q];
                nxt[nq] = nxt[q];
                fa[nq] = fa[q];
                fa[q] = fa[np] = nq;
                for (; ch[p][c] == q; p = fa[p]) ch[p][c] = nq;
            }
        }
        // Propagate one character at a time: count string `now` once per state.
        for (; np; np = fa[np])
            if (nxt[np] != now) { siz[np]++; nxt[np] = now; }
            else break;
    }

    int main() {
        scanf("%d%d", &n, &m);
        root = ++cnt;
        for (int i = 1; i <= n; i++) {
            scanf("%s", s + 1);
            last = root;  // restart from the root for every template string
            len = strlen(s + 1);
            now = i;
            for (int j = 1; j <= len; j++) extend(j);
        }
        for (int i = 1; i <= m; i++) {
            scanf("%s", s + 1);
            len = strlen(s + 1);
            p = root;
            // State 0 acts as a dead state: ch[0][*] == 0 and siz[0] == 0.
            for (int j = 1; j <= len; j++) p = ch[p][s[j] - 'a'];
            printf("%d\n", siz[p]);
        }
        return 0;
    }
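The second way mentioned for SPOJ8093 (record owners in a bitset, add all strings, then OR the sets upward in topological order) might be sketched as follows. This is not the post's code: `GeneralizedSam`, `kMaxStrings` and the method names are hypothetical, the bound of 128 strings is an arbitrary assumption, and `extend` uses the common variant that reuses or clones an existing transition instead of always creating a new state.

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

constexpr int kMaxStrings = 128;  // assumed upper bound on the number of template strings

struct GeneralizedSam {
    std::vector<std::array<int, 26>> next;
    std::vector<int> link, len;
    std::vector<std::bitset<kMaxStrings>> owners;  // owners[v][i]: string i contains v's substrings

    GeneralizedSam() { newState(0, -1); }

    int newState(int length, int suffixLink) {
        std::array<int, 26> empty;
        empty.fill(-1);
        next.push_back(empty);
        len.push_back(length);
        link.push_back(suffixLink);
        owners.emplace_back();
        return static_cast<int>(len.size()) - 1;
    }

    int cloneState(int q, int length) {
        const int clone = newState(length, link[q]);
        next[clone] = next[q];
        return clone;
    }

    // Appends character c after state `last`; returns the state of the extended prefix.
    int extend(int last, int c) {
        if (next[last][c] != -1) {  // transition already exists from an earlier string
            const int q = next[last][c];
            if (len[q] == len[last] + 1) return q;
            const int clone = cloneState(q, len[last] + 1);
            for (int p = last; p != -1 && next[p][c] == q; p = link[p]) next[p][c] = clone;
            link[q] = clone;
            return clone;
        }
        const int cur = newState(len[last] + 1, -1);
        int p = last;
        for (; p != -1 && next[p][c] == -1; p = link[p]) next[p][c] = cur;
        if (p == -1) { link[cur] = 0; return cur; }
        const int q = next[p][c];
        if (len[q] == len[p] + 1) { link[cur] = q; return cur; }
        const int clone = cloneState(q, len[p] + 1);
        for (; p != -1 && next[p][c] == q; p = link[p]) next[p][c] = clone;
        link[q] = link[cur] = clone;
        return cur;
    }

    void addString(const std::string& s, int id) {
        assert(id >= 0 && id < kMaxStrings);
        int last = 0;
        for (char ch : s) {
            assert(ch >= 'a' && ch <= 'z');
            last = extend(last, ch - 'a');
            owners[last].set(id);  // only the prefix states are marked here
        }
    }

    // Call once after all strings are added: OR the sets up the suffix links,
    // visiting states in decreasing len order (children before parents).
    void propagate() {
        std::vector<int> order(len.size());
        std::iota(order.begin(), order.end(), 0);
        std::sort(order.begin(), order.end(), [&](int a, int b) { return len[a] > len[b]; });
        for (int v : order)
            if (link[v] != -1) owners[link[v]] |= owners[v];
    }

    // Number of added strings that contain `pattern` as a substring.
    int countContaining(const std::string& pattern) const {
        int v = 0;
        for (char ch : pattern) {
            if (ch < 'a' || ch > 'z') return 0;
            v = next[v][ch - 'a'];
            if (v == -1) return 0;
        }
        return static_cast<int>(owners[v].count());
    }
};
```

With the strings `abc`, `bcd` and `abcd` added under ids 0 to 2 and `propagate()` called once, `countContaining("bc")` returns 3 and `countContaining("abc")` returns 2. The trade-off against the per-character propagation in the code above is memory: one bitset per state instead of two integers.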
(I am not as fluent with suffix arrays yet; I will add suffix-array versions later, after doing more problems.)
By the way, here are two illustrations of a suffix automaton:
| State | Substrings | Endpos |
| --- | --- | --- |
| S | empty string | {0,1,2,3,4,5,6} |
| 1 | a | {1,2,5} |
| 2 | aa | {2} |
| 3 | aab | {3} |
| 4 | aabb, abb, bb | {4} |
| 5 | b | {3,4,6} |
| 6 | aabba, abba, bba, ba | {5} |
| 7 | aabbab, abbab, bbab, bab | {6} |
| 8 | ab | {3,6} |
| 9 | aabbabd, abbabd, bbabd, babd, abd, bd, d | {7} |
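The table can be cross-checked by brute force. The sketch below assumes the underlying string is `aabbabd` (inferred from the longest substring of state 9) and groups every distinct non-empty substring by its set of 1-based end positions. Each group corresponds to one non-initial state of the automaton. The function name `endposClasses` is made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Groups all distinct non-empty substrings of s by their endpos set
// (1-based end positions, ascending). O(n^3) brute force, for checking only.
std::map<std::vector<int>, std::set<std::string>> endposClasses(const std::string& s) {
    std::map<std::string, std::vector<int>> endpos;
    const int n = static_cast<int>(s.size());
    for (int i = 0; i < n; ++i)
        for (int j = i; j < n; ++j)
            endpos[s.substr(i, j - i + 1)].push_back(j + 1);  // occurrence ends at j+1

    std::map<std::vector<int>, std::set<std::string>> classes;
    for (const auto& entry : endpos)
        classes[entry.second].insert(entry.first);
    return classes;
}
```

For `aabbabd` this yields nine classes, matching states 1 to 9: for example, the class with end positions {3,6} contains only `ab`, and the class with end positions {4} contains `aabb`, `abb` and `bb`.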
How to choose between the suffix array and the suffix automaton