Hints for solving the problem
Little Hi: This week's problem is: given a string s, count the number of distinct substrings of s. Little Ho, do you know how to solve it quickly?
Little Ho: We've been talking about suffix automata recently, so it must be about the suffix automaton (SAM)! From the basic concepts and properties of the SAM that we studied last week, each state st of the SAM contains a subset of the substrings of s, written substrings(st), and (1) for two different states u and v, substrings(u) ∩ substrings(v) = ∅; (2) each substring of s is contained in exactly one state. So all we have to do is construct the SAM for s; the sum of (maxlen(st) - minlen(st) + 1) over all states st other than the initial state is then the number of distinct substrings.
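To make the target quantity concrete, here is a minimal brute-force sketch that counts distinct substrings with a set. It is not the SAM approach discussed in this article, only a reference for checking small cases; the function name count_distinct_substrings_naive is made up for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Brute-force reference (hypothetical helper, not part of the SAM code):
// insert every substring into a set and count the distinct ones.
// There are quadratically many substrings, so this is only for short inputs.
long long count_distinct_substrings_naive(const std::string& s) {
    std::set<std::string> seen;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t len = 1; i + len <= s.size(); ++len)
            seen.insert(s.substr(i, len));
    return static_cast<long long>(seen.size());
}
```

For "aab" this gives 5 (a, aa, aab, ab, b). That agrees with the per-state sum: the SAM of "aab" has three non-initial states holding {a}, {aa} and {aab, ab, b}, which contribute 1 + 1 + 3.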
Little Hi: Yes. Last week we mentioned that the SAM can be constructed in O(length(s)) time. This week we'll talk about how to construct it.
Little Hi: First, to achieve O(length(s)) construction, we cannot store too much data for each state. For example, substrings(st) itself must not be stored. For a state st we only store the following data:
Data | Meaning
maxlen[st] | Length of the longest string contained in st
minlen[st] | Length of the shortest string contained in st
trans[st][] | Transition function of st
slink[st] | Suffix link of st
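As an illustration, the four fields could be grouped into one record per state. This is only a sketch: the struct name State and the helper substring_count are assumptions, and the 26-letter lowercase alphabet and the use of -1 for "does not exist" follow the code given later in the article.

```cpp
#include <array>

// Hypothetical per-state record; the field names mirror the table above.
struct State {
    int maxlen = 0;             // length of the longest string in the state
    int minlen = 0;             // length of the shortest string in the state
    std::array<int, 26> trans;  // trans[c]: target state on letter 'a' + c, -1 if none
    int slink = -1;             // suffix link, -1 if none

    State() { trans.fill(-1); }

    // A state holds exactly one string of each length in [minlen, maxlen].
    int substring_count() const { return maxlen - minlen + 1; }
};
```

Storing only these fixed-size fields keeps the memory per state constant, which is what the linear-time construction relies on.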
Little Hi: Secondly, we construct the SAM of s incrementally. Starting from the initial state, we add one character at a time, s[1], s[2], ..., s[n], and in turn build the SAMs that recognize s[1], s[1..2], s[1..3], ..., s[1..n] = s.
Little Hi: Suppose we have already constructed the SAM of s[1..i] and now want to add the character s[i+1]. This adds i+1 new suffixes that must be recognized: s[1..i+1], s[2..i+1], ..., s[i..i+1], s[i+1]. These new strings are obtained from s[1..i], s[2..i], s[3..i], ..., s[i], "" (the empty string) by a transition on the character s[i+1], so we have to add the corresponding transition to the states of s[1..i], s[2..i], s[3..i], ..., s[i], "" (the empty string).
Little Hi: Assume that s[1..i] belongs to state u, that is, s[1..i] ∈ substrings(u). From last week's discussion we know that the states of s[1..i], s[2..i], s[3..i], ..., s[i], "" (the empty string) are exactly the states on the path from u to the initial state S along suffix links. We call this path (the set of all states on it) suffix-path(u->S).
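Walking suffix-path(u->S) is just a loop over suffix links. The following is a minimal sketch; the function name suffix_path and the convention that the initial state is numbered 0 with slink equal to -1 are assumptions made for illustration.

```cpp
#include <vector>

// Follow suffix links from u up to the initial state and return the
// visited states in order. slink[v] == -1 means "no suffix link",
// which is assumed to hold only for the initial state.
std::vector<int> suffix_path(const std::vector<int>& slink, int u) {
    std::vector<int> path;
    for (int v = u; v != -1; v = slink[v])
        path.push_back(v);
    return path;
}
```

For the SAM of "aa" with states 0 (initial), 1 and 2, the suffix links are slink = {-1, 0, 1}, and suffix_path(slink, 2) returns 2, 1, 0, matching the suffixes "aa", "a" and the empty string.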
Little Hi: Obviously the substring s[1..i+1] at least cannot be recognized by the previous SAM, so we need to add at least one new state z, and z contains at least the substring s[1..i+1].
Little Hi: First consider the simplest case: every state v on suffix-path(u->S) has trans[v][s[i+1]] = NULL. Then we just set trans[v][s[i+1]] = z for each of them, and set slink[z] = S.
Little Hi: For example, suppose we already have the SAM of "aa" and now want to construct the SAM of "aab", as shown in the figure:
Little Hi: Here u = 2, z = 3, and suffix-path(u->S) is the path 2-1-S made of the orange states. None of these 3 states has a transition on the character b, so we just add the red transitions trans[2][b] = trans[1][b] = trans[S][b] = z. And of course, don't forget slink[3] = S.
Little Ho: What if there is a state v on suffix-path(u->S) with trans[v][s[i+1]] != NULL?
Little Hi: Good question. For example, suppose we have constructed the SAM of "aabb" and now add the character a to construct the SAM of "aabba".
Little Hi: Then u = 4, z = 6, and suffix-path(u->S) is the path 4-5-S made of the orange states. States 4 and 5 have no transition on the character a, so we just add the red transitions trans[4][a] = trans[5][a] = z = 6. But when we reach S we hit exactly the question you raised: trans[S][a] = 1 already exists. What now?
Little Ho: What do we do?
Little Hi: Without loss of generality, let v be the first state on suffix-path(u->S) with an existing transition, trans[v][s[i+1]] = x. We now need to look at the substrings contained in x. If the longest string in x is the longest string in v followed by s[i+1], which is equivalent to maxlen(v) + 1 = maxlen(x), then we are in a relatively simple situation. This is what happens in the example above: v = S, x = 1, longest(v) is the empty string, and longest(1) = "a" is longest(v) + 'a'. In this case we just need to set slink[z] = x.
Little Hi: If the longest string in x is not the longest string in v followed by s[i+1], which is equivalent to maxlen(v) + 1 < maxlen(x), we are in the most complex case. Without loss of generality, we use the following figure to represent it, where the added character is c and the new state is z.
Little Hi: On the path suffix-path(u->S), starting from u, there is a run of consecutive states u... that satisfy trans[u...][c] = NULL; for these states we only need to set trans[u...][c] = z. Right after them comes a run of consecutive states v..w that satisfy trans[v..w][c] = x, where longest(v) + c is not equal to longest(x). Here we need to split a new state y off from x: the strings of x whose length is at most maxlen(v) + 1 (that is, longest(v) + c and its suffixes in x) go to y, and the remaining strings stay in x. At the same time we set trans[v..w][c] = y, slink[y] = slink[x], and slink[x] = slink[z] = y.
Little Ho: That seems rather complicated.
Little Hi: Let's work through an example. Suppose we have already constructed the SAM of "aab" and now add the character b to construct the SAM of "aabb".
Little Hi: When we process the state S on suffix-path(u->S), we find trans[S][b] = 3, with longest(3) = "aab" and longest(S) + 'b' = "b"; the two are not equal. In fact, not being equal means that after the new character is added, endpos("aab") is no longer equal to endpos("b"), so these two substrings can no longer both belong to state 3. We therefore split a state 5 off from 3, move "b" and its suffixes to 5, and leave the remaining substrings in 3. At the same time we set trans[S][b] = 5, slink[5] = slink[3] = S (the old value of slink[3]), and then slink[3] = slink[4] = 5, where 4 is the new state z.
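The endpos claim in this example can be checked directly by brute force. The sketch below is only an illustration: the helper name endpos is an assumption, positions are 1-based as in the text, and the empty pattern is skipped for simplicity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the 1-based end positions of every occurrence of t in s.
// Hypothetical brute-force helper; the empty pattern yields an empty list.
std::vector<int> endpos(const std::string& s, const std::string& t) {
    std::vector<int> ends;
    if (t.empty() || t.size() > s.size()) return ends;
    for (std::size_t i = 0; i + t.size() <= s.size(); ++i)
        if (s.compare(i, t.size(), t) == 0)
            ends.push_back(static_cast<int>(i + t.size()));
    return ends;
}
```

In "aab", both endpos("aab") and endpos("b") are {3}, so the two strings share a state. In "aabb", endpos("aab") is still {3} but endpos("b") becomes {3, 4}, which is why "b" has to move to a new state.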
Little Hi: The code for the whole process is as follows. State 0 represents the initial state S; the states u, v, x, y, z have the meanings described above; -1 means that an slink or trans entry does not exist.
#include <cstddef>
#include <string>
using namespace std;

const int MAXL = 1000000;
string S;
int n = 0, len, st;
int maxlen[2 * MAXL + 10], minlen[2 * MAXL + 10], trans[2 * MAXL + 10][26], slink[2 * MAXL + 10];

// Create state n; copy the transitions from _trans, or set them all to -1 if _trans is NULL.
// The initial state 0 must be created once with new_state(0, 0, NULL, -1) before any add_char call.
int new_state(int _maxlen, int _minlen, int* _trans, int _slink) {
    maxlen[n] = _maxlen;
    minlen[n] = _minlen;
    for (int i = 0; i < 26; i++) {
        if (_trans == NULL)
            trans[n][i] = -1;
        else
            trans[n][i] = _trans[i];
    }
    slink[n] = _slink;
    return n++;
}

// Append ch to the string whose full prefix is held by state u; returns the new state z.
int add_char(char ch, int u) {
    int c = ch - 'a';
    int z = new_state(maxlen[u] + 1, -1, NULL, -1);
    int v = u;
    while (v != -1 && trans[v][c] == -1) {
        trans[v][c] = z;
        v = slink[v];
    }
    if (v == -1) {
        // Simplest case: no state on suffix-path(u->S) has a transition on ch
        minlen[z] = 1;
        slink[z] = 0;
        return z;
    }
    int x = trans[v][c];
    if (maxlen[v] + 1 == maxlen[x]) {
        // Simpler case: x does not need to be split
        minlen[z] = maxlen[x] + 1;
        slink[z] = x;
        return z;
    }
    // Most complex case: split y off from x
    int y = new_state(maxlen[v] + 1, -1, trans[x], slink[x]);
    slink[y] = slink[x];
    minlen[x] = maxlen[y] + 1;
    slink[x] = y;
    minlen[z] = maxlen[y] + 1;
    slink[z] = y;
    int w = v;
    while (w != -1 && trans[w][c] == x) {
        trans[w][c] = y;
        w = slink[w];
    }
    minlen[y] = maxlen[slink[y]] + 1;
    return z;
}
Little Ho: Huh? The program is surprisingly simple.
#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;

// The padding constant was lost in the source text; 10 is assumed here.
const int N = 1e6 + 10;
// States are numbered from 1 (state 1 is the initial state); 0 means "does not exist".
int tot, slink[2 * N], trans[2 * N][26], minlen[2 * N], maxlen[2 * N];
char str[N];

int newstate(int _maxlen, int _minlen, int* _trans, int _slink) {
    maxlen[++tot] = _maxlen;
    minlen[tot] = _minlen;
    slink[tot] = _slink;
    if (_trans)
        for (int i = 0; i < 26; i++)
            trans[tot][i] = _trans[i];
    return tot;
}

int add_char(char ch, int u) {
    int c = ch - 'a', v = u;
    int z = newstate(maxlen[u] + 1, -1, NULL, 0);
    while (v && !trans[v][c]) {
        trans[v][c] = z;
        v = slink[v];
    }
    if (!v) {
        // No state on the suffix path has a transition on ch
        minlen[z] = 1;
        slink[z] = 1;
        return z;
    }
    int x = trans[v][c];
    if (maxlen[v] + 1 == maxlen[x]) {
        // x does not need to be split
        slink[z] = x;
        minlen[z] = maxlen[x] + 1;
        return z;
    }
    // Split y off from x
    int y = newstate(maxlen[v] + 1, -1, trans[x], slink[x]);
    slink[z] = slink[x] = y;
    minlen[x] = minlen[z] = maxlen[y] + 1;
    while (v && trans[v][c] == x) {
        trans[v][c] = y;
        v = slink[v];
    }
    minlen[y] = maxlen[slink[y]] + 1;
    return z;
}

int main() {
    scanf("%s", str);
    int len = strlen(str), pre = 1;
    tot = 1;  // state 1 is the initial state, with maxlen 0 and no suffix link
    for (int i = 0; i < len; i++)
        pre = add_char(str[i], pre);
    long long ans = 0;
    // Sum over all states except the initial one
    for (int i = 2; i <= tot; i++)
        ans += maxlen[i] - minlen[i] + 1;
    cout << ans << endl;
    return 0;
}
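For checking small inputs it can be convenient to wrap the construction in a function that takes a string. The following is a sketch under assumptions, not the program above: the names Sam and count_distinct_substrings are made up, states are 0-based with -1 for "does not exist", and the input is assumed to contain only 'a'..'z'. It also does not store minlen, using the relation minlen[st] = maxlen[slink[st]] + 1 that the code above already relies on.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical vector-based variant; state 0 is the initial state.
struct Sam {
    std::vector<int> maxlen, slink;
    std::vector<std::array<int, 26>> trans;

    Sam() { new_state(0, -1); }

    int new_state(int len, int link) {
        std::array<int, 26> none;
        none.fill(-1);
        maxlen.push_back(len);
        slink.push_back(link);
        trans.push_back(none);
        return static_cast<int>(maxlen.size()) - 1;
    }

    // Append ch after state u (the state of the whole current string).
    int add_char(char ch, int u) {
        int c = ch - 'a';
        int z = new_state(maxlen[u] + 1, -1);
        int v = u;
        while (v != -1 && trans[v][c] == -1) {
            trans[v][c] = z;
            v = slink[v];
        }
        if (v == -1) { slink[z] = 0; return z; }        // simplest case
        int x = trans[v][c];
        if (maxlen[v] + 1 == maxlen[x]) { slink[z] = x; return z; }  // no split
        int y = new_state(maxlen[v] + 1, slink[x]);      // split y off from x
        trans[y] = trans[x];
        slink[x] = y;
        slink[z] = y;
        for (int w = v; w != -1 && trans[w][c] == x; w = slink[w])
            trans[w][c] = y;
        return z;
    }
};

long long count_distinct_substrings(const std::string& s) {
    Sam sam;
    int last = 0;
    for (char ch : s) last = sam.add_char(ch, last);
    long long total = 0;
    // maxlen - minlen + 1 equals maxlen[st] - maxlen[slink[st]]
    for (std::size_t st = 1; st < sam.maxlen.size(); ++st)
        total += sam.maxlen[st] - sam.maxlen[sam.slink[st]];
    return total;
}
```

With this wrapper, count_distinct_substrings("aab") gives 5 and count_distinct_substrings("aabb") gives 8, which can be compared against a hand count of the substrings.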