4
Hints for solving the problem:
Little Ho: How do we solve this problem?
Little Hi: Well, this time the problem is to find the contiguous substring with the most repetitions.
Little Ho: It doesn't look easy to get started.
Little Hi: Then let's lower the difficulty first and consider how to find the maximum number of repetitions of a whole string.
Little Ho: Hmm. Let me think. For example, the string abababab can be written as (1,8) or as (2,4), and the largest is (4,2). Here (k,l) means a block of length l repeated k times.
Little Hi: Right. If we enumerate a candidate period length l (or, equivalently, k), can we quickly decide whether that l is valid?
Little Ho: Ah! Let me think... For a length l that divides the string length, we can take the LCP (longest common prefix) of the original string and the string with its first l characters removed. If the shorter string is matched completely, then l is valid!
Little Hi: That's right. For example, to test whether abababab is (4,2), we take the LCP of abababab and ababab.
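The check described above can be sketched as follows. This is only a naive illustration, not the suffix-array version discussed next: `naiveLcp` and `isWholeStringPeriod` are hypothetical helper names, positions are 0-indexed, and the LCP is computed character by character.

```cpp
#include <cassert>
#include <string>

// Longest common prefix of s[a..] and s[b..] (0-indexed), computed naively.
int naiveLcp(const std::string& s, int a, int b) {
    assert(a >= 0 && b >= 0);
    const int n = static_cast<int>(s.size());
    int k = 0;
    while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
    return k;
}

// True if s consists of whole copies of its first l characters.
bool isWholeStringPeriod(const std::string& s, int l) {
    const int n = static_cast<int>(s.size());
    if (l <= 0 || l > n || n % l != 0) return false;
    // The copy shifted by l must match completely.
    return naiveLcp(s, 0, l) == n - l;
}
```

For abababab, `isWholeStringPeriod` accepts l = 2, 4 and 8 and rejects l = 1 and 3.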
Little Hi: It is worth mentioning that the height array lets us find the LCP we need quickly. For example, the height array for abababab is as follows:
suffix | SA | height
ab | 7 | 0
abab | 5 | 2
ababab | 3 | 4
abababab | 1 | 6
b | 8 | 0
bab | 6 | 1
babab | 4 | 3
bababab | 2 | 5
Little Hi: To get the LCP of two suffixes, we only need the minimum of the height values between them in the suffix array. For example, the LCP of abababab and ababab is the minimum of [6], namely 6; the LCP of bab and bababab is the minimum of [3, 5], namely 3; the LCP of ab and babab is the minimum of [2, 4, 6, 0, 1, 3], namely 0.
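As a small sketch of this rule, the code below builds the suffix array by plain sorting and answers an LCP query by scanning the height values between the two ranks. The names `SuffixInfo`, `buildNaive` and `lcpByHeight` are made up for illustration, indices are 0-indexed (so the SA values are one less than in the table), and the construction is far too slow for real inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct SuffixInfo {
    std::vector<int> sa;      // sa[r]: 0-indexed start of the suffix with rank r
    std::vector<int> rnk;     // rnk[i]: rank (0-indexed) of the suffix starting at i
    std::vector<int> height;  // height[r]: LCP of suffixes sa[r-1] and sa[r]; height[0] = 0
};

// Builds all three arrays naively; fine for short strings only.
SuffixInfo buildNaive(const std::string& s) {
    const int n = static_cast<int>(s.size());
    SuffixInfo info;
    info.sa.resize(n);
    info.rnk.resize(n);
    info.height.assign(n, 0);
    for (int i = 0; i < n; ++i) info.sa[i] = i;
    std::sort(info.sa.begin(), info.sa.end(), [&s](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    for (int r = 0; r < n; ++r) info.rnk[info.sa[r]] = r;
    for (int r = 1; r < n; ++r) {
        int a = info.sa[r - 1], b = info.sa[r], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        info.height[r] = k;
    }
    return info;
}

// LCP of the suffixes starting at i and j (i != j):
// the minimum of height over the ranks (lo, hi].
int lcpByHeight(const SuffixInfo& info, int i, int j) {
    assert(i != j);
    int lo = info.rnk[i], hi = info.rnk[j];
    if (lo > hi) std::swap(lo, hi);
    return *std::min_element(info.height.begin() + lo + 1,
                             info.height.begin() + hi + 1);
}
```

With `buildNaive("abababab")`, the three queries from the text are `lcpByHeight(info, 0, 2)`, `lcpByHeight(info, 5, 1)` and `lcpByHeight(info, 6, 3)`, which give 6, 3 and 0.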
Little Hi: Finding the minimum over a range of the height array is exactly the [RMQ problem] mentioned earlier. After O(N log N) preprocessing, each query takes O(1). A data structure such as a segment tree also works, at O(log N) per query.
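One common way to get O(N log N) preprocessing and O(1) queries is a sparse table. The class below is a minimal sketch under that assumption; `SparseTableMin` and its `query` method are illustrative names, and the range is 0-indexed and inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class SparseTableMin {
public:
    explicit SparseTableMin(const std::vector<int>& a) {
        const int n = static_cast<int>(a.size());
        lg_.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) lg_[i] = lg_[i / 2] + 1;  // floor(log2(i))
        table_.push_back(a);  // level 0: blocks of length 1
        for (int j = 1; (1 << j) <= n; ++j) {
            const int len = n - (1 << j) + 1;
            std::vector<int> cur(len);
            // A block of length 2^j is the min of two blocks of length 2^(j-1).
            for (int i = 0; i < len; ++i)
                cur[i] = std::min(table_[j - 1][i],
                                  table_[j - 1][i + (1 << (j - 1))]);
            table_.push_back(cur);
        }
    }

    // Minimum of a[l..r], inclusive, in O(1): two overlapping blocks cover the range.
    int query(int l, int r) const {
        assert(0 <= l && l <= r && r < static_cast<int>(table_[0].size()));
        const int j = lg_[r - l + 1];
        return std::min(table_[j][l], table_[j][r - (1 << j) + 1]);
    }

private:
    std::vector<int> lg_;
    std::vector<std::vector<int>> table_;
};
```

Built over the height column {0, 2, 4, 6, 0, 1, 3, 5}, `query(6, 7)` returns 3 and `query(1, 6)` returns 0, matching the two range minima in the example above.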
Little Ho: I get it. Back to the original problem: we first enumerate the l of (k,l), then enumerate the starting position i and compute the LCP of suffix(i) and suffix(i+l), written LCP(l,i). Then k(l,i) equals LCP(l,i)/l + 1, with the division rounded down. The maximum k(l,i) over all period lengths l and starting positions i is the answer.
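Written out directly, this enumeration looks like the sketch below. It uses a naive character-by-character LCP in place of the height-array query, so it is only suitable for checking small cases; `naiveLcp` and `maxRepeatBrute` are hypothetical names and positions are 0-indexed.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Longest common prefix of s[a..] and s[b..], computed naively.
int naiveLcp(const std::string& s, int a, int b) {
    assert(a >= 0 && b >= 0);
    const int n = static_cast<int>(s.size());
    int k = 0;
    while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
    return k;
}

// Maximum k over every period length l and every start i,
// using k(l, i) = LCP(l, i) / l + 1 with integer division.
int maxRepeatBrute(const std::string& s) {
    const int n = static_cast<int>(s.size());
    int best = n > 0 ? 1 : 0;
    for (int l = 1; l < n; ++l)
        for (int i = 0; i + l < n; ++i)
            best = std::max(best, naiveLcp(s, i, i + l) / l + 1);
    return best;
}
```

For abababab this returns 4, found at l = 2 and i = 0, where the LCP is 6.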
Little Hi: You're right! But there is still room for optimization. For a fixed l, we don't need to enumerate every starting position i; we only enumerate the i that are integer multiples of l. If the optimal substring happens to start at a multiple of l, then the largest k we find is the correct answer.
Little Ho: That's true. But what if the optimal substring does not start at a multiple of l?
Little Hi: Even then it's not a big problem. If the optimal substring starts at position x, we will certainly enumerate the nearest position p after x that is a multiple of l. There we compute LCP(l,p), the LCP of suffix(p) and suffix(p+l), which gives k(l,p) = LCP(l,p)/l + 1.
Little Hi: For the values we skipped, k(l,p-1), k(l,p-2), ..., k(l,p-l+1), the upper bound is k(l,p)+1.
Little Ho: Right. Their starting positions are less than l away from p, so they can gain at most one more period than suffix(p).
Little Hi: Second, if any of k(l,p-1), k(l,p-2), ..., k(l,p-l+1) equals k(l,p)+1, then k(l, p-l+LCP(l,p) mod l) must equal k(l,p)+1. (mod is the remainder operation.)
Little Ho: Why?
Little Hi: Take the string xaycdabcdabcd as an example (x and y each stand for an undetermined character; the actual characters affect the final answer, which we analyze below). When we consider l=4, the first starting position we enumerate is p=4. We compute LCP(4,4)=6 for cdabcdabcd and cdabcd, so k(4,4)=2. According to the claim above, k(4,1), k(4,2) and k(4,3) can contain a 3 only if k(l, p-l+LCP(l,p) mod l) = k(4, 4-4+6 mod 4) = k(4,2) = 3. First, k(4,3) can never equal 3: whatever character y is, LCP(4,3) of ycdabcdabcd and bcdabcd is at most 7, which is less than 8. Second, if k(4,2)≠3, then k(4,1) cannot be 3 either. k(4,2)≠3 means that ay and ab do not match, so whatever character x is, xay and dab do not match, LCP(4,1) < l, and k(4,1)=1.
Little Ho: Oh, I think I understand. Position p-l+LCP(l,p) mod l is a dividing line. The positions to its right cannot gain another period because their LCP is not large enough. And if k(l, p-l+LCP(l,p) mod l) does not gain a period, there is a mismatch somewhere in [p-l+LCP(l,p) mod l, p], so the LCPs of the positions to its left collapse as well and they cannot gain a period either.
Little Hi: Yes!
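The two checks can be tried out on the example with a short sketch. It is not the suffix-array implementation: the LCP is computed naively, positions are 0-indexed (so the anchors are 0, l, 2l, ... rather than the 1-indexed p used in the dialogue), and `naiveLcp` and `maxRepeatAnchored` are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Naive LCP of s[a..] and s[b..]; stands in for the O(1) height-array query.
int naiveLcp(const std::string& s, int a, int b) {
    assert(a >= 0 && b >= 0);
    const int n = static_cast<int>(s.size());
    int k = 0;
    while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
    return k;
}

int maxRepeatAnchored(const std::string& s) {
    const int n = static_cast<int>(s.size());
    int ans = n > 0 ? 1 : 0;
    for (int l = 1; l < n; ++l) {
        for (int p = 0; p + l < n; p += l) {      // anchors spaced l apart
            const int r = naiveLcp(s, p, p + l);
            ans = std::max(ans, r / l + 1);       // first check: start at the anchor
            const int q = p - l + r % l;          // second check: the dividing line
            if (q >= 0)
                ans = std::max(ans, naiveLcp(s, q, q + l) / l + 1);
        }
    }
    return ans;
}
```

With x and y left as the literal letters, `maxRepeatAnchored("xaycdabcdabcd")` returns 2; with y replaced by b, `maxRepeatAnchored("xabcdabcdabcd")` returns 3, and that 3 is found by the second check at l = 4.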
Little Ho: What is the time complexity of enumerating l and then enumerating the starting positions?
Little Hi: For a given l, enumerating the starting positions takes O(n/l), so the total is O(n/1) + O(n/2) + O(n/3) + ... This is the classic harmonic sum, so the total complexity is O(n log n).
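For reference, the sum can be bounded as follows, assuming each anchor costs a constant number of O(1) LCP queries; H_n denotes the n-th harmonic number.

```latex
\sum_{l=1}^{n} \left\lceil \frac{n}{l} \right\rceil
  \le n + n \sum_{l=1}^{n} \frac{1}{l}
  = n + n H_n
  \le n + n (1 + \ln n)
  = O(n \log n)
```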
Little Ho: I get it! It's amazing: such a simple-looking idea, and the complexity is so low.
Little Hi: Yes. Here is a C++ implementation of the two checks:
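for (int l = 1; l <= n; l++) {
    for (int i = 1; i + l <= n; i += l) {    // anchors spaced l apart, 1-indexed
        int r = LCP(i, i + l);               // LCP of suffix(i) and suffix(i+l)
        ans = max(ans, r / l + 1);           // first check: start exactly at i
        int q = i - l + r % l;               // second check: the dividing-line position
        if (q >= 1) {
            ans = max(ans, LCP(q, q + l) / l + 1);
        }
    }
}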
Little Ho: OK, I'll go implement it.
#include <algorithm>
#include <cstdio>
#include <cstring>

// No "using namespace std" here: the global array rank[] would clash with std::rank.
const int N = 2e5 + 5;

int cmp(int *r, int a, int b, int l) {
    return r[a] == r[b] && r[a + l] == r[b + l];
}

int wa[N], wb[N], wss[N], wv[N];
int rank[N];   // rank[i]: rank of suffix i in sa[]
int height[N]; // height[i]: LCP of suffixes sa[i] and sa[i-1]
int sa[N];     // sa[i]: start index of the suffix with rank i

// Doubling algorithm. Here n is the input length plus 1: a 0 is appended by hand
// so that cmp never reads out of bounds. m is an upper bound on the values in r.
void DA(int *r, int *sa, int n, int m) {
    int i, j, p, *x = wa, *y = wb, *t;
    for (i = 0; i < m; i++) wss[i] = 0;
    for (i = 0; i < n; i++) wss[x[i] = r[i]]++;
    for (i = 1; i < m; i++) wss[i] += wss[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--wss[x[i]]] = i; // sort by the first character
    // sa is known for prefix length j; compute it for length 2*j
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        for (p = 0, i = n - j; i < n; i++) y[p++] = i; // suffixes with no second key
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j; // order by second key
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        for (i = 0; i < m; i++) wss[i] = 0;
        for (i = 0; i < n; i++) wss[wv[i]]++;
        for (i = 1; i < m; i++) wss[i] += wss[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--wss[wv[i]]] = y[i]; // radix sort
        // update the rank array x[]; equal pairs share the same rank
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
}

// Here n is the real length. Valid height[] indices are 1..n; rank 0 is the appended 0.
void calheight(int *r, int n) {
    int i, j, k = 0;
    for (i = 1; i <= n; i++) rank[sa[i]] = i;
    // uses h[i] >= h[i-1] - 1, where h[i] = height[rank[i]]
    for (i = 0; i < n; height[rank[i++]] = k)
        for (k ? k-- : 0, j = sa[rank[i] - 1]; r[i + k] == r[j + k]; k++);
}

int n;
char ss[N];
int aa[N];
int mn[N][20]; // the second dimension must exceed log2(N); 20 is an assumed value
int Log22[N];

void pre() {
    Log22[1] = 0;
    for (int i = 2; i <= n; i++) Log22[i] = Log22[i / 2] + 1; // floor(log2(i))
}

void rmq_init(int n, int *h) {
    for (int j = 1; j <= n; j++) mn[j][0] = h[j];
    int m = Log22[n];
    for (int i = 1; i <= m; i++)
        for (int j = n; j > 0; j--) {
            mn[j][i] = mn[j][i - 1];
            if (j + (1 << (i - 1)) <= n)
                mn[j][i] = std::min(mn[j][i], mn[j + (1 << (i - 1))][i - 1]);
        }
}

// LCP of the suffixes with ranks l and r (l != r)
int lcp_min(int l, int r) {
    if (l > r) std::swap(l, r);
    l++; // by the definition of height, the range is height[l+1..r]
    int m = Log22[r - l + 1];
    return std::min(mn[l][m], mn[r - (1 << m) + 1][m]);
}

int solve() {
    int ans = 1;
    for (int L = 1; L <= n; L++)
        for (int j = 0; j + L < n; j += L) { // keep j + L inside the string
            int lcp_len = lcp_min(rank[j], rank[j + L]);
            ans = std::max(ans, lcp_len / L + 1);
            int last_possible_pos = j - (L - lcp_len % L);
            if (last_possible_pos >= 0)
                ans = std::max(ans, 1 + lcp_min(rank[last_possible_pos], rank[last_possible_pos + L]) / L);
        }
    return ans;
}

int main() {
    scanf("%s", ss);
    n = strlen(ss);
    for (int i = 0; i < n; i++) aa[i] = ss[i] - 'a' + 1;
    aa[n] = 0;
    DA(aa, sa, n + 1, 30); // values are 0..26, so any bound above 26 works; 30 is an assumed value
    calheight(aa, n);
    pre();
    rmq_init(n, height);
    printf("%d\n", solve());
    return 0;
}