Codeforces Round #246 (Div. 2) D. Prefixes and Suffixes (suffix array or KMP)

Last Update:2014-05-19 Source: Internet

Author: User



D. Prefixes and Suffixes

time limit per test: 1 second
memory limit per test: 256 megabytes
input: standard input
output: standard output

You have a string $s = s_1 s_2 \ldots s_{|s|}$, where $|s|$ is the length of string $s$, and $s_i$ is its $i$-th character.

Let's introduce several definitions:

A substring $s[i..j]$ ($1 \le i \le j \le |s|$) of string $s$ is the string $s_i s_{i+1} \ldots s_j$.

The prefix of string $s$ of length $l$ ($1 \le l \le |s|$) is the string $s[1..l]$.

The suffix of string $s$ of length $l$ ($1 \le l \le |s|$) is the string $s[|s|-l+1..|s|]$.

Your task is, for any prefix of string $s$ which matches a suffix of string $s$, to print the number of times it occurs in string $s$ as a substring.

Input

The single line contains a sequence of characters $s_1 s_2 \ldots s_{|s|}$ ($1 \le |s| \le 10^5$), the string $s$. The string only consists of uppercase English letters.

Output

In the first line, print integer $k$ ($0 \le k \le |s|$), the number of prefixes that match a suffix of string $s$. Next print $k$ lines, in each line print two integers $l_i$ $c_i$. Numbers $l_i$ $c_i$ mean that the prefix of length $l_i$ matches the suffix of length $l_i$ and occurs in string $s$ as a substring $c_i$ times. Print pairs $l_i$ $c_i$ in the order of increasing $l_i$.

Sample test(s)

Input

ABACABA

Output

3
1 4
3 2
7 1

Input

AAA

Output

3
1 3
2 2
3 1

Problem summary:

Given a string of at most 10^5 characters, output every length at which the prefix exactly matches the suffix of the same length, together with the number of times that prefix occurs in the whole string. Occurrences may overlap.
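Before looking at the fast solutions, the definition itself can be checked directly. The following is only a minimal brute-force sketch for tiny inputs; the function name `bordersBrute` is made up for illustration and is not part of the original solutions. It is far too slow for |s| = 10^5, but it is handy for validating a faster program against the samples.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Brute-force reference (hypothetical helper, roughly O(n^3) worst case):
// for every length whose prefix equals the suffix of the same length,
// count the occurrences of that prefix by direct comparison.
std::vector<std::pair<int, int>> bordersBrute(const std::string& s) {
    assert(!s.empty());  // the problem guarantees |s| >= 1
    const int n = static_cast<int>(s.size());
    std::vector<std::pair<int, int>> result;
    for (int len = 1; len <= n; ++len) {
        // Skip lengths where the prefix differs from the suffix.
        if (s.compare(0, len, s, n - len, len) != 0) continue;
        int occurrences = 0;
        for (int start = 0; start + len <= n; ++start) {
            // Overlapping occurrences are counted, as the problem requires.
            if (s.compare(start, len, s, 0, len) == 0) ++occurrences;
        }
        result.emplace_back(len, occurrences);
    }
    return result;
}
```

Calling `bordersBrute("ABACABA")` yields the pairs (1, 4), (3, 2), (7, 1), matching the first sample.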

Ideas:

When I saw prefixes and suffixes during the contest, I was overjoyed. I had just learned suffix arrays, and this looked like the perfect place to use one. After some thought the algorithm took shape. Suffix 0 is the entire string, so a suffix is a target suffix exactly when its longest common prefix with suffix 0 is the suffix's full length.

Next, determine the number of occurrences. Once a suffix is known to be a target suffix, we know its rank. Every suffix that starts with the target must rank after it, by the ordering rule: if suffix B is a prefix of suffix A, then B, being the shorter one, sorts in front of A. So the remaining work is to find how far down from the target's rank the range can be extended, which the height array determines. This needs binary search plus RMQ: binary search finds the last rank, and the RMQ over the height array checks whether the minimum height over the range is still at least the target's length.

Although the train of thought was correct, my code stayed wrong until the end of the contest. Only while debugging afterwards did I realize I had not understood suffix arrays deeply enough. The problem was why the doubling algorithm requires txt[n-1] = 0: the height computation uses j = sa[rank[i]-1], which breaks when rank[i] = 0. Appending a 0 to the end of the original string solves this, because the appended terminator is smaller than every letter and takes rank 0. For details, see the code:

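    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    using namespace std;

    const int maxn = 150010;
    char txt[maxn];
    int sa[maxn], T1[maxn], T2[maxn], ct[maxn], he[maxn], rk[maxn], n, m;
    // sa[i] is the starting position of the suffix with rank i
    int rmq[25][maxn], lg[maxn], ansn[maxn], ansp[maxn], ptr;

    // Doubling algorithm; m is the size of the character range (ASCII codes).
    // Loop bounds that are cut off in the source are assumed to follow the standard template.
    void getsa(char *st)
    {
        int i, k, p, *x = T1, *y = T2;
        for (i = 0; i < m; i++) ct[i] = 0;
        for (i = 0; i < n; i++) ct[x[i] = st[i]]++;
        for (i = 1; i < m; i++) ct[i] += ct[i - 1];
        for (i = n - 1; i >= 0; i--) // reverse enumeration keeps the sort stable
            sa[--ct[x[i]]] = i;
        for (k = 1, p = 1; p < n; k <<= 1, m = p)
        {
            for (p = 0, i = n - k; i < n; i++) y[p++] = i;
            for (i = 0; i < n; i++)
                if (sa[i] >= k) y[p++] = sa[i] - k; // sort by the second key; y[i] is the start of the suffix ranked i by it
            for (i = 0; i < m; i++) ct[i] = 0;
            for (i = 0; i < n; i++) ct[x[y[i]]]++;
            for (i = 1; i < m; i++) ct[i] += ct[i - 1];
            for (i = n - 1; i >= 0; i--) sa[--ct[x[y[i]]]] = y[i]; // sort by the first key
            for (swap(x, y), p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
                x[sa[i]] = y[sa[i - 1]] == y[sa[i]] && y[sa[i - 1] + k] == y[sa[i] + k] ? p - 1 : p++;
        }
    }

    // Assumed: standard height computation; he[r] = LCP of the suffixes ranked r-1 and r.
    void gethe(char *st)
    {
        int i, j, k = 0;
        for (i = 0; i < n; i++) rk[sa[i]] = i;
        for (i = 0; i < n - 1; i++) // the terminator suffix has rank 0, so rk[i] >= 1 here
        {
            if (k) k--;
            j = sa[rk[i] - 1];
            while (st[i + k] == st[j + k]) k++;
            he[rk[i]] = k;
        }
    }

    // Assumed: standard sparse table over he[].
    void rmq_init()
    {
        int i, j;
        for (i = 0; i < n; i++) rmq[0][i] = he[i];
        for (j = 1; (1 << j) <= n; j++)
            for (i = 0; i + (1 << j) <= n; i++)
                rmq[j][i] = min(rmq[j - 1][i], rmq[j - 1][i + (1 << (j - 1))]);
    }

    int rmq_min(int l, int r) // minimum of he[l..r]
    {
        int k = lg[r - l + 1];
        return min(rmq[k][l], rmq[k][r - (1 << k) + 1]);
    }

    void prermq()
    {
        lg[0] = -1;
        for (int i = 1; i < maxn; i++) lg[i] = lg[i >> 1] + 1;
    }

    void solve()
    {
        int low, hi, mid, p, pos, ans, tp, i;
        getsa(txt), gethe(txt), rmq_init();
        ptr = 0, pos = rk[0];
        for (i = n - 2; i > 0; i--)
        {
            // Condition partly assumed: suffix i ranks before suffix 0 and matches it over its full length.
            if (rk[i] < pos && (p = rmq_min(rk[i] + 1, pos)) >= n - i - 1)
            {
                ansp[ptr] = p, tp = rk[i] + 1;
                low = rk[i] + 1, hi = n - 1, ans = -1;
                while (low <= hi)
                {
                    mid = (low + hi) >> 1;
                    if (rmq_min(tp, mid) >= p) ans = mid, low = mid + 1;
                    else hi = mid - 1;
                }
                ansn[ptr++] = ans - rk[i] + 1;
            }
        }
    }

    int main()
    {
        int i;
        prermq();
        while (~scanf("%s", txt))
        {
            m = 150, n = strlen(txt);
            n++; // count the terminating 0 as part of the string
            solve();
            ansp[ptr] = n - 1; // the whole string always matches itself once
            ansn[ptr++] = 1;
            printf("%d\n", ptr);
            for (i = 0; i < ptr; i++) printf("%d %d\n", ansp[i], ansn[i]);
        }
        return 0;
    }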
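The key observation in the suffix array approach, that the suffixes starting with a given border sit in one contiguous block right after the border's own rank, can be seen in a much smaller sketch. The version below is an illustration under simplifying assumptions, not the contest solution: it builds the suffix array by naive sorting, checks each candidate by direct comparison, and scans the block linearly instead of using the doubling algorithm, the height array, RMQ and binary search. The name `bordersBySuffixOrder` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Sketch only: quadratic or worse, meant to show the rank-block idea on small inputs.
std::vector<std::pair<int, int>> bordersBySuffixOrder(const std::string& s) {
    assert(!s.empty());  // the problem guarantees |s| >= 1
    const int n = static_cast<int>(s.size());

    // Naive suffix array: sort start positions by comparing whole suffixes.
    std::vector<int> sa(n), rank(n);
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    for (int r = 0; r < n; ++r) rank[sa[r]] = r;

    std::vector<std::pair<int, int>> result;
    for (int i = n - 1; i >= 0; --i) {  // suffix lengths in increasing order
        const int len = n - i;
        if (s.compare(0, len, s, i, len) != 0) continue;  // suffix i is not a prefix
        // Suffix i is exactly the pattern, so it is the smallest suffix that
        // starts with the pattern; all others follow it directly in rank order.
        int r = rank[i];
        int count = 0;
        while (r < n && s.compare(sa[r], len, s, i, len) == 0) {
            ++count;
            ++r;
        }
        result.emplace_back(len, count);
    }
    return result;
}
```

For "ABACABA" the sorted suffixes are A, ABA, ABACABA, ACABA, BA, BACABA, CABA. The border "A" has rank 0 and the block of suffixes starting with "A" covers the first four ranks, which gives the count 4. The contest code finds the end of that block with binary search over range minimums of the height array rather than a linear scan.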
         
          
Next comes the second approach. After the contest I still could not get the first solution to work, so I went to the group chat and asked. The result was that qijie looked down on the question, threw out a single word, "KMP", and left. Thinking it over, my IQ had been deeply put to shame.

KMP can easily compute the number of times each prefix appears in the original string. The method is to build the failure (mismatch) array for the string, then match the string against itself. If position i matches up to position j, then prefix j occurs ending at position i. We use cnt[i] to record the number of times prefix i appears. Finally, accumulate cnt[next[i]] += cnt[i]. This is easy to understand: if prefix j occurs ending at position i, then prefix next[j] certainly occurs ending at position i as well. That gives the number of occurrences of every prefix in the original string.

Now we need to find the prefixes that match a suffix. This is simple, because building the failure array already matches the front of the string against the rest of itself. So we only need to follow the failure links from the position just past the end of the string (position n+1) to find all prefixes that match a suffix. The result is a gorgeous O(n) solution.

For details, see the code:
          
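    #include <cstdio>
    #include <cstring>
    using namespace std;

    const int maxn = 150010;
    char txt[maxn];
    int f[maxn], cnt[maxn], ansp[maxn], ansn[maxn], ct, n;

    void getf(char *p) // f[i] is the position to compare next after a mismatch at position i
    {
        int i, j;
        f[0] = f[1] = 0;
        for (i = 1; i < n; i++)
        {
            // Loop body assumed: the standard failure-function recurrence.
            j = f[i];
            while (j && p[i] != p[j]) j = f[j];
            f[i + 1] = p[i] == p[j] ? j + 1 : 0;
        }
    }

    void KMP()
    {
        int i, j, t;
        // Assumed: match txt against itself; cnt[j-1] counts where prefix j ends.
        for (i = 0, j = 0; i < n; i++)
        {
            while (j && txt[i] != txt[j]) j = f[j];
            if (txt[i] == txt[j]) j++;
            if (j) cnt[j - 1]++;
        }
        t = j, ct = 0; // t == n here, since the whole string matches itself
        // Why can this be done? Occurrences with different end points are different,
        // and KMP guarantees the end points differ.
        for (j = n; j > 0; j--)
            if (f[j]) // f[j] is the position of the next comparison, so prefix f[j] ends at index f[j]-1
                cnt[f[j] - 1] += cnt[j - 1];
        while (t) // prefixes that match a suffix
        {
            ansp[ct] = t;
            ansn[ct++] = cnt[t - 1];
            t = f[t];
        }
        printf("%d\n", ct);
        for (i = ct - 1; i >= 0; i--) printf("%d %d\n", ansp[i], ansn[i]);
    }

    int main()
    {
        while (~scanf("%s", txt))
        {
            n = strlen(txt);
            memset(cnt, 0, sizeof cnt);
            getf(txt);
            KMP();
        }
    }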
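The counting step of the KMP approach can also be packaged as a function that returns the pairs instead of printing them, which makes it easy to check against the samples. This is a sketch under my own assumptions rather than the code above: the name `bordersKmp` is hypothetical, lengths are used as indices directly, and every prefix count starts at 1, because matching the string against itself from the start finds each prefix of length len ending at position len exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Sketch: failure array plus count propagation, O(n) overall.
std::vector<std::pair<int, int>> bordersKmp(const std::string& s) {
    assert(!s.empty());  // the problem guarantees |s| >= 1
    const int n = static_cast<int>(s.size());

    // fail[len] = length of the longest proper border of the prefix of length len.
    std::vector<int> fail(n + 1, 0);
    for (int i = 1; i < n; ++i) {
        int j = fail[i];
        while (j > 0 && s[i] != s[j]) j = fail[j];
        fail[i + 1] = (s[i] == s[j]) ? j + 1 : 0;
    }

    // cnt[len] starts at 1 (the prefix itself). Processing lengths from long to
    // short, each occurrence of prefix len is also an occurrence of prefix fail[len].
    std::vector<int> cnt(n + 1, 1);
    for (int len = n; len >= 1; --len) cnt[fail[len]] += cnt[len];

    // Walk the failure links from the full length: these are exactly the
    // prefixes that match a suffix, from longest to shortest.
    std::vector<std::pair<int, int>> result;
    for (int len = n; len > 0; len = fail[len]) result.emplace_back(len, cnt[len]);
    std::reverse(result.begin(), result.end());  // increasing length, as required
    return result;
}
```

For "ABACABA" the failure values for lengths 1 to 7 are 0, 0, 1, 0, 1, 2, 3. Propagation adds the count of length 7 to length 3, and the counts of lengths 5 and 3 to length 1, giving (1, 4), (3, 2), (7, 1).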

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
