Codeforces Round #246 (Div. 2) D. Prefixes and Suffixes (suffix array or KMP)

Source: Internet
Author: User
Tags acos

D. Prefixes and Suffixes

time limit per test: 1 second
memory limit per test: 256 megabytes
input: standard input
output: standard output

You have a string s = s1s2...s|s|, where |s| is the length of string s, and si is its i-th character.

Let's introduce several definitions:

  • A substring s[i..j] (1 ≤ i ≤ j ≤ |s|) of string s is the string si si+1 ... sj.
  • The prefix of string s of length l (1 ≤ l ≤ |s|) is the string s[1..l].
  • The suffix of string s of length l (1 ≤ l ≤ |s|) is the string s[|s| - l + 1..|s|].

    Your task is, for any prefix of string s which matches a suffix of string s, print the number of times it occurs in string s as a substring.

    Input

    The single line contains a sequence of characters s1s2...s|s| (1 ≤ |s| ≤ 10^5), the string s. The string consists only of uppercase English letters.

    Output

    In the first line, print integer k (0 ≤ k ≤ |s|), the number of prefixes that match a suffix of string s. Next print k lines, in each line print two integers li ci. Numbers li ci mean that the prefix of length li matches the suffix of length li and occurs in string s as a substring ci times. Print pairs li ci in the order of increasing li.

    Sample test(s)

    Input
    ABACABA
    Output
    3
    1 4
    3 2
    7 1

    Input
    AAA
    Output
    3
    1 3
    2 2
    3 1

    Question:

    Given a string of at most 10^5 characters, output every length at which the prefix of that length is exactly equal to the suffix of the same length, together with the number of times that prefix appears in the entire string. Occurrences may overlap.
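To make the task concrete, here is a naive reference sketch. It is not part of the original write-up, and the function name `bordersNaive` is a hypothetical one chosen for illustration. It checks every length directly and counts occurrences by trying every start position, so it is far too slow for 10^5 characters, but it is handy for checking a faster solution on small inputs.

```cpp
#include <string>
#include <utility>
#include <vector>

// Naive reference (hypothetical helper, for small inputs only).
// For every length len whose prefix equals the suffix of the same length,
// count the occurrences of that prefix, overlapping ones included.
// Returns (len, count) pairs in increasing order of len.
std::vector<std::pair<int, int>> bordersNaive(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<std::pair<int, int>> result;
    for (int len = 1; len <= n; ++len) {
        // Does the prefix of length len equal the suffix of length len?
        if (s.compare(0, len, s, n - len, len) != 0) continue;
        int count = 0;
        // Try every start position, so overlapping occurrences are counted.
        for (int start = 0; start + len <= n; ++start) {
            if (s.compare(start, len, s, 0, len) == 0) ++count;
        }
        result.emplace_back(len, count);
    }
    return result;
}
```

For the first sample, `bordersNaive("ABACABA")` yields the pairs (1, 4), (3, 2), (7, 1), which matches the expected output.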

    Ideas:

    When I saw prefixes and suffixes during the contest, I was overjoyed. I had just learned suffix arrays, and this looked like the perfect place to use them. After some thought the algorithm took shape. Suffix 0 is the entire string, so a suffix is a target (it matches a prefix) exactly when its longest common prefix with suffix 0 is the whole suffix. What remains is counting occurrences. Once a suffix is known to be a target, we know its rank. Every suffix that starts with the target must be ranked after it: by the ranking rule, if suffix B is a prefix of suffix A, then B, being shorter, comes first. So the rest of the work is to find how far down the ranking we can extend from the target, which is determined by the height array. This needs binary search plus RMQ: the binary search fixes the position, and the RMQ checks whether the condition still holds.

    Although the idea was correct, my code kept failing until the end of the contest. Only while debugging afterwards did I realize that I did not understand suffix arrays deeply enough. The issue is why the doubling algorithm requires txt[n-1] = 0: the height computation takes j = sa[rank[i]-1], which breaks when rank[i] = 0. Appending a 0 to the end of the original string solves this, because that extra suffix is smaller than every other one and takes rank 0. For details, see the code:

    #include /* [...] header names missing in the source */
    using namespace std;

    const int INF = 0x3f3f3f3f;
    const double eps = 1e-8;
    const double PI = acos(-1.0);
    const int maxn = 150010;

    char txt[maxn];
    int sa[maxn], T1[maxn], T2[maxn], ct[maxn], he[maxn], rk[maxn], ans, n, m;
    // sa[i] is the starting position of the suffix with rank i.
    int rmq[25][maxn], lg[maxn], ansn[maxn], ansp[maxn], ptr;

    void getsa(char *st) // note that m bounds the range of the ASCII codes
    {
        int i, k, p, *x = T1, *y = T2;
        for (i = 0; i /* [...] */
        /* [...] */ i >= 0; i--) // reverse enumeration keeps the relative order stable
            sa[--ct[x[i]]] = i;
        for (k = 1, p = 1; p /* [...] */
            /* [...] */ >= k) y[p++] = sa[i] - k;
        // Sort by the second keyword: y[i] is the start of the suffix
        // whose second keyword has rank i.
        for (i = 0; i /* [...] */
        /* [...] */ i >= 0; i--)
            sa[--ct[x[y[i]]]] = y[i]; // sort by the first keyword
        for (swap(x, y), p = 1, x[sa[0]] = 0, i = 1; i /* [...] */

    /* [...] */ >> 1] + 1;
    }

    void solve()
    {
        int low, hi, mid, p, pos, a, b, ans, tp, i;
        getsa(txt), gethe(txt), rmq_init();
        ptr = 0, pos = rk[0];
        for (i = n - 2; i > 0; i--)
        {
            if (rk[i] /* [...] */ >= n - i - 1)
            {
                ansp[ptr] = p, tp = rk[i] + 1;
                low = rk[i] + 1, hi = n - 1, ans = -1;
                while (low <= hi) // binary search for the last rank that still matches
                {
                    mid = (low + hi) >> 1;
                    if (rmq_min(tp, mid) >= p)
                        ans = mid, low = mid + 1;
                    else
                        hi = mid - 1;
                }
                ansn[ptr++] = ans - rk[i] + 1;
            }
        }
    }

    int main()
    {
        int i;
        prermq();
        while (~scanf("%s", txt))
        {
            m = 150, n = strlen(txt);
            n++; // include the terminating 0, so that txt[n-1] = 0
            solve();
            ansp[ptr] = n - 1;
            ansn[ptr++] = 1;
            printf("%d\n", ptr);
            for (i = 0; i /* [...] */
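A minimal self-contained sketch of the suffix-array idea described above. It is not the author's original listing. It assumes a simpler O(n log^2 n) construction that sorts with `std::sort` instead of the radix-sort doubling template, and it handles rank 0 explicitly rather than appending a 0 sentinel. The names `buildSuffixArray`, `buildHeight` and `bordersSuffixArray` are hypothetical.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Suffix array by prefix doubling with std::sort (assumed variant, O(n log^2 n)).
std::vector<int> buildSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rk(n), tmp(n);
    if (n == 0) return sa;
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rk[i] = static_cast<unsigned char>(s[i]);
    for (int k = 1;; k <<= 1) {
        // Sort key: rank of the first k characters, then of the next k.
        auto key = [&](int i) { return std::make_pair(rk[i], i + k < n ? rk[i + k] : -1); };
        std::sort(sa.begin(), sa.end(), [&](int a, int b) { return key(a) < key(b); });
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (key(sa[i - 1]) < key(sa[i]) ? 1 : 0);
        rk = tmp;
        if (rk[sa[n - 1]] == n - 1 || k >= n) break;  // all ranks are distinct
    }
    return sa;
}

// height[r] = LCP of the suffixes ranked r-1 and r (Kasai's algorithm); height[0] = 0.
std::vector<int> buildHeight(const std::string& s, const std::vector<int>& sa,
                             const std::vector<int>& rk) {
    const int n = static_cast<int>(s.size());
    std::vector<int> height(n, 0);
    for (int i = 0, h = 0; i < n; ++i) {
        if (rk[i] == 0) { h = 0; continue; }  // no predecessor in rank order
        const int j = sa[rk[i] - 1];
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
        height[rk[i]] = h;
        if (h > 0) --h;
    }
    return height;
}

// Returns (length, occurrences) for every prefix that equals a suffix, by increasing length.
std::vector<std::pair<int, int>> bordersSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<std::pair<int, int>> result;
    if (n == 0) return result;
    const std::vector<int> sa = buildSuffixArray(s);
    std::vector<int> rk(n);
    for (int r = 0; r < n; ++r) rk[sa[r]] = r;
    const std::vector<int> height = buildHeight(s, sa, rk);

    // Sparse table over height for range-minimum queries.
    std::vector<std::vector<int>> table(1, height);
    for (int j = 1; (1 << j) <= n; ++j) {
        const int half = 1 << (j - 1);
        table.push_back(std::vector<int>(n - 2 * half + 1));
        for (int i = 0; i + 2 * half <= n; ++i)
            table[j][i] = std::min(table[j - 1][i], table[j - 1][i + half]);
    }
    auto rangeMin = [&](int l, int r) {  // inclusive range, requires l <= r
        int j = 0;
        while ((1 << (j + 1)) <= r - l + 1) ++j;
        return std::min(table[j][l], table[j][r - (1 << j) + 1]);
    };

    const int whole = rk[0];  // rank of the entire string
    for (int i = n - 1; i >= 1; --i) {
        const int len = n - i, r = rk[i];
        // Suffix i equals a prefix iff it ranks before suffix 0 and their LCP is len.
        if (r > whole || rangeMin(r + 1, whole) < len) continue;
        int lo = r + 1, hi = n - 1, last = r;
        while (lo <= hi) {  // last rank whose suffix still starts with suffix i
            const int mid = (lo + hi) / 2;
            if (rangeMin(r + 1, mid) >= len) { last = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        result.emplace_back(len, last - r + 1);
    }
    result.emplace_back(n, 1);  // the whole string always matches itself once
    return result;
}
```

Calling `bordersSuffixArray("ABACABA")` returns the pairs (1, 4), (3, 2), (7, 1). The binary search relies on the fact that the minimum of height over a growing range can only decrease, so the set of ranks that still share the target as a prefix is one contiguous block starting at the target.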
             
              
    Next is the second approach. After the contest I still could not get the first idea to work, so I asked in the group chat. Qijie looked down on me, tossed out one word, "KMP", and left. Thinking it over, my IQ deserved the contempt. KMP easily computes how many times each prefix occurs in the original string. The method is to build the failure array for the string and then match the string against itself. If position i matches position j, then prefix j occurs ending at position i. We use cnt[i] to record the number of times prefix i occurs. Finally, accumulate cnt[next[i]] += cnt[i], going from the longest prefix to the shortest. This is easy to understand: if prefix j occurs ending at position i, then prefix next[j] certainly also occurs ending at position i. That gives the number of occurrences of every prefix in the original string. What remains is to find the prefixes that match a suffix, which is simple: matching the string against itself is exactly matching its front part against its remaining parts, so we only need to follow the failure links from position n + 1, just past the last character, to find all prefixes that match a suffix. And with that, a gorgeous O(n) solution is done.

    For details, see the code:

    #include /* [...] header names missing in the source */
    using namespace std;

    const int INF = 0x3f3f3f3f;
    const double eps = 1e-8;
    const double PI = acos(-1.0);
    const int maxn = 150010;

    char txt[maxn];
    int f[maxn], cnt[maxn], ansp[maxn], ansn[maxn], ct, n;

    void getf(char *p)
    {
        int i, j;
        f[0] = f[1] = 0;
        for (i = 1; i /* [...] */

        /* [...] */ j > 0; j--)
            // Why does this work? Occurrences with different end points are
            // different occurrences, and KMP guarantees distinct end points.
            if (f[j]) // f[j] is the next position to compare, so everything up to f[j]-1 already matches
                cnt[f[j] - 1] += cnt[j - 1];
        while (t) // prefixes that match a suffix
        {
            ansp[ct] = t;
            ansn[ct++] = cnt[t - 1];
            t = f[t];
        }
        printf("%d\n", ct);
        for (i = ct - 1; i >= 0; i--)
            printf("%d %d\n", ansp[i], ansn[i]);
    }

    int main()
    {
        while (~scanf("%s", txt))
        {
            n = strlen(txt);
            memset(cnt, 0, sizeof cnt);
            getf(txt);
            KMP();
        }
    }
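A minimal self-contained sketch of the KMP approach described above. It is not the author's original listing. It assumes `cnt` is indexed by prefix length rather than by end position, and it initializes every count to 1 instead of running an explicit self-match, since each prefix trivially occurs once at the start of the string. The name `bordersKmp` is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Returns (length, occurrences) for every prefix that equals a suffix, by increasing length.
std::vector<std::pair<int, int>> bordersKmp(const std::string& s) {
    const int n = static_cast<int>(s.size());

    // fail[i] = length of the longest proper prefix of s[0..i-1] that is also its suffix.
    std::vector<int> fail(n + 1, 0);
    for (int i = 1; i < n; ++i) {
        int j = fail[i];
        while (j > 0 && s[i] != s[j]) j = fail[j];
        fail[i + 1] = (s[i] == s[j]) ? j + 1 : 0;
    }

    // cnt[len] = occurrences of the prefix of length len.
    // Each prefix occurs once on its own; longer prefixes then pass their
    // counts down the failure links, processed from longest to shortest.
    std::vector<int> cnt(n + 1, 1);
    cnt[0] = 0;
    for (int len = n; len >= 1; --len) cnt[fail[len]] += cnt[len];

    // Walk the failure links from the full length: these are exactly the
    // prefixes that also appear as suffixes.
    std::vector<std::pair<int, int>> result;
    for (int len = n; len > 0; len = fail[len]) result.emplace_back(len, cnt[len]);
    std::reverse(result.begin(), result.end());
    return result;
}
```

For `"ABACABA"` the failure values for lengths 1 to 7 are 0, 0, 1, 0, 1, 2, 3, so the chain from length 7 visits 7, 3, 1, and the function returns (1, 4), (3, 2), (7, 1). Everything runs in O(n).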
                
               

