On string algorithms (KMP algorithm and Manacher algorithm)

Source: Internet
Author: User

This series covers: "String algorithm 1", string hashing (elegant brute force); "String algorithm 2", the Manacher algorithm; "String algorithm 3", the KMP algorithm. This post covers string algorithm 2: the Manacher algorithm.

Problem: given a string S (limits below), find the length of its longest palindromic substring.

Subtask 1: for 10% of the data, |S| ∈ (0, 100]
Subtask 2: for 30% of the data, |S| ∈ (0, 5000]
Subtask 3: for 100% of the data, |S| ∈ (0, 11000000]

Subtask 1 (10pts): The plainest brute force. Enumerate every substring of the string and check whether it is a palindrome. The time complexity is O(n^3).
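A minimal sketch of this brute force, assuming the input is held in a std::string; the function names isPalindrome and longestPalindromeBrute are made up for illustration and do not come from the original post.

```cpp
#include <algorithm>
#include <string>

// Returns true if s[l..r] (inclusive) reads the same in both directions.
bool isPalindrome(const std::string& s, int l, int r) {
    while (l < r) {
        if (s[l] != s[r]) return false;
        ++l;
        --r;
    }
    return true;
}

// Brute force: O(n^2) substrings, each checked in O(n) time.
int longestPalindromeBrute(const std::string& s) {
    int n = static_cast<int>(s.size());
    int best = 0;
    for (int l = 0; l < n; ++l)
        for (int r = l; r < n; ++r)
            if (isPalindrome(s, l, r)) best = std::max(best, r - l + 1);
    return best;
}
```

For example, longestPalindromeBrute("aaa") returns 3. This is only practical for the Subtask 1 limits.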

Subtask 2 (30pts): Use a property of palindromes: every palindrome is symmetric.

For a palindrome of odd length, the axis of symmetry is the position of its middle character.

For a palindrome of even length, the axis of symmetry is the gap between its two middle characters.

So we can enumerate every possible axis of symmetry,

and on each axis expand to the left and right at the same time until the characters on the two sides differ or a boundary of the string is reached.

The time complexity is O(n^2).
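The expansion around every axis might look like the sketch below. The names expandFrom and longestPalindromeCenters are hypothetical; passing (c, c) means the axis is the character s[c], and (c, c + 1) means the axis is the gap after s[c].

```cpp
#include <algorithm>
#include <string>

// Expands outward from the pair (l, r) while the characters match and
// returns the length of the palindrome that was found.
int expandFrom(const std::string& s, int l, int r) {
    int n = static_cast<int>(s.size());
    while (l >= 0 && r < n && s[l] == s[r]) {
        --l;
        ++r;
    }
    return r - l - 1;  // the loop stops one step past each end
}

int longestPalindromeCenters(const std::string& s) {
    int n = static_cast<int>(s.size());
    int best = 0;
    for (int c = 0; c < n; ++c) {
        best = std::max(best, expandFrom(s, c, c));      // odd length
        best = std::max(best, expandFrom(s, c, c + 1));  // even length
    }
    return best;
}
```

Note that the odd and even cases need two separate calls here, which is exactly the inconvenience the preprocessing step of the Manacher algorithm removes.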

Subtask 3 (100pts): ? (The Manacher algorithm, of course.) O(n)

It improves on two shortcomings of the Subtask 2 (30pts) approach:

1. Odd and even lengths have to be handled separately (fix: preprocess the string)

Insert an extra character that does not occur in the string, such as '#', between every two adjacent characters, and also before the first and after the last character.

In other words, the original string looks like this:

[figure: the original string; image not preserved]

and the processed string looks like this (to avoid going out of bounds, and for convenience, a[0] holds a sentinel character):

[figure: the processed string with '#' inserted; image not preserved]

This removes the shortcoming: odd and even lengths no longer need separate treatment (QWQ).

2. Many substrings are visited repeatedly, which badly hurts time efficiency.

The way to bring this down to O(n) is to make sure each position is scanned only once or a small constant number of times, instead of nearly n times each.
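The preprocessing from point 1 could be sketched as follows. The function name insertSeparators is made up for illustration, and the '?' placed at index 0 is an assumed sentinel (mirroring a[0]='?' in the program later in this post); it must not occur in the input.

```cpp
#include <string>

// Builds the processed string: index 0 is a sentinel, '#' sits at every odd
// index, and the original characters sit at the even indices 2, 4, ..., 2*|s|.
std::string insertSeparators(const std::string& s) {
    std::string t = "?";  // sentinel at index 0 (assumed not to occur in s)
    for (char c : s) {
        t += '#';
        t += c;
    }
    t += '#';
    return t;
}
```

For "aba" this yields "?#a#b#a#". Every palindrome of the original string, odd or even, now corresponds to an odd-length palindrome centered on a single index.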

------------------(body split line)----------------------

Define a few variables. p[i] is the radius of the longest palindrome centered at position i of the processed array: the largest number of positions, counting i itself, that we can extend to the left while still having a palindrome. (The right side need not be recorded, because of symmetry.)

So p[1]=1,p[2]=2,p[3]=1,p[4]=2,p[5]=1,p[6]=2,p[7]=5,p[8]=2,p[9]=1,p[10]=2,p[11]=1,p[12]=2,p[13]=1;

At first glance the values at the '#' positions all look like 1.

In fact that is wrong: p[7]=5, and position 7 is a '#'.

Here's the key point:

What does p[i]-1 represent? It is the length, in the original string (without the inserted '#'), of the longest palindrome centered at position i.

Proof:

1. Clearly L=2*p[i]-1 is the length of the longest palindrome centered at position i in the new string (with '#').
2. The longest palindrome centered at position i must begin and end with '#', for example "#b#b#" or "#b#a#b#".
So it alternates between '#' and original characters, with exactly one more '#' than original characters. Removing the first (or last) '#' leaves twice as many characters as the original palindrome has, so the original length is (L-1)/2, which simplifies to p[i]-1.

Q.E.D.
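The relation can be checked on small inputs with a plain expansion. This is only an illustrative sketch: naiveRadius is a hypothetical helper, and the strings here are 0-indexed and have no sentinel, unlike the array used in the rest of the post.

```cpp
#include <string>

// Radius of the longest palindrome centered at index i of t, counting the
// center itself, found by plain expansion with explicit bounds checks.
int naiveRadius(const std::string& t, int i) {
    int n = static_cast<int>(t.size());
    int p = 1;
    while (i - p >= 0 && i + p < n && t[i - p] == t[i + p]) ++p;
    return p;
}
```

For t = "#a#b#b#a#", the center at index 4 (a '#') has radius 5, and 5-1=4 is the length of "abba"; the center at index 3 (a 'b') has radius 2, and 2-1=1 is the length of "b".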


maxid is the right boundary of the palindrome that reaches furthest to the right among all those found so far (in the program below it is r, the first position past that palindrome's right end).

id (called mid in the program below) is the center of that palindrome, and i is the position currently being processed.

We can easily compute the position j that is the mirror image of i about id (mid).

Since id is the midpoint of i and j, (i+j)/2=id, so j=2*id-i. (The program therefore never needs to store j; it computes it from id and i.)

Using the idea of dynamic programming: when computing p[i], every p[j] with j ∈ [1,i) has already been computed. If j is the mirror of i about id (mid), then p[i] ∈ [p[j],+∞), provided the palindrome around j stays inside the palindrome around id.

Why can p[i] not be shorter than p[j]?

By the symmetry of the palindrome centered at id (mid): j and i are mirror images about id, and so are maxid and the left boundary minid.

The palindrome centered at j has radius p[j]. Everything between minid and id is mirrored between id and maxid, so there is also a palindrome of radius p[j] centered at i.

p[i] has not been computed yet, but it cannot be smaller than that, so p[i] ∈ [p[j],+∞).

Case analysis:

If i itself lies at or beyond maxid, no symmetry can be used: set p[i]=1 and expand honestly to both sides from that point.

If i lies inside maxid but the palindrome of radius p[j], mirrored over to i, would reach beyond maxid (touching characters it should not rely on), the part beyond maxid is not guaranteed, because nothing is known yet about what lies to the right. Only p[i]=maxid-i is safe; then keep expanding from there.

If the mirrored palindrome does not reach beyond maxid, the symmetry is valid: set p[i]=p[j], then try to expand.

maxid (and id) can be updated as p[i] grows.
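The three cases collapse into a single starting value for p[i]. The helper below is a hedged sketch: initialRadius is a made-up name, r stands for maxid taken as the first position past the right end of the furthest-reaching palindrome, and mid stands for id.

```cpp
#include <algorithm>
#include <vector>

// Starting value of p[i] before any character is compared.
// p holds the radii already computed for indices < i.
int initialRadius(const std::vector<int>& p, int i, int mid, int r) {
    if (i >= r) return 1;          // nothing is known: only the center itself
    int j = 2 * mid - i;           // mirror of i about mid
    return std::min(p[j], r - i);  // never claim anything at or past r
}
```

For the processed string "?#a#a#a#", the radii of indices 1 to 4 are 1, 2, 3, 4, so after index 4 we have mid=4 and r=8; initialRadius then gives 3 for i=5 and 2 for i=6 without comparing a single character. Expansion continues from this starting value.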

With that, Subtask 3 is done: 100% of the data, |S| <= 11000000.

Proof of complexity?

The Manacher algorithm only needs one linear scan over the preprocessed string.
Handling of the p[] array (j is the mirror of i about id):

1. i<maxid

    • maxid-i>p[j]: p[i]=p[j]
    • otherwise: p[i]=maxid-i

2. Otherwise: p[i]=1

1. When i<maxid and maxid-i>p[j], the value of p[i] is final and is determined in O(1) time.
2. In the remaining cases, p[i] still has to be extended by comparing characters, which taken in isolation can cost O(n) time.

However, in those cases every successful comparison looks at a character at or beyond maxid and therefore pushes maxid to the right, and maxid itself is monotonically increasing.

This guarantees that each character in the string is visited at most about 2 times,

so the time complexity of the algorithm is linear, O(n).

You only need to be clear about two points:

1. Taken on its own, one run of the while() loop can indeed cost O(n).

2. But maxid only ever moves to the right; it never moves back.

Every time the while loop succeeds in extending p[i], position i+p[i] is at or beyond maxid,

so maxid moves right by the same amount.

Since maxid cannot go past the end of the processed string, by the time the whole program ends

the while loop has performed only O(n) operations in total,

and each character in the string has been visited at most about 2 times.

So the time complexity must be O(n).
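One way to convince yourself is to count how often the while-loop condition is evaluated. The sketch below is an instrumented variant written for illustration only: countComparisons is a hypothetical name, and '?' and '$' are assumed sentinels that must not occur in the input.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Runs the Manacher scan on s and returns how many times the while-loop
// condition was evaluated (successful and failed comparisons together).
long long countComparisons(const std::string& s) {
    std::string a = "?";                     // left sentinel at index 0
    for (char c : s) { a += '#'; a += c; }
    a += '#';
    int n = static_cast<int>(a.size()) - 1;  // index of the last '#'
    a += '$';                                // right sentinel, differs from '?'

    std::vector<int> p(n + 1, 0);
    int mid = 0, r = 0;                      // r plays the role of maxid
    long long comparisons = 0;
    for (int i = 1; i <= n; ++i) {
        p[i] = (r > i) ? std::min(p[2 * mid - i], r - i) : 1;
        while (true) {
            ++comparisons;
            if (a[i - p[i]] != a[i + p[i]]) break;  // one failure per i
            ++p[i];                                 // success: i+p[i] passes r
        }
        if (i + p[i] > r) { r = i + p[i]; mid = i; }
    }
    return comparisons;
}
```

Each index fails exactly once, and every success moves r further right, so with n the length of the processed string the count stays within 2n+1. A string of 1000 copies of 'a', the worst case for the O(n^2) approach, stays within that bound here.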

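Here is the code:

    #include <bits/stdc++.h>
    using namespace std;

    const int MAXN = 11000005;  // maximum |S|; the arrays below need about 2*|S|+2 cells
    char a[2 * MAXN];           // processed string: '#' at odd indices, letters at even indices
    int p[2 * MAXN];            // p[i]: palindrome radius centered at i, center included

    int main() {
        int ch = getchar();     // int, so that EOF is handled correctly by isalpha
        int t = 0;
        a[0] = '?';  // to be safe, a[0] and the symbol after the last '#' must not be the same
        while (isalpha(ch)) {
            t++;
            a[2 * t] = ch;
            a[2 * t - 1] = '#';
            ch = getchar();
        }
        a[2 * t + 1] = '#';
        int n = 2 * t + 1;
        int mid = 0, r = 0, i;  // mid is id, r is maxid, i is i
        for (i = 1; i <= n; i++) {
            if (r > i) p[i] = min(p[2 * mid - i], r - i);
            else p[i] = 1;
            while (a[i - p[i]] == a[i + p[i]]) p[i]++;
            if (i + p[i] > r) r = i + p[i], mid = i;
        }
        int ans = 0;
        for (int i = 1; i <= n; i++) ans = max(ans, p[i] - 1);
        printf("%d\n", ans);
        return 0;
    }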
This is a genuine template problem, QWQ: P3805 "[Template] Manacher algorithm"

Problem description

A string S consisting only of lowercase English letters a, b, c, ..., y, z is given. Find the length of the longest palindromic substring of S.

The length of the string is n.

Input/output format

Input format:

One line containing a string S of lowercase English letters a, b, c, ..., y, z.

Output format:

One integer, the answer.

Input/output samples

Sample input #1:
aaa
Sample output #1:
3
Notes

String length |S| <= 11000000

One note here: do not stop reading only when you see a newline, or you will get TLE; stop as soon as a character that is not a letter is read.
