Extended KMP template

Source: Internet
Author: User

Extended KMP:
Given a text string A and a pattern string B, of lengths LenA and LenB, compute in linear time, for every i (0 <= i < LenA), the length of the longest common prefix of A[i..LenA-1] and B. This length is recorded as ex[i]; equivalently, ex[i] is the maximum z such that A[i..i+z-1] == B[0..z-1].
Extended KMP can be used to solve many string problems, such as finding the longest palindromic substring of a string or its longest repeated substring.
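To make the definition of ex[i] concrete, here is a minimal brute-force sketch that computes it directly from the definition. The function name naiveExtend is hypothetical, and this is a quadratic reference for checking results, not the linear-time algorithm itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Reference implementation taken straight from the definition:
// ex[i] is the largest z with A[i..i+z-1] == B[0..z-1].
// Worst-case running time is O(LenA * LenB).
std::vector<int> naiveExtend(const std::string& A, const std::string& B)
{
    std::vector<int> ex(A.size(), 0);
    for (std::size_t i = 0; i < A.size(); ++i) {
        std::size_t z = 0;
        while (i + z < A.size() && z < B.size() && A[i + z] == B[z])
            ++z;
        ex[i] = static_cast<int>(z);
    }
    return ex;
}
```

For A = "aaaabaa" and B = "aaaaa" this returns 4 3 2 1 0 2 1. Calling naiveExtend(B, B) gives B matched against itself, here 5 4 3 2 1.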
Algorithm
Let next[i] be the maximum z such that B[i..i+z-1] == B[0..z-1] (that is, B matched against itself).
Suppose next[0..LenB-1] and ex[0..i-1] have already been computed; we use them to find ex[i].
Let p be the farthest position in A that any match has reached so far, and let k be the starting index that reaches it. That is, among all i0 with 0 <= i0 < i, k is the one that maximizes i0 + ex[i0] - 1, and p is that maximum, p = k + ex[k] - 1. Nothing is known about the positions after p: we cannot yet tell whether any character of A[p+1..LenA-1] equals any character of B.
By the definition of ex, A[k..p] == B[0..p-k]. Since i > k, it follows that A[i..p] == B[i-k..p-k]. Let L = next[i-k]; by the definition of next, B[0..L-1] == B[i-k..i-k+L-1].
Now compare i-k+L-1 with p-k:

(1) i-k+L-1 < p-k, i.e. i + L <= p.
In this case A[i..p] == B[i-k..p-k] gives A[i..i+L-1] == B[i-k..i-k+L-1], and since B[0..L-1] == B[i-k..i-k+L-1], we get A[i..i+L-1] == B[0..L-1]. This means ex[i] >= L.
Moreover, by the definition of next, A[i+L] cannot equal B[L]. Otherwise A[i..i+L] == B[0..L]; since i + L <= p we also have A[i..i+L] == B[i-k..i-k+L], hence B[0..L] == B[i-k..i-k+L], and next[i-k] would have to be at least L + 1, a contradiction. So ex[i] = L can be obtained directly, without any comparison.
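The claim in case (1) can be checked by brute force. The sketch below is only an illustration under assumptions: the helper names lcpAt and checkCaseOne are hypothetical, and every value is recomputed by direct comparison, so it verifies the shortcut rather than using it. The argument above only needs A[k..p] == B[0..p-k], so the check tries every k < i, not just the one that reaches farthest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Longest common prefix of X[from..] and Y, by direct comparison.
int lcpAt(const std::string& X, std::size_t from, const std::string& Y)
{
    std::size_t z = 0;
    while (from + z < X.size() && z < Y.size() && X[from + z] == Y[z])
        ++z;
    return static_cast<int>(z);
}

// For every pair (k, i) that falls into case (1), assert that ex[i] == L.
// Returns the number of pairs that were checked.
int checkCaseOne(const std::string& A, const std::string& B)
{
    int hits = 0;
    const int lenA = static_cast<int>(A.size());
    for (int k = 0; k < lenA; ++k) {
        const int p = k + lcpAt(A, k, B) - 1;       // A[k..p] == B[0..p-k]
        for (int i = k + 1; i <= p; ++i) {
            const int L = lcpAt(B, i - k, B);       // next[i-k]
            if (i + L <= p) {                       // case (1)
                assert(lcpAt(A, i, B) == L);        // ex[i] == L
                ++hits;
            }
        }
    }
    return hits;
}
```

With A = "ababababc" and B = "ababc", the pairs (k, i) = (0, 1) and (0, 3) both fall into case (1) with L = 0, and ex[1] = ex[3] = 0 as predicted.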

(2) i-k+L-1 >= p-k, i.e. i + L > p.
In this case we already know that A[i..p] == B[0..p-i]: we have A[i..p] == B[i-k..p-k], and since i-k+L-1 >= p-k, B[0..L-1] == B[i-k..i-k+L-1] gives B[0..p-i] == B[i-k..p-k]. Whether A[p+1] equals B[p-i+1] is not yet known, because p is the farthest position any match has reached in A and nothing beyond it has been examined. So matching continues from A[p+1] and B[p-i+1]: let j be the current position in B, starting at j = p-i+1, and compare A[i+j] with B[j] until they differ or either string ends. The final value of j is ex[i]. The match starting at i now reaches at least as far as p, so k and p are updated to k = i and p = i + ex[i] - 1.

Boundary: ex[0] is computed in advance by direct comparison; the initial k is then 0 and the initial p is ex[0]-1.
The next array is B matched against itself, and it is computed in the same way, much like the preprocessing step of KMP. The only difference is again at the boundary: next[0] = LenB is known immediately, next[1] is computed in advance by direct comparison, and then the initial values are k = 1 and p = k + next[1] - 1 = next[1].

Important note: in case (2) above, matching should start from A[p+1] and B[p-i+1]. However, if p + 1 < i, i.e. p-i+1 < 0 (this can happen when ex[i-1] = 0 and no earlier ex value reaches position i or beyond), both subscripts must be increased by 1, so that matching starts at A[i] and B[0]. In this situation p is necessarily i-2, so if the subscripts into A and B are held in two variables x and y, add 1 to each of them.
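The explicit comparison of case (2), together with the note above, can be sketched as a small helper. The name extendByComparison and its parameter list are assumptions made for illustration; the caller is assumed to guarantee that A[i..p] == B[0..p-i] whenever p >= i.

```cpp
#include <algorithm>
#include <string>

// Case (2): A[i..p] is already known to equal B[0..p-i], so comparison
// resumes at j = p - i + 1. When p == i - 2 that start would be -1, so it
// is clamped to 0 and matching begins at A[i] and B[0].
// Returns ex[i].
int extendByComparison(const std::string& A, const std::string& B, int i, int p)
{
    const int lenA = static_cast<int>(A.size());
    const int lenB = static_cast<int>(B.size());
    int j = std::max(p - i + 1, 0);
    while (i + j < lenA && j < lenB && A[i + j] == B[j])
        ++j;
    return j;
}
```

For A = "ababababc", B = "ababc", i = 2 and p = 3, comparison resumes at j = 2 and stops at j = 4, so ex[2] = 4. For A = "ba", B = "a", i = 1 and p = -1 (that is, p = i-2), the start is clamped to 0 and the result is 1.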

Template code:

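#include <iostream>
#include <cstring>

// "using namespace std;" is omitted on purpose: with it, the global array
// next below would be ambiguous with std::next.

const int N = 500004;
int next[N];    // next[i]: longest common prefix of T[i..] and T
int extend[N];  // extend[i] (ex[i] in the text): longest common prefix of S[i..] and T
char S[N];
char T[N];

// Fills next[] for the pattern T (T matched against itself).
void GetNext(const char* T)
{
    int k = 0;
    int Tlen = (int)std::strlen(T);
    next[0] = Tlen;
    // Boundary: next[1] is found by direct comparison.
    while (k < Tlen - 1 && T[k] == T[k + 1])
        k++;
    next[1] = k;
    k = 1;
    for (int i = 2; i < Tlen; i++)
    {
        int p = k + next[k] - 1, L = next[i - k];
        if (i + L - 1 >= p)  // case (2): compare explicitly from position p + 1
        {
            int j = (p - i + 1) > 0 ? p - i + 1 : 0;  // clamp when p == i - 2
            while (i + j < Tlen && T[i + j] == T[j])
                j++;
            next[i] = j;
            k = i;
        }
        else
            next[i] = L;  // case (1): the answer is known without comparing
    }
}

// Fills extend[] for the text S against the pattern T.
void Getextend(const char* S, const char* T)
{
    int k = 0;
    GetNext(T);
    int Slen = (int)std::strlen(S);
    int Tlen = (int)std::strlen(T);
    int Minlen = Slen < Tlen ? Slen : Tlen;
    // Boundary: extend[0] is found by direct comparison.
    while (k < Minlen && S[k] == T[k])
        k++;
    extend[0] = k;
    k = 0;
    for (int i = 1; i < Slen; i++)
    {
        int p = k + extend[k] - 1, L = next[i - k];
        if (i + L - 1 >= p)  // case (2)
        {
            int j = (p - i + 1) > 0 ? p - i + 1 : 0;  // clamp when p == i - 2
            while (i + j < Slen && j < Tlen && S[i + j] == T[j])
                j++;
            extend[i] = j;
            k = i;
        }
        else
            extend[i] = L;  // case (1)
    }
}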
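As a usage sketch, the same two-case scheme can be restated with std::string and std::vector and then used to list every position where the pattern occurs in the text, since the pattern occurs at i exactly when ex[i] equals the pattern length. The names buildNext, buildExtend and findOccurrences are hypothetical. This variant also differs from the template above in two assumed details: it starts with an empty window (k = 0, p = -1) instead of precomputing the boundary values, and it reads next[i-k] only when i <= p so that no index leaves the vector.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// nxt[i]: longest common prefix of T[i..] and T (the next array of the text).
std::vector<int> buildNext(const std::string& T)
{
    const int n = static_cast<int>(T.size());
    std::vector<int> nxt(n, 0);
    if (n == 0) return nxt;
    nxt[0] = n;
    int k = 0, p = -1;                      // empty window: nothing matched yet
    for (int i = 1; i < n; ++i) {
        const int L = (i <= p) ? nxt[i - k] : 0;
        if (i + L <= p) {                   // case (1)
            nxt[i] = L;
        } else {                            // case (2)
            int j = std::max(p - i + 1, 0);
            while (i + j < n && T[i + j] == T[j]) ++j;
            nxt[i] = j;
            k = i;
            p = i + j - 1;
        }
    }
    return nxt;
}

// ex[i]: longest common prefix of S[i..] and T.
std::vector<int> buildExtend(const std::string& S, const std::string& T)
{
    const int n = static_cast<int>(S.size());
    const int m = static_cast<int>(T.size());
    const std::vector<int> nxt = buildNext(T);
    std::vector<int> ex(n, 0);
    int k = 0, p = -1;                      // empty window
    for (int i = 0; i < n; ++i) {
        const int L = (i <= p) ? nxt[i - k] : 0;
        if (i + L <= p) {                   // case (1)
            ex[i] = L;
        } else {                            // case (2)
            int j = std::max(p - i + 1, 0);
            while (i + j < n && j < m && S[i + j] == T[j]) ++j;
            ex[i] = j;
            k = i;
            p = i + j - 1;
        }
    }
    return ex;
}

// Every start index i at which T occurs in S, i.e. ex[i] == length of T.
std::vector<int> findOccurrences(const std::string& S, const std::string& T)
{
    std::vector<int> pos;
    if (T.empty()) return pos;              // an empty pattern is not reported
    const std::vector<int> ex = buildExtend(S, T);
    for (int i = 0; i < static_cast<int>(ex.size()); ++i)
        if (ex[i] == static_cast<int>(T.size())) pos.push_back(i);
    return pos;
}
```

For S = "abababa" and T = "aba", buildExtend returns 3 0 3 0 3 0 1 and findOccurrences returns 0 2 4.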
