A little note on the extended KMP algorithm

Source: Internet
Author: User

Reference: "Extended KMP Algorithm Summary", http://blog.csdn.net/dyx404514/article/details/41831947

Extended KMP solves the following problem:

Let S be the text (the "mother string") of length n, and let T be the pattern (the "substring") of length m.

For every suffix of S, find the length of the longest common prefix of T and that suffix.

That is, compute an array extend, where extend[i] is the length of the longest common prefix of T and S[i,n-1], for all 0<=i<n.

(Note that if extend[i]=m for some i, then T occurs completely in S starting at position i, which is exactly the standard KMP problem. This is why the algorithm is called extended KMP.)
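To make the definition concrete, the following is a minimal brute-force sketch that computes extend directly from the definition in O(n*m) time. The function name extend_naive and the use of std::string and std::vector are assumptions made for illustration; they do not come from the original article.

```cpp
#include <string>
#include <vector>

// Brute-force reference: extend[i] is the length of the longest common
// prefix of T and the suffix S[i, n-1]. Runs in O(n * m) time.
// extend_naive is a hypothetical helper name, not part of the original.
std::vector<int> extend_naive(const std::string& S, const std::string& T) {
    const int n = static_cast<int>(S.size());
    const int m = static_cast<int>(T.size());
    std::vector<int> extend(n, 0);
    for (int i = 0; i < n; ++i) {
        int j = 0;
        // Compare S[i + j] with T[j] until a mismatch or either string ends.
        while (i + j < n && j < m && S[i + j] == T[j]) ++j;
        extend[i] = j;
    }
    return extend;
}
```

For S="aaaabaa" and T="aaaaa" this returns 4, 3, 2, 1, 0, 2, 1. Extended KMP produces the same array in O(n+m) time, so a function like this is mainly useful as a correctness check.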

Here is an example with S="aaaabaa" and T="aaaaa".

First, computing extend[0] takes 5 comparisons before a mismatch occurs (S[4]!=T[4]), so extend[0]=4.

Next we compute extend[1]. Do we have to start matching from scratch, as we did for extend[0]?

The answer is no. From extend[0]=4 we know that S[0,3]=T[0,3], and therefore S[1,3]=T[1,3].

Computing extend[1] means matching T against S starting from S[1].

Introduce an auxiliary array next, where next[i] is the length of the longest common prefix of T[i,m-1] and T.

In this example next[1]=4, so T[1,4]=T[0,3], and in particular T[1,3]=T[0,2]. Combined with S[1,3]=T[1,3], this gives S[1,3]=T[0,2]. So when we compute extend[1], the computation of extend[0] already tells us that S[1,3]=T[0,2].

The first 3 characters therefore need no comparison, and we can directly compare S[4] with T[3]. They mismatch, so extend[1]=3.
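The saving in this example can be checked with a small sketch. It computes extend[1] from the already known values extend[0] and next[1] and counts how many character comparisons are still performed. The function name extend1_with_reuse and the comparisons counter are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <string>

// Computes extend[1] while reusing extend[0] and next[1] instead of
// rematching from S[1]. `comparisons` receives the number of character
// comparisons actually performed. Hypothetical helper for illustration.
int extend1_with_reuse(const std::string& S, const std::string& T,
                       int extend0, int next1, int& comparisons) {
    const int n = static_cast<int>(S.size());
    const int m = static_cast<int>(T.size());
    // S[1, extend0-1] equals T[1, extend0-1], and T[1..] agrees with T[0..]
    // for next1 characters, so the first min(next1, extend0 - 1) characters
    // of S[1..] are already known to match T.
    int j = std::max(0, std::min(next1, extend0 - 1));
    comparisons = 0;
    while (1 + j < n && j < m) {
        ++comparisons;
        if (S[1 + j] != T[j]) break;
        ++j;
    }
    return j;
}
```

With S="aaaabaa", T="aaaaa", extend0=4 and next1=4, the function skips 3 characters, performs a single comparison (S[4] against T[3]) and returns 3.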

1. General steps of the extended KMP algorithm

The example above already shows the idea behind the extended KMP algorithm. The general steps are described below.

We compute the extend array from left to right. Suppose that extend[0..k] has already been computed, and that the farthest position in S reached by the matching so far is p. Strictly speaking, p is the maximum value of i+extend[i]-1 over 0<=i<=k.

Let po be the index i at which this maximum is attained. In the example above, when computing extend[1] we have p=3 and po=0.

Now we compute extend[k+1]. By the definition of the extend array, S[po,p]=T[0,p-po], and hence S[k+1,p]=T[k-po+1,p-po]. Let len=next[k-po+1] (recall that next[i] is the length of the longest common prefix of T[i,m-1] and T).

There are two cases:

First case: k+len<p

Here S[k+1,k+len]=T[0,len-1].

Moreover, S[k+len+1] cannot be equal to T[len]. Since k+len+1<=p, this position lies inside the range that has already been matched. If the two characters were equal, we would have S[k+1,k+len+1]=T[k-po+1,k-po+len+1]=T[0,len], which would give next[k-po+1]=len+1 and contradict the definition of the next array.

So in this case we know that extend[k+1]=len without any matching.
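The first case can be verified on a small input. The sketch below is only an illustration under assumed names: lcp_at is a hypothetical brute-force helper used to build and check the example, and first_case returns len when the first case applies and -1 otherwise.

```cpp
#include <string>
#include <vector>

// Longest common prefix of A[i..] and B by direct comparison.
// Hypothetical helper, used only to build and check the example.
int lcp_at(const std::string& A, int i, const std::string& B) {
    const int a = static_cast<int>(A.size());
    const int b = static_cast<int>(B.size());
    int j = 0;
    while (i + j < a && j < b && A[i + j] == B[j]) ++j;
    return j;
}

// Returns extend[k+1] when the first case (k + len < p) applies,
// and -1 when it does not. Assumes po <= k and that `next` is the
// next array of the pattern T.
int first_case(const std::vector<int>& next, int k, int po, int p) {
    if (k >= p) return -1;              // position k+1 is outside the matched range
    const int len = next[k - po + 1];   // valid index because k+1 <= p
    return (k + len < p) ? len : -1;
}
```

For S="aabaacx" and T="aabaac" we have extend[0]=6, so po=0 and p=5. With k=2, len=next[3]=2 and k+len=4<5, so extend[3]=2 is known immediately, which agrees with comparing S[3..] against T directly.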


Second case: k+len>=p

Here the characters of S after position p, starting with S[p+1], have not been examined by any earlier match. We only know that S[k+1,p]=T[0,p-k-1], so we continue matching from S[p+1] against T[p-k] until a mismatch occurs. When the matching is finished, if (k+1)+extend[k+1]-1 is greater than p, update p and po accordingly.
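The two cases can be put together as in the following sketch, which keeps the names k, p, po and len from the description. It assumes the next array of T is supplied by the caller; the function name compute_extend and the container types are assumptions for illustration, not the article's template.

```cpp
#include <string>
#include <vector>

// Sketch of the two-case procedure. `next` must satisfy
// next[i] = longest common prefix length of T[i, m-1] and T.
// compute_extend is a hypothetical name.
std::vector<int> compute_extend(const std::string& S, const std::string& T,
                                const std::vector<int>& next) {
    const int n = static_cast<int>(S.size());
    const int m = static_cast<int>(T.size());
    std::vector<int> extend(n, 0);
    if (n == 0) return extend;

    int j = 0;
    while (j < n && j < m && S[j] == T[j]) ++j;   // extend[0] by direct matching
    extend[0] = j;
    int po = 0;
    int p = extend[0] - 1;   // farthest matched position (-1 if nothing matched)

    for (int k = 0; k + 1 < n; ++k) {
        const int i = k + 1;
        // next[i - po] is only meaningful while i <= p (then i - po < m).
        const int len = (i <= p) ? next[i - po] : 0;
        if (i <= p && k + len < p) {
            extend[i] = len;                       // first case: no matching needed
        } else {
            // Second case: S[i, p] == T[0, p-k-1] is known (p - k characters),
            // so matching resumes at S[p+1] against T[p-k].
            j = (i <= p) ? p - k : 0;
            while (i + j < n && j < m && S[i + j] == T[j]) ++j;
            extend[i] = j;
            if (i + j - 1 > p) { p = i + j - 1; po = i; }
        }
    }
    return extend;
}
```

For S="aaaabaa", T="aaaaa" and next = 5, 4, 3, 2, 1, this returns 4, 3, 2, 1, 0, 2, 1. Each character of S is matched successfully at most once, because p never decreases, which is where the linear running time comes from.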


This completes the description of the extended KMP algorithm. The reader may have noticed that we have not yet explained how the next array is computed.

In fact, computing the next array is exactly the same process as computing extend: it is the special case of extended KMP in which T serves as both the text and the pattern.

Every next value needed during this computation has already been computed by the time it is used, so the next array can be obtained with the algorithm described above without any problem.
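A minimal sketch of this self-matching step is shown below. The name compute_next and the vector-based interface are assumptions for illustration. The matched window deliberately starts at an index po>=1, so that next[i-po] always refers to an earlier, already computed entry.

```cpp
#include <string>
#include <vector>

// Computes next[i] = longest common prefix length of T[i, m-1] and T,
// by running the same two-case procedure with T as both text and pattern.
// compute_next is a hypothetical name.
std::vector<int> compute_next(const std::string& T) {
    const int m = static_cast<int>(T.size());
    std::vector<int> next(m, 0);
    if (m == 0) return next;
    next[0] = m;            // T matches itself completely
    int po = 0, p = -1;     // no window yet; index 0 is never used as po
    for (int i = 1; i < m; ++i) {
        int j = 0;
        if (i <= p) {
            const int len = next[i - po];   // i - po < i, so already computed
            if (i + len <= p) {             // first case (k + len < p with k = i - 1)
                next[i] = len;
                continue;
            }
            j = p - i + 1;                  // second case: skip the known part
        }
        while (i + j < m && T[i + j] == T[j]) ++j;
        next[i] = j;
        if (i + j - 1 > p) { p = i + j - 1; po = i; }
    }
    return next;
}
```

For T="aaaaa" this gives 5, 4, 3, 2, 1, which matches the value next[1]=4 used in the earlier example.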

Code Template:

#include <cstring>

const int MAX = 100010;      // maximum string length
int next[MAX], extend[MAX];

// Preprocessing: compute the next array of str
void GetNext(char str[])
{
    int i = 0, j, po, len = strlen(str);
    next[0] = len;                                   // initialize next[0]
    while (i + 1 < len && str[i] == str[i + 1]) i++;
    next[1] = i;                                     // compute next[1]
    po = 1;                                          // initialize po
    for (i = 2; i < len; i++)
    {
        if (next[i - po] + i < next[po] + po)        // first case: next[i] is known directly
            next[i] = next[i - po];
        else                                         // second case: keep matching to find next[i]
        {
            j = next[po] + po - i;
            if (j < 0) j = 0;                        // if i > po + next[po], match from the beginning
            while (i + j < len && str[j] == str[j + i]) j++;
            next[i] = j;
            po = i;                                  // update po
        }
    }
}

// Compute the extend array of the text s1 against the pattern s2
void EXKMP(char s1[], char s2[])
{
    int i = 0, j, po, len = strlen(s1), l2 = strlen(s2);
    GetNext(s2);                                     // compute the next array of the pattern
    while (i < l2 && i < len && s1[i] == s2[i]) i++;
    extend[0] = i;                                   // compute extend[0]
    po = 0;                                          // initialize po
    for (i = 1; i < len; i++)
    {
        if (next[i - po] + i < extend[po] + po)      // first case: extend[i] is known directly
            extend[i] = next[i - po];
        else                                         // second case: keep matching to find extend[i]
        {
            j = extend[po] + po - i;
            if (j < 0) j = 0;                        // if i > extend[po] + po, match from the beginning
            while (i + j < len && j < l2 && s1[j + i] == s2[j]) j++;
            extend[i] = j;
            po = i;                                  // update po
        }
    }
}
