"Data Structures and Algorithms": KMP Pattern matching algorithm

Source: Internet
Author: User

The Knuth-Morris-Pratt string-searching algorithm, or "KMP algorithm" for short, is commonly used to find the position of a pattern string p within a text string s. It was published in 1977 by Donald Knuth, Vaughan Pratt, and James H. Morris, and is named after their three surnames. The key idea of KMP is that when a character of the pattern fails to match the text, we already know where the pattern pointer j should move next (here i indexes s and j indexes p).


Suppose C and D do not match. Where should j move? Clearly to position 1, because the preceding A has already been matched (highlighted by the blue box in the original figure).


The following is also true:


Here j can move to position 2, because the two letters before it have already been matched.


So when a match fails, j moves to a new position k, and k has the following property: the first k characters of p are the same as the last k characters before position j.

Expressed as a formula:
p[0 ~ k-1] == p[j-k ~ j-1]
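
To make this property concrete, here is a minimal brute-force sketch in C that finds k straight from the definition: it tries every candidate length from longest to shortest and returns the first one for which the prefix equals the characters just before j. The helper name border_before is hypothetical and not part of the original article; it only illustrates the definition and is much slower than the recursive method.

```c
#include <string.h>

/* Hypothetical helper (not from the article): returns the largest k < j
 * such that p[0 ~ k-1] == p[j-k ~ j-1], i.e. the value next[j].
 * Assumes 0 <= j <= strlen(p). Returns -1 for j == 0, matching the
 * convention next[0] = -1. */
int border_before(const char *p, int j)
{
    if (j == 0)
        return -1;
    /* Try the longest proper candidate first, then shorter ones. */
    for (int k = j - 1; k > 0; --k) {
        /* Compare the first k characters with the k characters before j. */
        if (memcmp(p, p + (j - k), (size_t)k) == 0)
            return k;
    }
    return 0; /* no non-empty prefix qualifies */
}
```

For the pattern "ABCDABD", for instance, the characters before position 6 are "ABCDAB", whose prefix "AB" equals its last two characters, so the helper returns 2 for j = 6.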

The next array can be computed recursively, as shown in the following code:

#include <string.h>

void Next(char *p, int next[])
{
    int pLen = strlen(p);
    next[0] = -1;
    int k = -1;
    int j = 0;
    while (j < pLen - 1)
    {
        // p[k] represents the prefix, p[j] represents the suffix
        if (k == -1 || p[j] == p[k])
        {
            ++k;
            ++j;
            next[j] = k;
        }
        else
        {
            k = next[k];
        }
    }
}

The value of next[j] (that is, k) is the position the j pointer moves to when p[j] != s[i].

When j is 0 and the characters still do not match, j is already at the leftmost position and cannot move any further, so the i pointer should advance instead. That is why the code contains the initialization next[0] = -1.


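The case p[k] != p[j]


In the code, this case is handled by the statement k = next[k]. Why? The prefix p[0 ~ k-1] matches the k characters just before j, but it cannot be extended because p[k] != p[j]. The next candidate is the longest shorter prefix that is still a suffix of those characters, and its length is exactly next[k], so the comparison is retried with k = next[k].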
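
The fallback step can be isolated in a small C sketch. Assuming next[0 ~ j] has already been filled in for the pattern p, the hypothetical helper extend_next (a name introduced here for illustration, not taken from the article) follows the chain k = next[k] until p[k] == p[j] or k reaches -1, and returns the value that next[j+1] would take.

```c
/* Hypothetical helper (not from the article): given next[0 ~ j] for
 * pattern p, return the value of next[j+1].
 * Assumes 0 <= j < strlen(p) and that next[0] == -1. */
int extend_next(const char *p, const int next[], int j)
{
    int k = next[j];
    /* While the prefix of length k cannot be extended by p[j],
     * fall back to the next shorter candidate prefix. */
    while (k != -1 && p[k] != p[j])
        k = next[k];
    /* Either p[k] == p[j] (extend to length k+1) or k == -1 (length 0). */
    return k + 1;
}
```

As an illustration, take p = "ABACABAB" with next = {-1, 0, 0, 1, 0, 1, 2, 3}. At j = 7 the first candidate is k = 3, but p[3] is 'C' and p[7] is 'B', so k falls back to next[3] = 1; now p[1] equals p[7], and the result is 2.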


With the next array in hand, the rest is straightforward, and we can write the KMP algorithm:

// next[] must already hold the next array computed for p
int KmpMatch(char *s, char *p, int next[])
{
    int i = 0;
    int j = 0;
    int sLen = strlen(s);
    int pLen = strlen(p);
    while (i < sLen && j < pLen)
    {
        // ① If j == -1, or the current characters match (i.e. s[i] == p[j]), then i++, j++
        if (j == -1 || s[i] == p[j])
        {
            i++;
            j++;
        }
        else
        {
            // ② If j != -1 and the current characters do not match (i.e. s[i] != p[j]), then i is unchanged and j = next[j]
            // next[j] is the next value corresponding to j
            j = next[j];
        }
    }
    if (j == pLen)
        return i - j;   // start index of the match in s
    else
        return -1;      // no match
}
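
Putting the two pieces together, the following is a minimal self-contained sketch of how the search might be packaged for callers. The names build_next and kmp_find are hypothetical and introduced only for this example; the wrapper allocates the next array itself, which is one possible design choice rather than the article's own.

```c
#include <stdlib.h>
#include <string.h>

/* Fill next[0 ~ pLen-1] for pattern p, using the same recurrence
 * as the article (hypothetical name). next must have room for at
 * least one element even when p is empty. */
static void build_next(const char *p, int next[])
{
    int pLen = (int)strlen(p);
    int k = -1;
    int j = 0;
    next[0] = -1;
    while (j < pLen - 1) {
        if (k == -1 || p[j] == p[k]) {
            ++k;
            ++j;
            next[j] = k;
        } else {
            k = next[k];
        }
    }
}

/* Hypothetical wrapper: index of the first occurrence of p in s,
 * or -1 if p does not occur. */
int kmp_find(const char *s, const char *p)
{
    int sLen = (int)strlen(s);
    int pLen = (int)strlen(p);
    int i = 0;
    int j = 0;
    int result;
    /* +1 so that next[0] is writable even for an empty pattern. */
    int *next = malloc(((size_t)pLen + 1) * sizeof *next);

    if (next == NULL)
        return -1; /* this sketch reports allocation failure as "not found" */

    build_next(p, next);
    while (i < sLen && j < pLen) {
        if (j == -1 || s[i] == p[j]) {
            i++;
            j++;
        } else {
            j = next[j]; /* i stays where it is; only the pattern shifts */
        }
    }
    result = (j == pLen) ? i - j : -1;
    free(next);
    return result;
}
```

Called as kmp_find("BBC ABCDAB ABCDABCDABDE", "ABCDABD"), this returns 15, the index in the text where the pattern begins. Note that i never moves backwards during the search, which is what distinguishes KMP from the naive approach.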

Note: KMP is a classic algorithm. The principles and some of the diagrams here draw on technical blogs found online; I have added my own understanding as a record, both to deepen my grasp of the algorithm and to share it with other readers. Criticism and corrections are welcome!
