KMP string pattern matching algorithm (C++ implementation)

Source: Internet
Author: User

The underlying principle is somewhat involved; for a detailed explanation, see this article: http://blog.csdn.net/v_july_v/article/details/7041827

This article starts directly from the conclusions, which is enough for exams and competitions.

Let T be the target string ("AAABBBAABBABCABCABBABA") and Pat the pattern string ("AABBABC").

The next array of the pattern string is:

j (subscript)   0   1   2   3   4   5   6
Pat             A   A   B   B   A   B   C
next[j]        -1   0   1   0   0   1   0

KMP algorithm:

j=0, next[j]=-1: for the next comparison, pattern character -1 is aligned with the position in the target string where the mismatch occurred (in effect, pattern character 0 is aligned with the position just after the mismatch). The pattern string is moved so that it starts at position posT-next[j]. (posT is the subscript into T at which the mismatch occurred.)

j=1, next[j]=0: for the next comparison, pattern character 0 is aligned with the position in the target string where the mismatch occurred. The pattern string is moved so that it starts at position posT-next[j].

j=2, next[j]=1: for the next comparison, pattern character 1 is aligned with the position in the target string where the mismatch occurred. The pattern string is moved so that it starts at position posT-next[j].

And so on.
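The shift rule above can be written down as a tiny function. This is only an illustrative sketch: `realign` is a hypothetical helper name that does not appear in the original code, and it assumes the caller already knows the mismatch position posT and the value next[j].

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of the article's code).
// posT  : subscript in T where the mismatch occurred
// nextJ : next[j] for the mismatched pattern subscript j
// Returns {new start of the pattern in T, pattern subscript compared next}.
std::pair<int, int> realign(int posT, int nextJ) {
    assert(nextJ >= -1);  // next values are never smaller than -1
    if (nextJ == -1) {
        // Pattern character 0 goes to the position just after the mismatch.
        return {posT + 1, 0};
    }
    // Pattern character next[j] stays under T[posT].
    return {posT - nextJ, nextJ};
}
```

With the strings from this article, the first mismatch happens at posT=2 with j=2 (next[2]=1), so `realign(2, 1)` gives a new pattern start of 1 and the comparison resumes at pattern subscript 1.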

  

So all that remains is to find the next array. How is it built?

For each subscript j from 0 to lengthP-1 (lengthP is the length of the pattern string), look at the substring in front of position j and find the maximum length of a prefix of it that equals a suffix of it. The prefix may not be the whole preceding substring, that is, the prefix and the suffix may not start and end at the same positions (explained below).

j=0: no character precedes the character A, so it is marked -1.

j=1: the character A is preceded by the single character A, but because of the rule that the prefix may not be the whole preceding string, there is no valid equal prefix and suffix, so it is marked 0.

j=2: the character B is preceded by the string AA; the suffix that equals a prefix is A, so the length is 1.

And so on. (Note: prefixes are read from left to right.)
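The definition above can be checked directly with a brute-force sketch. It is deliberately slow and is not how the article's code builds the array; `nextByDefinition` is a hypothetical name introduced here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds next[] straight from the definition.
// For each j it tries every proper prefix of pat[0..j-1], longest first,
// and keeps the first one that equals the suffix of the same length.
std::vector<int> nextByDefinition(const std::string& pat) {
    assert(!pat.empty());  // assumption: the pattern is non-empty
    std::vector<int> next(pat.size(), 0);
    next[0] = -1;  // nothing precedes subscript 0
    for (std::size_t j = 1; j < pat.size(); ++j) {
        // len < j, so the prefix is never the whole preceding substring.
        for (std::size_t len = j - 1; len > 0; --len) {
            // Compare prefix pat[0..len-1] with suffix pat[j-len..j-1].
            if (pat.compare(0, len, pat, j - len, len) == 0) {
                next[j] = static_cast<int>(len);
                break;
            }
        }
    }
    return next;
}
```

For the pattern "AABBABC" this yields -1 0 1 0 0 1 0, the same values as in the table.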

This description is only meant to make the next array easier to understand. In practice, building the next array is itself a run of the KMP algorithm: it is also a string-matching process, in which the suffix is matched against the prefix of the pattern.

The code is as follows:

  

#include <iostream>
#include <string>
#include <vector>
using namespace std;

string T;    // target string
string Pat;  // pattern string

// Fills next[0..lengthP-1]; lengthP is the length of the pattern string Pat.
void GetNext(int next[], int lengthP) {
    int j = 0, k = -1;  // j is the subscript into Pat; k is the candidate next value
    next[0] = -1;       // the next value at subscript 0 is -1
    while (j < lengthP - 1) {  // scan the pattern; stop before writing past next[lengthP-1]
        if (k == -1 || Pat[j] == Pat[k]) {  // no common prefix and suffix, or Pat[j] equals Pat[k]
            j++; k++;
            next[j] = k;  // set the next value at subscript j to k
        } else {
            k = next[k];  // fall back to a shorter prefix and compare again
        }
    }
}

// Searches T for Pat starting at subscript k; returns the match position or -1.
int KMP(int k, const int next[]) {
    int posP = 0, posT = k;      // posP and posT are the subscripts into Pat and T
    int lengthP = Pat.length();  // lengthP is the length of the pattern string Pat
    int lengthT = T.length();    // lengthT is the length of the target string T
    while (posP < lengthP && posT < lengthT) {  // scan both strings
        if (posP == -1 || Pat[posP] == T[posT]) {  // characters match (or restart at Pat[0])
            posP++; posT++;
        } else {
            posP = next[posP];  // on a mismatch, pick the next position from the next array
        }
    }
    if (posP < lengthP) return -1;  // no match
    else return posT - lengthP;     // match found
}

int main() {
    T = "AAABBBAABBABCABCABBABA";
    Pat = "AABBABC";
    int lengthP = Pat.length();
    vector<int> next(lengthP, 0);  // std::vector instead of a non-standard variable-length array
    GetNext(next.data(), lengthP);
    int pos = KMP(0, next.data());
    cout << pos << endl;
    cout << "next[]:";
    for (int i = 0; i < lengthP; i++) {
        cout << next[i] << " ";
    }
    return 0;
}
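One way to gain some confidence in an implementation like this is to compare it with the standard library's `std::string::find`. The following is a minimal sketch of that idea, not a replacement for the code above: it repeats the same two steps with the strings passed as parameters instead of globals, and the names `buildNext`, `kmpFind` and `agreesWithStdFind` are hypothetical. It assumes a non-empty pattern and a non-negative start position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name; same construction as GetNext, returning the array.
std::vector<int> buildNext(const std::string& pat) {
    assert(!pat.empty());  // assumption: the pattern is non-empty
    const int n = static_cast<int>(pat.size());
    std::vector<int> next(n);
    next[0] = -1;
    int j = 0, k = -1;
    while (j < n - 1) {  // next[j + 1] is written, so stop at n - 1
        if (k == -1 || pat[j] == pat[k]) {
            ++j; ++k;
            next[j] = k;
        } else {
            k = next[k];  // fall back to a shorter prefix
        }
    }
    return next;
}

// Hypothetical name; returns the first match at or after start, or -1.
int kmpFind(const std::string& t, const std::string& pat, int start = 0) {
    const std::vector<int> next = buildNext(pat);
    const int lengthP = static_cast<int>(pat.size());
    const int lengthT = static_cast<int>(t.size());
    int posP = 0, posT = start;
    while (posP < lengthP && posT < lengthT) {
        if (posP == -1 || pat[posP] == t[posT]) {
            ++posP; ++posT;
        } else {
            posP = next[posP];
        }
    }
    return posP < lengthP ? -1 : posT - lengthP;
}

// Cross-check: true when kmpFind and std::string::find report the same result.
bool agreesWithStdFind(const std::string& t, const std::string& pat) {
    const std::size_t expected = t.find(pat);
    const int got = kmpFind(t, pat);
    return expected == std::string::npos ? got == -1
                                         : got == static_cast<int>(expected);
}
```

For the article's strings, `kmpFind("AAABBBAABBABCABCABBABA", "AABBABC")` returns 6. Passing the strings as parameters also makes it easy to run the check over many pattern and target pairs.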

  

  

  

  

  
