A daily walkthrough of classic algorithm problems, part seven: the KMP algorithm




You have probably met the KMP algorithm in a university data structures course. I don't know how many teachers just skim over it; ours certainly did. To be honest, KMP is rather hard to digest: if the red-black tree is in the "perverse" class, KMP is even more perverse than the red-black tree. Sorry about this, but every time I type "KMP", my input method suggests the three characters for "watching porn". Hey, let's just read it as "look at the algorithm" instead.

One: BF (brute-force) algorithm

Asked to write string pattern matching, you would probably come up with the naive BF algorithm right away; at least it solves the problem. I think everyone knows its time complexity is O(mn), where m and n are the lengths of the two strings. The reason is simple: whenever the main string and the pattern string mismatch, we go back and compare the first character of the pattern string with the character just after the position in the main string where the current attempt started. The complexity is high because the main string has to backtrack on every mismatch. I have omitted the figure here.
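The backtracking described above can be sketched as follows. This is a minimal illustration rather than the author's code; the class name `NaiveSearch` and method name `IndexOf` are chosen here for the example.

```csharp
public static class NaiveSearch
{
    // Returns the index of the first occurrence of pattern in text,
    // or -1 if there is none. Worst-case time is O(mn).
    public static int IndexOf(string text, string pattern)
    {
        int i = 0; // current position in the main string
        int j = 0; // current position in the pattern string

        while (i < text.Length && j < pattern.Length)
        {
            if (text[i] == pattern[j])
            {
                i++;
                j++;
            }
            else
            {
                // Mismatch: the main string backtracks to one past the
                // start of this attempt, and the pattern restarts at 0.
                i = i - j + 1;
                j = 0;
            }
        }

        // The whole pattern was consumed, so the match starts j characters back.
        return j == pattern.Length ? i - pattern.Length : -1;
    }
}
```

The line `i = i - j + 1` is the backtracking step that KMP removes: in KMP, i never moves backwards.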

Two: KMP algorithm

As we just said, the main string backtracks on every mismatch, and that is what drives the time complexity up. So when the "main string" and the "pattern string" mismatch, can we avoid backtracking in the main string and instead slide the "pattern string" some distance to the right before starting the next round of comparison, so that the time complexity drops to O(m+n)? That is exactly what the KMP algorithm is for. Let's look at a simple example.

With this diagram, let's go through the general reasoning. Assume the main string is S and the pattern string is P, and that a mismatch occurs at $S_i \ne P_j$. At that point the following relationship holds:

$$S_{i-j} S_{i-j+1} \dots S_{i-1} = P_0 P_1 \dots P_{j-1}$$

So how far should the pattern P slide to the right? In other words, which character of the pattern string should character $S_i$ of the main string be compared with next?

Suppose it should be compared with position k in the pattern string. If $P_0 P_1 \dots P_{j-1}$ has a proper prefix and a proper suffix that are equal, and k is the greatest such length, then

$$P_0 P_1 \dots P_{k-1} = P_{j-k} P_{j-k+1} \dots P_{j-1}$$

In other words, within the pattern P the first k characters are the same as the k characters just before $P_j$. For example, the longest proper prefix of "ABAD" is "ABA" and its longest proper suffix is "BAD"; of course, those two are not equal. Here 0<k<j, and we want k as close to j as possible, because the pattern slides by j-k and the slide must be minimal so that no possible match is skipped. We use next[j] to record which character of the pattern string should be compared with $S_i$ after a mismatch at $P_j$.
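Here is a worked example of the slide, using the two strings that appear in the article's code (S = ABABCABABABDC, P = BABDC). The particular mismatch position is picked here for illustration. Suppose the current attempt starts at $S_6$ and fails at i=9, j=3:

$$
\begin{aligned}
S_6 S_7 S_8 &= P_0 P_1 P_2 = \text{BAB}, \qquad S_9 = \text{A} \ne \text{D} = P_3 \\
P_0 &= P_2 = \text{B} \;\Rightarrow\; k = 1 \quad (k = 2 \text{ fails: } P_0 P_1 = \text{BA} \ne \text{AB} = P_1 P_2)
\end{aligned}
$$

So i stays at 9, the pattern slides j-k=2 positions to the right, and $S_9$ is compared next with $P_1$. That comparison succeeds (both are A), and the match is eventually completed at position 8 of the main string.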

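Let next[j]=k. According to the formula, we have:

$$
\text{next}[j] =
\begin{cases}
-1, & j = 0 \\
\max\{\, k \mid 0 < k < j \text{ and } P_0 P_1 \dots P_{k-1} = P_{j-k} P_{j-k+1} \dots P_{j-1} \,\}, & \text{if such a } k \text{ exists} \\
0, & \text{otherwise}
\end{cases}
$$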
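To make the definition concrete, here is a hedged sketch that evaluates it literally, trying the largest k first. It is deliberately slow (roughly cubic in the pattern length) and is not how KMP computes the table; the class name `NextByDefinition` is made up for this example.

```csharp
public static class NextByDefinition
{
    // Computes next[] straight from the three-case definition above.
    public static int[] Compute(string p)
    {
        var next = new int[p.Length];

        for (int j = 0; j < p.Length; j++)
        {
            if (j == 0)
            {
                next[j] = -1; // first case: j = 0
                continue;
            }

            next[j] = 0; // third case, unless a valid k is found below

            // Second case: the largest k with 0 < k < j such that
            // p[0..k-1] equals p[j-k..j-1].
            for (int k = j - 1; k > 0; k--)
            {
                if (string.CompareOrdinal(p, 0, p, j - k, k) == 0)
                {
                    next[j] = k;
                    break;
                }
            }
        }

        return next;
    }
}
```

For the pattern "BABDC" this gives -1, 0, 0, 1, 0: only next[3] is non-zero, because the "B" just before the "D" equals the first character of the pattern.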

OK, the next question is how to compute next[j]; this is the core idea of KMP. We compute next[j] by recurrence: given that next[j]=k, what is next[j+1]? There are two cases:

①: $P_k = P_j$. Then $P_0 P_1 \dots P_k = P_{j-k} P_{j-k+1} \dots P_j$, so we know:

next[j+1]=k+1.

And because next[j]=k,

next[j+1]=next[j]+1.

②: $P_k \ne P_j$. Then $P_0 P_1 \dots P_k \ne P_{j-k} P_{j-k+1} \dots P_j$. This case is a bit of a pain. In fact, it turns into the very problem we discussed above, falling back through next when the "main string" and the "pattern string" mismatch, except that here the pattern string is matched against itself: its suffix plays the role of the main string and its prefix plays the role of the pattern. So our goal is to find some $k_2$ with $P_{k_2} = P_j$, and then plug $k_2$ into case ①.

Let $k_2$=next[k]. Then $P_0 P_1 \dots P_{k_2-1} = P_{j-k_2} P_{j-k_2+1} \dots P_{j-1}$.

If $P_j = P_{k_2}$, then next[j+1]=$k_2$+1=next[k]+1.

If $P_j \ne P_{k_2}$, keep applying next recursively in the same way until no $k_2$ is left (the value reaches -1), in which case next[j+1]=0.
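Here is a worked example of case ②. The pattern P = ABACABABC is made up for illustration and does not come from the article's code. Suppose next[3]=1 and next[7]=3 are already known and we want next[8], so j=7 and k=3:

$$
\begin{aligned}
k &= \text{next}[7] = 3, & P_3 &= \text{C} \ne \text{B} = P_7 && \text{(case ②, fall back)} \\
k_2 &= \text{next}[3] = 1, & P_1 &= \text{B} = P_7 && \text{(case ① with } k_2\text{)} \\
\text{next}[8] &= k_2 + 1 = 2
\end{aligned}
$$

This agrees with the definition: the longest proper prefix of ABACABAB that is also a proper suffix is AB, of length 2.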

OK, here is the code. It may be a little convoluted; whether or not you follow it, I do, anyway.

    using System;

    namespace SupportCenter.Test
    {
        public class Program
        {
            static void Main(string[] args)
            {
                string zstr = "ABABCABABABDC";

                string mstr = "BABDC";

                var index = KMP(zstr, mstr);

                if (index == -1)
                    Console.WriteLine("No matching string found!");
                else
                    Console.WriteLine("Found it, the position is: " + index);

                Console.Read();
            }

            static int KMP(string bigstr, string smallstr)
            {
                // An empty pattern matches at position 0 (this also keeps
                // GetNextVal from writing into an empty array).
                if (smallstr.Length == 0)
                    return 0;

                int i = 0;
                int j = 0;

                // Compute the next table from the pattern's prefixes and suffixes.
                int[] next = GetNextVal(smallstr);

                while (i < bigstr.Length && j < smallstr.Length)
                {
                    // j == -1 means even the first pattern character failed,
                    // so move on to the next character of the main string.
                    if (j == -1 || bigstr[i] == smallstr[j])
                    {
                        i++;
                        j++;
                    }
                    else
                    {
                        // Mismatch: i does not backtrack, only the pattern slides.
                        j = next[j];
                    }
                }

                if (j == smallstr.Length)
                    return i - smallstr.Length;

                return -1;
            }

            /// <summary>
            /// p0,p1....pk-1 (prefix string)
            /// pj-k,pj-k+1....pj-1 (suffix string)
            /// </summary>
            /// <param name="smallstr">The pattern string.</param>
            /// <returns>The next table for the pattern string.</returns>
            static int[] GetNextVal(string smallstr)
            {
                // Position in the prefix string (starting at -1 simplifies the loop).
                int k = -1;

                // Position in the suffix string.
                int j = 0;

                int[] next = new int[smallstr.Length];

                // According to the formula: when j=0, next[j]=-1.
                next[j] = -1;

                while (j < smallstr.Length - 1)
                {
                    if (k == -1 || smallstr[k] == smallstr[j])
                    {
                        // Case pk=pj: next[j+1]=k+1, i.e. next[j+1]=next[j]+1.
                        next[++j] = ++k;
                    }
                    else
                    {
                        // Case pk!=pj: fall back with k=next[k];
                        // either a match is found, or k reaches -1 and we stop.
                        k = next[k];
                    }
                }

                return next;
            }
        }
    }
