The KMP (Knuth–Morris–Pratt) algorithm as I understand it

Source: Internet
Author: User

Suppose you want to find the string needle inside the string haystack.

To understand KMP, you first need two concepts: proper prefix and proper suffix. Some translations call these the true prefix and the true suffix; they mean the same thing.

    • Proper prefix: a prefix of a string that leaves out at least one trailing character, so it is always shorter than the string itself. For example, the proper prefixes of "Snape" are "S", "Sn", "Sna", and "Snap".
    • Proper suffix: a suffix of a string that leaves out at least one leading character. For example, the proper suffixes of "Hagrid" are "agrid", "grid", "rid", "id", and "d".
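
The two definitions above can be made concrete with a short C++ sketch. The function names properPrefixes and properSuffixes are made up for illustration, and the sketch follows the examples in this article by leaving out the empty string (some texts count it as a proper prefix and suffix as well).

```cpp
#include <string>
#include <vector>

// All non-empty proper prefixes of s, shortest first.
// The full string itself is excluded.
std::vector<std::string> properPrefixes(const std::string& s) {
    std::vector<std::string> out;
    for (std::size_t len = 1; len < s.size(); ++len) {
        out.push_back(s.substr(0, len));
    }
    return out;
}

// All non-empty proper suffixes of s, longest first.
// The full string itself is excluded.
std::vector<std::string> properSuffixes(const std::string& s) {
    std::vector<std::string> out;
    for (std::size_t start = 1; start < s.size(); ++start) {
        out.push_back(s.substr(start));
    }
    return out;
}
```

Calling properPrefixes("Snape") yields the four strings listed above, and a one-character string has no non-empty proper prefix or suffix at all.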

One of the basic ideas of KMP is this: whenever a mismatch occurs, we have already matched some leading part of needle, and that part is a proper prefix of needle. By using what we know about this matched prefix, we can avoid comparing the same characters again. For a worked illustration of this idea, see the example in the Wikipedia article on the Knuth–Morris–Pratt algorithm.
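
For contrast, here is a sketch of the naive approach that KMP improves on. It is not from the original article: the name naiveSearch and the comparisons counter are assumptions added only to make the repeated work visible.

```cpp
#include <string>

// Naive search: returns the index of the first match, or -1 if there is none.
// `comparisons` is a hypothetical counter, added for illustration, that
// records how many character comparisons were performed.
int naiveSearch(const std::string& haystack, const std::string& needle,
                long& comparisons) {
    comparisons = 0;
    if (needle.empty()) return 0;
    if (needle.size() > haystack.size()) return -1;
    for (std::size_t start = 0; start + needle.size() <= haystack.size(); ++start) {
        std::size_t k = 0;
        while (k < needle.size()) {
            ++comparisons;
            if (haystack[start + k] != needle[k]) break;
            ++k;
        }
        if (k == needle.size()) return static_cast<int>(start);
        // After a mismatch the k matched characters are discarded and the
        // search restarts at start + 1. KMP reuses them instead.
    }
    return -1;
}
```

Searching for "aab" in "aaaab" this way compares the second and third characters of the haystack more than once, which is exactly the duplicated work the matched prefix lets KMP skip.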

With the basic idea in hand, how do we use the proper-prefix information of needle effectively? We record it in an array, lofps.

lofps[i] is the length of the longest string that is both a proper prefix and a proper suffix of needle[0...i].

That sentence is a bit convoluted, but it has to be understood in full: drop any one part of it and the definition is no longer complete.
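
To check your reading of the definition, it can help to compute the table the slow way, straight from the wording. The sketch below is only an illustration (the name lofpsByDefinition is made up, and this is not the efficient construction KMP uses): for each i it tries every candidate length from longest to shortest.

```cpp
#include <string>
#include <vector>

// Brute-force table built directly from the definition. For each i, try
// every length len with 1 <= len <= i (so both pieces stay proper) and keep
// the longest one for which s[0...len-1] equals s[i-len+1...i].
std::vector<int> lofpsByDefinition(const std::string& s) {
    std::vector<int> table(s.size(), 0);
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t len = i; len >= 1; --len) {
            if (s.compare(0, len, s, i - len + 1, len) == 0) {
                table[i] = static_cast<int>(len);
                break;  // longest candidate found, stop here
            }
        }
    }
    return table;
}
```

For "abab" this gives 0, 0, 1, 2: the prefix "aba" has "a" at both ends, and "abab" has "ab" at both ends.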

    #include <string>
    #include <vector>
    using namespace std;

    /**
     * For every prefix s[0...i] of s, lofps[i] stores the length of the longest
     * string that is both a proper prefix and a proper suffix of s[0...i].
     *
     * For example, with len = lofps[i], the proper prefix s[0...len-1] and the
     * proper suffix s[i-len+1...i] are equal.
     */
    vector<int> computePrefixSuffix(const string& s) {
        vector<int> lofps(s.size());
        if (s.size() == 0) {
            return lofps;
        }

        lofps[0] = 0;

        int n = static_cast<int>(s.size());
        int len = 0;  // length of the current candidate for s[0...i-1]
        int i = 1;

        while (i < n) {
            if (s[i] == s[len]) {
                len++;
                lofps[i] = len;
                i++;
                continue;
            }

            if (len != 0) {
                // Reuse the results already computed. This is the point to understand:
                // s[0...len-1] is both a proper prefix and a proper suffix of s[0...i-1].
                // Fall back to the next shorter candidate, of length lofps[len-1],
                // and compare s[i] against it again.
                len = lofps[len - 1];
            } else {
                lofps[i] = 0;
                i++;
            }
        }

        return lofps;
    }

    /** Returns the index of the first occurrence of needle in haystack, or -1. */
    int kmpSearch(const string& haystack, const string& needle) {
        // For every prefix needle[0...idx], the length of the longest string
        // that is both a proper prefix and a proper suffix of it.
        vector<int> lofps = computePrefixSuffix(needle);

        int n = static_cast<int>(haystack.size());
        int m = static_cast<int>(needle.size());
        int i = 0;  // position in haystack
        int k = 0;  // number of needle characters matched so far

        while (i < n && k < m) {
            if (haystack[i] == needle[k]) {
                i++;
                k++;
                continue;
            }

            if (k != 0) {
                // Mismatch after k matched characters: keep the longest proper
                // prefix of needle[0...k-1] that is also its suffix, and retry
                // haystack[i] against the character that follows it.
                k = lofps[k - 1];
            } else {
                // Nothing matched yet, so move on in haystack.
                i++;
            }
        }

        if (k == m) {
            return i - k;
        } else {
            return -1;
        }
    }
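
The same table also supports a small variation that the article does not cover: reporting every occurrence rather than only the first. The sketch below is self-contained and hedged accordingly. The names buildTable and kmpFindAll are hypothetical, and returning no hits for an empty needle is an assumption made for simplicity.

```cpp
#include <string>
#include <vector>

// table[i] is the length of the longest string that is both a proper prefix
// and a proper suffix of s[0...i] (the same table described in the article).
std::vector<std::size_t> buildTable(const std::string& s) {
    std::vector<std::size_t> table(s.size(), 0);
    std::size_t len = 0;
    std::size_t i = 1;
    while (i < s.size()) {
        if (s[i] == s[len]) {
            ++len;
            table[i] = len;
            ++i;
        } else if (len != 0) {
            len = table[len - 1];  // fall back to the next shorter candidate
        } else {
            table[i] = 0;
            ++i;
        }
    }
    return table;
}

// Hypothetical helper: start indices of every match, overlapping ones included.
std::vector<std::size_t> kmpFindAll(const std::string& haystack,
                                    const std::string& needle) {
    std::vector<std::size_t> hits;
    if (needle.empty()) return hits;  // assumption: empty needle yields no hits
    const std::vector<std::size_t> table = buildTable(needle);
    std::size_t k = 0;  // number of needle characters matched so far
    for (std::size_t i = 0; i < haystack.size(); ++i) {
        while (k > 0 && haystack[i] != needle[k]) k = table[k - 1];
        if (haystack[i] == needle[k]) ++k;
        if (k == needle.size()) {
            hits.push_back(i + 1 - k);
            k = table[k - 1];  // keep the longest reusable part and go on
        }
    }
    return hits;
}
```

After a full match the search falls back through the table just as it does after a mismatch, which is why overlapping matches such as "aba" in "abababa" are all found.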

Resources:

Searching for Patterns | Set 2 (KMP Algorithm), GeeksforGeeks

The Knuth-Morris-Pratt Algorithm in my own words, jBoxer

Worked example in Knuth–Morris–Pratt algorithm, Wikipedia
