KMP algorithm explanation

Source: Internet
Author: User

As usual, before discussing the algorithm, let's start with a small problem.

Given a long string and a short string, find how many times the short string occurs in the long string and the position of each occurrence.

Let len1 be the length of the long string and len2 the length of the short string.

If len1 * len2 <= 10^8, the problem is simple. Brute-force enumeration checks, for each character of the long string, whether the substring starting there matches the short string. The complexity is O(len1 * len2), which is acceptable only while that product stays small.
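
The brute-force idea can be sketched as follows. This is only an illustration, not code from the original post: the function name bruteForceFind is hypothetical, and it returns 0-based start positions instead of printing them.

```cpp
#include <string>
#include <vector>

// Brute-force search (illustrative sketch; the function name is hypothetical).
// Tries every start position in `text` and compares character by character.
// Returns the 0-based start position of every match.
// Worst-case cost is O(len1 * len2).
std::vector<int> bruteForceFind(const std::string& text, const std::string& pattern) {
    std::vector<int> positions;
    const int len1 = static_cast<int>(text.size());
    const int len2 = static_cast<int>(pattern.size());
    if (len2 == 0 || len2 > len1) return positions;  // nothing to search for
    for (int start = 0; start + len2 <= len1; ++start) {
        int j = 0;
        // Advance while the characters agree.
        while (j < len2 && text[start + j] == pattern[j]) ++j;
        if (j == len2) positions.push_back(start);  // whole pattern matched
    }
    return positions;
}
```

Note that after every mismatch the start position advances by only one character and the comparison restarts from the beginning of the short string. That repeated work is what KMP avoids.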

What if we extend the data range to len1, len2 <= 10^6, so that len1 * len2 can reach 10^12?

This is where the KMP algorithm comes in.

The problem above is exactly what KMP is meant to solve: its complexity is only O(len1 + len2).

What can we improve relative to the brute-force algorithm?

After each mismatch (that is, each matching failure), we can jump ahead instead of moving only one character forward from the previous starting point.

We can pre-process where to jump for each position, which is what saves the time.

The rest of this article covers how to jump and how to do the preprocessing.

1. How to jump?

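Assume that the short string is abaaba.

Suppose the mismatch occurs at the second a (the third character). How should we proceed?

The characters matched so far are ab, which has no prefix that is also a suffix, so we align the first a with the current position and continue matching from there.

What if the mismatch occurs at the fourth a (the last character)?

Can we align the second a with the current position?

Yes. The five characters matched so far are abaab. Its last two characters, ab, are the same as its first two, so they are already known to match, and the comparison can resume at the character right after that prefix, which is the second a.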

At this point, you should probably understand how KMP jumps.

We record an array nxt, where nxt[i] is the length of the longest proper prefix of the substring from the first character through character i (0-indexed) that is also a suffix of that substring. If the definition is unclear, the example below should help.

Assume that the string is abaaba.

nxt[0] = 0 (a: a single character has no proper prefix)

nxt[1] = 0 (ab: no prefix equals a suffix)

nxt[2] = 1 (aba: a)

nxt[3] = 1 (abaa: a)

nxt[4] = 2 (abaab: ab)

nxt[5] = 3 (abaaba: aba)
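
To check such a table by hand, the definition can be applied directly. The sketch below is an assumption-labelled illustration (the name naiveNxt is hypothetical and does not appear in the original): for every position it tries each proper length from longest to shortest. It is far too slow for large inputs, but it follows the definition literally.

```cpp
#include <string>
#include <vector>

// Definition-based computation of nxt (illustrative; name is hypothetical).
// nxt[i] = length of the longest proper prefix of s[0..i] that is also
// a suffix of s[0..i]. Cubic time in the worst case, so only for checking.
std::vector<int> naiveNxt(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> nxt(n, 0);
    for (int i = 0; i < n; ++i) {
        // Proper lengths only: at most i characters out of the i + 1 available.
        for (int len = i; len > 0; --len) {
            // Compare prefix s[0..len-1] with suffix s[i-len+1..i].
            if (s.compare(0, len, s, i - len + 1, len) == 0) {
                nxt[i] = len;
                break;  // the first hit is the longest
            }
        }
    }
    return nxt;
}
```

Calling naiveNxt("abaaba") gives one value per character of the string, in the same order as the list above.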

2. Initialization

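The question is how to compute nxt with a small time complexity.

A natural idea is a recurrence that builds each value from the earlier ones. How does it work?

Suppose nxt is already known for all earlier positions and one more character is appended; we need the value for the new position.

Start with k equal to the nxt value of the previous position and keep jumping to nxt[k - 1] until the character that follows the prefix of length k equals the new character, or until k reaches 0. If the two characters are equal, the new value is k + 1; otherwise it is 0.

Every position is computed this way from the earlier ones.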
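
As a minimal sketch of this recurrence, written as a standalone function (the name buildNxt and the use of std::string are choices made for this illustration; the loop body mirrors the template code in this article):

```cpp
#include <string>
#include <vector>

// Linear-time computation of nxt (sketch; function name is hypothetical).
std::vector<int> buildNxt(const std::string& p) {
    const int n = static_cast<int>(p.size());
    std::vector<int> nxt(n, 0);
    for (int i = 1; i < n; ++i) {
        int k = nxt[i - 1];  // best border length of the previous position
        // Fall back to shorter borders until one can be extended by p[i].
        while (k > 0 && p[k] != p[i]) k = nxt[k - 1];
        if (p[k] == p[i]) ++k;  // extend the border by one character
        nxt[i] = k;
    }
    return nxt;
}
```

The total work is O(len2): k grows by at most one per iteration of the outer loop, and every fallback step strictly decreases it, so the inner loop cannot run more than len2 times overall.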

The search process is similar to the initialization process, so we will not go into details here.
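
For readers who want to see the search spelled out anyway, here is a hedged sketch. The names kmpFind and buildNxt are hypothetical; buildNxt is repeated inside the block so that it compiles on its own. Unlike the template code, it returns 0-based positions, and it resets j explicitly after a full match instead of relying on the terminating '\0' of a C string.

```cpp
#include <string>
#include <vector>

// Helper: linear-time nxt table (hypothetical name, included for completeness).
std::vector<int> buildNxt(const std::string& p) {
    const int n = static_cast<int>(p.size());
    std::vector<int> nxt(n, 0);
    for (int i = 1; i < n; ++i) {
        int k = nxt[i - 1];
        while (k > 0 && p[k] != p[i]) k = nxt[k - 1];
        if (p[k] == p[i]) ++k;
        nxt[i] = k;
    }
    return nxt;
}

// KMP search (sketch): returns the 0-based start of every occurrence,
// including overlapping ones, in O(len1 + len2) time.
std::vector<int> kmpFind(const std::string& text, const std::string& pattern) {
    std::vector<int> positions;
    const int len1 = static_cast<int>(text.size());
    const int len2 = static_cast<int>(pattern.size());
    if (len2 == 0) return positions;
    const std::vector<int> nxt = buildNxt(pattern);
    int j = 0;  // number of pattern characters currently matched
    for (int i = 0; i < len1; ++i) {
        // On a mismatch, jump instead of restarting from scratch.
        while (j > 0 && text[i] != pattern[j]) j = nxt[j - 1];
        if (text[i] == pattern[j]) ++j;
        if (j == len2) {
            positions.push_back(i - len2 + 1);
            j = nxt[j - 1];  // continue, allowing overlapping matches
        }
    }
    return positions;
}
```

The index i into the long string never moves backwards; only j is adjusted through nxt, which is why the search is linear.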

The template code is below.

 

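    #include<iostream>
    #include<cstdio>
    #include<cstring>
    using namespace std;
    char s1[1000010],s2[1000100];   // s1: long string, s2: short string
    int nxt[1000100];               // nxt[i]: longest prefix of s2[0..i] that is also its suffix
    int main()
    {
        scanf("%s",s1);
        scanf("%s",s2);
        nxt[0]=0;
        int len1=strlen(s1);
        int len2=strlen(s2);
        // Preprocessing: compute nxt for the short string.
        for(int i=1,k=0;i<len2;i++)
        {
            k=nxt[i-1];
            while(k>0&&s2[k]!=s2[i])  k=nxt[k-1];   // jump back until s2[i] can extend the prefix
            if(s2[k]==s2[i])  k++;
            nxt[i]=k;
        }
        // Search: i scans the long string, j is the number of matched characters.
        for(int i=0,j=0;i<len1;i++)
        {
            // After a full match j==len2, so s2[j] is '\0' and this loop jumps back.
            while(j!=0&&s1[i]!=s2[j])  j=nxt[j-1];
            if(s1[i]==s2[j])  j++;
            if(j==len2)
            {
                printf("%d\n",i-j+2);   // 1-based start position of the match
            }
        }
        for(int i=0;i<len2;i++) printf("%d ",nxt[i]);
        return 0;
    }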

 

Template question: https://www.luogu.org/problemnew/show/3375

Thank you for your support!

If you notice any shortcomings, please point them out; I would be very grateful.

 
