A simple example of the KMP algorithm in C and an exploration of its implementation principle

Last Update:2017-01-19 Source: Internet

Author: User

I had come across the KMP algorithm before. At the time it seemed very esoteric: I chewed on the data structures textbook for a whole afternoon and eventually understood the gist, but afterwards, whenever KMP came up, all that remained was "oh, that's the one for pattern matching". Recently I had some time and looked it up in Introduction to Algorithms, and it turns out to be quite simple (not the implementation, but the idea itself is very simple).

The classic application of pattern matching is finding the position of a pattern string within a text string. For example, in "ABCDEF", the pattern "CDE" appears starting at the third position of the original string. Let's start with the basics.

A simple pattern matching algorithm

A: ABCDEFG, B: CDE

First, align B with the first character of A and compare character by character (b++ == a++). If every comparison succeeds, return the position; if one fails, stop, start again from the second position of A, and so on.

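The code is as follows:

/*
* Houkai, 2014-9-16
* Function: Pattern matching
*/
#include <iostream>
#include <string>
using namespace std;

// Returns the 0-based index of the first occurrence of b in a, or -1 if none.
int index(const char *a, const char *b)
{
    int tarindex = 0;
    while (a[tarindex] != '\0')
    {
        int tarlen = tarindex;
        int patlen;
        // Compare b against a, starting at position tarindex of a.
        for (patlen = 0; b[patlen] != '\0'; patlen++)
        {
            if (a[tarlen++] != b[patlen])
            {
                break;
            }
        }
        // Reaching the end of b means every character matched.
        if (b[patlen] == '\0')
        {
            return tarindex;
        }
        tarindex++;
    }
    return -1;
}

int main()
{
    const char *a = "abcdef";
    const char *b = "CDF";
    cout << index(a, b) << endl; // prints -1: "CDF" does not occur in "abcdef"
    return 0;
}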

The idea is simple and direct, but the time complexity is O(mn), where m and n are the lengths of the text string and the pattern string, respectively. Pattern matching is a very common problem, so many people have worked on optimizing it: the Rabin-Karp algorithm, finite automata, and so on, and finally the KMP (Knuth-Morris-Pratt) algorithm.

KMP algorithm

Optimization: suppose we know that the first character of the pattern, a, is different from every character after it in the pattern. If the first alignment matches several characters and then fails (in the original illustration, the matched characters are boxed and the mismatch occurs at f), none of the matched text characters can be a, so the pattern's a can be moved directly to the position of f. In other words, the position i in the main string never needs to backtrack. This is the most basic and most important idea, and goal, of KMP.

Another example:

Since the leading abc of the pattern is equal to the abc that follows it, the part marked in red in the original illustration is known to match without any further comparison. Based on the result of the previous comparison, abc does not need to be compared again; comparison simply continues from the pattern character that follows that prefix. Again, the position i in the main string does not backtrack. What changes is the position j in the pattern string (j does not always restart from 1, as this second example shows).

How j changes depends on how similar the prefixes and suffixes of the pattern string are. In example 2, the leading abc equals the abc just before x, so the matched prefix is abc and execution resumes at j = 4.

The new j is the length of that prefix within the part of the pattern matched in the previous round (the first 6 characters in the preceding example), plus 1. It is determined by the longest prefix of the matched part that is also a suffix of it, because the next alignment slides the pattern so that this prefix lands where the equal suffix was. The longest such prefix must be used: sliding by a smaller amount cannot produce a match, and sliding further could skip one (the original article illustrated this with a figure).

So if the current position is j, the next position should be the length of the longest prefix of the substring before j that is also a suffix of it, plus 1. This new position is then compared with position i of the original string.

Given the current j, what exactly is the next j? This is the question of how to compute it. In fact, the mapping j -> x can be built just by examining the pattern string. It is called the prefix function, and its results are stored in an array called the prefix array.

Pseudocode:

COMPUTE-PREFIX-FUNCTION(P)
    m <- length[P]
    pi[1] <- 0
    k <- 0
    for q <- 2 to m
        do while k > 0 and P[k+1] != P[q]
               do k <- pi[k]    // fall back to the prefix of the prefix ...
           if P[k+1] == P[q]
               then k <- k + 1
           pi[q] <- k
    return pi

Using the prefix array, pattern matching can be performed quickly. The following procedure finds all occurrences of the pattern in the string.

The pseudocode is as follows:

KMP-MATCHER(T, P)
    n <- length[T]
    m <- length[P]
    pi <- COMPUTE-PREFIX-FUNCTION(P)
    q <- 0
    for i <- 1 to n
        do while q > 0 and P[q+1] != T[i]
               do q <- pi[q]    // fall back to the prefix of the prefix ...
           if P[q+1] == T[i]
               then q <- q + 1
           if q == m
               then print "pattern occurs with shift" i-m
                    q <- pi[q]

The two pieces of code follow exactly the same idea: when a comparison fails, fall back to the prefix of the prefix, and so on, which is quite ingenious. If KMP is hard to understand, the difficulty probably lies in this pseudocode.

The time complexity of the KMP algorithm is O(n+m).

It should be emphasized that KMP shows its advantage only when the pattern and the main string have many partial matches. On a partial match, KMP's i does not need to backtrack; otherwise KMP performs no differently from naive pattern matching.

This article is an English version of an article originally written in Chinese on aliyun.com.
