Why is the simple (brute-force) matching algorithm so inefficient? Because the main string pointer backtracks too often.
If the main string pointer never backtracked, matching would be faster, which raises the question:
How can we keep the main string pointer from backtracking?
The KMP algorithm solves exactly this problem, which is why it is faster.
It works like this:
Precompute an array, next[], that records for each position in the pattern where matching should resume after a mismatch.
To understand the KMP algorithm clearly, we can start from the simple pattern matching algorithm.
The simple pattern matching algorithm is easy to understand; its implementation is as follows:
#include <string.h>

int Index(char s[], char p[], int pos)
{
    int i, j, slen, plen;
    i = pos;
    j = 0;
    slen = strlen(s);
    plen = strlen(p);
    while ((i < slen) && (j < plen))
    {
        if (s[i] == p[j])
        {
            i++;
            j++;
        }
        else
        {
            i = i - j + 1;   /* main string pointer backtracks */
            j = 0;
        }
    }
    if (j >= plen)
    {
        return (i - plen);   /* 0-based start of the match */
    }
    else
    {
        return -1;           /* no match */
    }
}
As the code shows, in the simple pattern matching algorithm, when p[j] in the pattern does not match s[i] in the main string, the main string pointer must backtrack to position i-j+1 and be compared with p[0] again. The idea of the KMP algorithm is: can we avoid backtracking the main string pointer? The idea rests on one fact: before the mismatch p[j]!=s[i] occurs, p[0]~p[j-1] has already matched s[i-j]~s[i-1] (here j>0, that is, at least one character matched before the mismatch; if j=0, the main string pointer does not need to backtrack at all and simply advances, so that s[i+1] is compared with p[0]).
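To make the cost of backtracking concrete, here is a minimal sketch that runs the same loop as the simple matcher while counting character comparisons. The function name naive_index_counting and the comparisons out-parameter are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: same loop as the simple matcher, but it also
   counts character comparisons so the cost of backtracking is visible. */
int naive_index_counting(const char *s, const char *p, int pos, long *comparisons)
{
    int i = pos, j = 0;
    int slen = (int)strlen(s);
    int plen = (int)strlen(p);

    assert(comparisons != NULL);
    *comparisons = 0;
    while (i < slen && j < plen) {
        (*comparisons)++;
        if (s[i] == p[j]) {
            i++;
            j++;
        } else {
            i = i - j + 1;   /* main string pointer backtracks */
            j = 0;
        }
    }
    return (j >= plen) ? i - plen : -1;
}
```

For the main string "aaaaaaab" and the pattern "aaab", this returns 4 after 20 comparisons, although the main string has only 8 characters: each of the first four attempts re-reads characters that had already matched.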
Before p[j]!=s[i], p[0]~p[j-1] matches s[i-j]~s[i-1]. What does this tell us? It means that s[i-j]~s[i-1] can be analyzed by analyzing p[0]~p[j-1] of the pattern alone. If the pattern satisfies p[0]~p[k-1]=p[j-k]~p[j-1] (k matching characters in total, where k<j is the largest value satisfying this relation), then s[i-k]~s[i-1] matches p[0]~p[k-1], so s[i] only needs to be compared with p[k]. This k does not depend on the main string; it can be found by analyzing the pattern string alone. (This is where the assumption next[j]=k in typical textbooks comes from. Textbooks tend to assume that the value k is already known; if your logical thinking is strong this is no obstacle, but otherwise you may well get stuck here.) That is, next[j]=k.
What if no k>0 with p[0]~p[k-1]=p[j-k]~p[j-1] exists? Then no prefix of the pattern equals a shorter suffix of p[0]~p[j-1]; even p[0] is not equal to p[j-1]. In other words, no substring of p[0]~p[j-1] that starts after p[0] and ends at p[j-1] can be the beginning of a match of the pattern p. Given the fact above that p[0]~p[j-1]=s[i-j]~s[i-1], it follows that no substring of s[i-j]~s[i-1] that starts after s[i-j] and ends at s[i-1] can be the beginning of a match of p either, so there is no need to move the main string pointer back to any of the positions i-j+1~i-1. Since no backtracking is needed and s[i]!=p[j], s[i] can only be compared with p[0]. That is, next[j]=0.
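The definition of k can be checked directly with a brute-force sketch: for each j, try the largest candidate k first and stop at the first one for which the prefix p[0]~p[k-1] equals p[j-k]~p[j-1]. The function name next_by_definition is hypothetical, and storing -1 in next[0] is a sentinel convention assumed here for the j=0 case. This is deliberately not the efficient construction; it only mirrors the reasoning above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fills next[0..m-1] straight from the definition.
   next[j] is the largest k < j with p[0..k-1] == p[j-k..j-1], or 0 if
   no such k > 0 exists. The caller must provide room for strlen(p) ints. */
void next_by_definition(const char *p, int *next)
{
    int m = (int)strlen(p);
    int j, k;

    assert(next != NULL);
    if (m == 0)
        return;
    next[0] = -1;                      /* assumed sentinel for j = 0 */
    for (j = 1; j < m; j++) {
        next[j] = 0;                   /* default: no usable prefix */
        for (k = j - 1; k > 0; k--) {  /* largest candidate first */
            if (memcmp(p, p + j - k, (size_t)k) == 0) {
                next[j] = k;
                break;
            }
        }
    }
}
```

For the pattern "ababc" this yields -1, 0, 0, 1, 2: at j=4 the prefix "ab" equals the two characters just before p[4], so a mismatch at p[4] resumes the comparison at p[2].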
In the special case j=0, i.e. s[i]!=p[0], there is no point in comparing s[i] with any other character of p; instead s[i+1] should be compared with p[0]. To handle this case uniformly, let next[0]=-1. In the next round of comparison, when j=-1 is detected, set i=i+1 and j=j+1, which naturally results in s[i+1] being compared with p[0].
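The three cases above (next[j]=k, next[j]=0, and next[0]=-1) can be put together in a short 0-based sketch. The names build_next and kmp_index are hypothetical, and the table is built with the usual linear-time construction rather than by checking the definition for every j. Treat it as an illustration under those assumptions, not as a reference implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fill next[0..m-1] for pattern p (0-based), with next[0] = -1. */
void build_next(const char *p, int m, int *next)
{
    int j = 0, k = -1;
    next[0] = -1;
    while (j < m - 1) {
        if (k == -1 || p[j] == p[k]) {
            j++;
            k++;
            next[j] = k;     /* p[0..k-1] == p[j-k..j-1] */
        } else {
            k = next[k];     /* fall back to a shorter prefix */
        }
    }
}

/* Return the 0-based index of the first match of p in s at or after pos,
   or -1 if there is none (also -1 if memory allocation fails). */
int kmp_index(const char *s, const char *p, int pos)
{
    int slen = (int)strlen(s);
    int plen = (int)strlen(p);
    int i = pos, j = 0, result;
    int *next;

    assert(pos >= 0);
    if (plen == 0)
        return (pos <= slen) ? pos : -1;
    next = malloc((size_t)plen * sizeof *next);
    if (next == NULL)
        return -1;
    build_next(p, plen, next);

    while (i < slen && j < plen) {
        if (j == -1 || s[i] == p[j]) {
            i++;             /* i only ever moves forward */
            j++;
        } else {
            j = next[j];     /* only the pattern pointer moves back */
        }
    }
    result = (j >= plen) ? i - plen : -1;
    free(next);
    return result;
}
```

For example, kmp_index("ababcabcacbab", "abcac", 0) returns 5. Note that i is never decreased anywhere in the loop, which is exactly the property the simple matcher lacks.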
KMP Algorithm Implementation Example
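Please see the following program. Unlike the discussion above, it stores both strings starting at index 1, so next[1]=0 plays the role of next[0]=-1: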
#include <stdio.h>
#include <string.h>
#define MAX 101   /* assumed buffer size; the original value is missing from the source */

/* Compute next[] for the pattern a[1..la] (1-based). */
void get_next(int *next, char *a, int la)
{
    int i = 1, j = 0;
    next[1] = 0;
    while (i < la)   /* core part */
    {
        if (j == 0 || a[i] == a[j])
        {
            j++;
            i++;
            if (a[i] == a[j]) next[i] = next[j];
            else next[i] = j;
        }
        else j = next[j];
    }
}

/* Search for a[1..la] in A[1..lA]; return the 1-based match position, or -1. */
int str_kmp(int *next, char *A, char *a, int lA, int la)
{
    int i = 1, j = 1;
    while (i <= lA && j <= la)
    {
        if (j == 0 || A[i] == a[j])
        {
            i++;
            j++;
        }
        else j = next[j];
    }
    if (j > la) return i - j + 1;
    else return -1;
}

int main(void)
{
    int k;
    int next[MAX] = {0};
    int lA = 0, la = 0;
    char A[MAX], a[MAX];   /* A: main string, a: pattern */
    if (scanf("%99s%99s", A, a) != 2) return 1;
    lA = strlen(A);
    la = strlen(a);
    /* shift both strings right by one so that they start at index 1 */
    for (k = la - 1; k >= 0; k--) a[k + 1] = a[k];
    for (k = lA - 1; k >= 0; k--) A[k + 1] = A[k];
    get_next(next, a, la);
    k = str_kmp(next, A, a, lA, la);
    if (-1 == k) printf("No solution!!!");
    else printf("%d", k);
    return 0;
}