Suppose the pattern P has m characters. Its feature vector N then consists of m nonnegative integers, one for each character of P.
In other words, every position in P gets its own number, which describes a characteristic of that position.
So the question is: what does this number mean? What exactly does it describe?
Ahem! Knock, knock on the blackboard: pay attention now.
Suppose the position is i and N[i] is 5. Then 5 means that the string made of the first five characters of P is identical to the string made of the five characters ending at position i (position i and the four characters to its left, read left to right), and that no length greater than 5 has this property.
(⊙o⊙) ... uh, a little dizzying. Written out: P[0]P[1]P[2]P[3]P[4] == P[i-4]P[i-3]P[i-2]P[i-1]P[i]. That is what it means.
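To make the meaning of the number concrete, here is a minimal sketch of the comparison itself. The helper name prefixEqualsLeftSubstring and the sample pattern "abcabcab" are assumptions made for illustration; they are not from the original text.

```cpp
#include <string>

// Hypothetical helper: true if the first t characters of P equal the
// t characters of P that end at position i (requires 0 <= t <= i + 1).
bool prefixEqualsLeftSubstring(const std::string &P, int i, int t)
{
    // Compare P[0 .. t-1] with P[i-t+1 .. i].
    return P.compare(0, t, P, i - t + 1, t) == 0;
}
```

For P = "abcabcab" and i = 7, the check succeeds for t = 5 ("abcab" on both sides) and fails for t = 6 and t = 7, which is what N[7] = 5 says.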
OK, here comes the formal definition. First, a few concepts.
Prefix substring: the first t characters of the pattern P, i.e. P[0]P[1]P[2]...P[t-1].
Left substring of position i: the t characters of P that end at position i (position i included), i.e. P[i-t+1]...P[i-2]P[i-1]P[i].
Find the longest left substring of position i that matches the prefix substring of the same length (largest t, with t <= i, so that the left substring is not simply the whole prefix P[0..i] itself); that t is the required feature number N[i].
Every character of P gets such a number, and together these numbers make up P's feature vector N.
So the key point is: how do we compute the feature vector of a pattern P?
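Before looking for a fast method, the definition can be turned into code directly. The following is only a slow, brute-force sketch for checking small examples; the function name featureVectorByDefinition and the use of std::vector are assumptions made here, not part of the original.

```cpp
#include <string>
#include <vector>

// Brute force straight from the definition: for every position i, try
// the lengths t = i, i-1, ..., 1 and keep the first (largest) t for which
// the prefix P[0 .. t-1] equals the left substring P[i-t+1 .. i].
std::vector<int> featureVectorByDefinition(const std::string &P)
{
    const int m = static_cast<int>(P.size());
    std::vector<int> N(m, 0);            // 0 means "no match at all"
    for (int i = 0; i < m; ++i) {
        for (int t = i; t > 0; --t) {    // t <= i excludes the trivial match
            if (P.compare(0, t, P, i - t + 1, t) == 0) {
                N[i] = t;
                break;
            }
        }
    }
    return N;
}
```

For the pattern "abacabab" this gives 0 0 1 0 1 2 3 2. Each position may compare many strings of up to i characters, so this is far too slow for long patterns, but it is handy for testing a faster version.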
The feature number of position i is computed recursively, assuming the feature number of position i-1 is already known.
Base case: when i == 0, N[0] = 0.
When i > 0 (assume N[i-1] == k is known):
If P[i] == P[k], then N[i] = k+1 (it cannot be larger than k+1, otherwise N[i-1] would be greater than k).
If P[i] != P[k] and k != 0, set k = N[k-1] and repeat this step until the condition no longer holds.
After the loop, if P[i] != P[k] (so k == 0), then N[i] = 0;
if P[i] == P[k], then N[i] = k+1.
Explanation: the loop is the hardest part to understand.
When the comparison at position i fails, we shorten both the longest prefix substring and the matching left substring of position i-1, then compare again. (Lengthening them is impossible: that would require N[i-1] > k.) The shortening follows a rule: the prefix substring is cut from the right and the left substring of position i-1 is cut from the left, so the outer ends of both stay fixed, and the two shortened strings must still be equal. Recall that the current longest prefix substring is itself equal to the left substring of position i-1.
So the problem becomes: treat the current longest prefix substring as a pattern of its own and look up the feature number of its rightmost character, N[k-1]. That determines the new current longest prefix substring, and so on. Note what k stands for: it is the index of the character just to the right of the current longest prefix substring. So we still compare P[i] with P[k], only with the updated value k = N[k-1].
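To watch the shortening happen, here is a small sketch that records every k compared against P[i] at a single position. The helper name fallbackCandidates and the sample pattern are assumptions made for illustration; the sketch also assumes the feature numbers of all earlier positions are already correct.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists, in order, every k whose character P[k] is
// compared with P[i], given that N[0 .. i-1] are already correct (i >= 1).
std::vector<int> fallbackCandidates(const std::string &P,
                                    const std::vector<int> &N, int i)
{
    std::vector<int> tried;
    int k = N[i - 1];                 // start from the longest match at i-1
    tried.push_back(k);
    while (k > 0 && P[i] != P[k]) {
        k = N[k - 1];                 // shorten to the next possible prefix
        tried.push_back(k);
    }
    return tried;
}
```

Take P = "abacabab" with the first seven feature numbers 0 0 1 0 1 2 3 and i = 7. The first candidate is k = 3, but P[7] = 'b' differs from P[3] = 'c'. The prefix "aba" is shortened to N[2] = 1, and P[7] = 'b' equals P[1] = 'b', so the loop stops with the candidates 3, 1 and N[7] = 1 + 1 = 2.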
The concrete algorithm is:
#include <cassert>
#include <string>

// Returns the feature vector N of the pattern P.
// The caller owns the returned array and must release it with delete[].
int *next(const std::string &P)
{
    int m = static_cast<int>(P.length());  // m is the length of the pattern P
    assert(m > 0);                         // abort if the pattern is empty
    int *N = new int[m];                   // dynamically allocate an array of m integers
    assert(N != 0);                        // sanity check (standard new throws on failure instead of returning null)
    N[0] = 0;
    for (int i = 1; i < m; i++)            // analyze each position i of P
    {
        // length of the longest matching prefix at position i-1
        int k = N[i - 1];
        // the while loop keeps falling back until it finds a suitable prefix position k
        while (k > 0 && P[i] != P[k])
            k = N[k - 1];
        // compare P[i] with the prefix character P[k] to decide N[i]
        if (P[i] == P[k])
            N[i] = k + 1;
        else
            N[i] = 0;
    }
    return N;
}
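As a rough sketch of how the vector is then used, here is the same recurrence returning a std::vector<int> (so there is no array to delete[]), followed by the usual KMP matching loop. The names featureVector and kmpFind, and the choice to return -1 when there is no match, are assumptions made here rather than part of the listing above.

```cpp
#include <string>
#include <vector>

// Same recurrence as the listing above, but returning std::vector<int>.
std::vector<int> featureVector(const std::string &P)
{
    std::vector<int> N(P.size(), 0);
    for (std::size_t i = 1; i < P.size(); ++i) {
        int k = N[i - 1];
        while (k > 0 && P[i] != P[k])
            k = N[k - 1];
        N[i] = (P[i] == P[k]) ? k + 1 : 0;
    }
    return N;
}

// Hypothetical helper: index of the first occurrence of P in T, or -1.
int kmpFind(const std::string &T, const std::string &P)
{
    if (P.empty())
        return 0;                      // assumption: empty pattern matches at 0
    const std::vector<int> N = featureVector(P);
    int k = 0;                         // pattern characters matched so far
    for (std::size_t j = 0; j < T.size(); ++j) {
        while (k > 0 && T[j] != P[k])
            k = N[k - 1];              // on mismatch, fall back using N
        if (T[j] == P[k])
            ++k;
        if (k == static_cast<int>(P.size()))
            return static_cast<int>(j) - k + 1;
    }
    return -1;
}
```

With T = "abacabacabab" and P = "abacabab", kmpFind returns 4. The text index j never moves backwards; after a mismatch only k is reduced through N, which is exactly what the feature vector was computed for.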
Feature Vectors for KMP algorithms