I. What is the KMP algorithm?
The Knuth-Morris-Pratt string search algorithm (often referred to as the "KMP algorithm") searches a "main text string" S for occurrences of a "word" W, using what is already known about W to avoid re-checking characters that have already been matched. (In other words, it matches a pattern string against an original string.)
II. KMP demo
http://staff.ustc.edu.cn/~ypb/jpk/flash/find_kmp.swf
III. KMP principles
KMP is the most commonly used improvement over naive string matching. When a mismatch occurs, it can shift the pattern forward by several characters at once instead of just one, which speeds up the matching.
The KMP algorithm relies on an array that some call the prefix array and others call the next array. Each pattern string has one fixed next array. Each entry describes the degree of symmetry of a substring of the pattern, that is, the length of the longest proper prefix of that substring that is also its suffix. The higher the symmetry, the larger the value, and the more of the previous partial match can be reused after a mismatch.
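For contrast, here is a minimal sketch of the naive matching that KMP improves on. The name `naive_count` is a hypothetical helper, not something from the original post. After every mismatch it slides the pattern by exactly one position and starts comparing from the first pattern character again, which is the re-checking KMP avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count occurrences of pattern in text by brute force,
// overlapping occurrences included. After each attempt the pattern moves
// forward by one position only, so text characters may be compared repeatedly.
int naive_count(const std::string &text, const std::string &pattern)
{
    assert(!pattern.empty());
    int count = 0;
    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start)
    {
        std::size_t j = 0;
        while (j < pattern.size() && text[start + j] == pattern[j])
            ++j;
        if (j == pattern.size())
            ++count;
    }
    return count;
}
```

For a text of length n and a pattern of length m this can take on the order of n * m comparisons in the worst case, for example when both consist mostly of the same repeated character.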
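To make the "degree of symmetry" concrete, the sketch below computes it directly from its definition, by brute force. The names `symmetry` and `symmetry_table` are assumptions chosen for illustration, and this is deliberately the slow, obvious way to get the values, not how KMP itself builds the array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest proper prefix of s that is also a suffix of s.
// Tries candidate lengths from longest to shortest.
int symmetry(const std::string &s)
{
    for (std::size_t len = s.empty() ? 0 : s.size() - 1; len > 0; --len)
        if (s.compare(0, len, s, s.size() - len, len) == 0)
            return static_cast<int>(len);
    return 0;
}

// Symmetry of every prefix pattern[0..i], one entry per position i.
std::vector<int> symmetry_table(const std::string &pattern)
{
    std::vector<int> table;
    for (std::size_t i = 0; i < pattern.size(); ++i)
        table.push_back(symmetry(pattern.substr(0, i + 1)));
    assert(table.size() == pattern.size());
    return table;
}
```

For the pattern "abcabd" this gives 0, 0, 0, 1, 2, 0: the prefix "abca" starts and ends with "a", "abcab" starts and ends with "ab", and the final "d" breaks the symmetry.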
For an understanding of the next array, see http://blog.csdn.net/yearn520/article/details/6729426#t0
```cpp
#include <cstring>

void setprefix(const char *pattern, int prefix[])
{
    int i;
    int len = (int)strlen(pattern); // the length of the pattern string
    prefix[0] = 0;
    for (i = 1; i < len; i++)
    {
        int k = prefix[i - 1];
        // Recursively look for a shorter symmetry. k == 0 means no shorter
        // symmetry is left. pattern[i] != pattern[k] means the prefix is
        // symmetric, but the character after it differs from the current
        // character, so the fallback continues.
        while (pattern[i] != pattern[k] && k != 0) // e.g. when i is 14, this settles prefix[14]
            k = prefix[k - 1]; // fall back to the next shorter symmetry
        if (pattern[i] == pattern[k]) // a shorter symmetry was found, or the previous one is extended; either way add 1
            prefix[i] = k + 1;
        else
            prefix[i] = 0; // every shorter symmetry was tried; the new character adds none
    }
    prefix[0] = -1; // sentinel: position 0 has nothing to fall back to
}
```
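The sketch below shows one way an array like this can drive the matching itself. It is a `std::vector` variant under my own assumptions: `prefix_function` and `find_first` are hypothetical names, the array is kept without the -1 sentinel, and the fallback is written as `k = pi[k - 1]`, which is only taken while `k` is non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Prefix array without the -1 sentinel: pi[i] is the symmetry of pattern[0..i].
std::vector<int> prefix_function(const std::string &pattern)
{
    std::vector<int> pi(pattern.size(), 0);
    for (std::size_t i = 1; i < pattern.size(); ++i)
    {
        int k = pi[i - 1];
        while (k != 0 && pattern[i] != pattern[k])
            k = pi[k - 1]; // fall back to the next shorter symmetry
        if (pattern[i] == pattern[k])
            ++k;
        pi[i] = k;
    }
    return pi;
}

// Index of the first occurrence of pattern in text, or -1 if there is none.
int find_first(const std::string &text, const std::string &pattern)
{
    assert(!pattern.empty());
    const std::vector<int> pi = prefix_function(pattern);
    int k = 0; // number of pattern characters matched so far
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        // On a mismatch, reuse the longest symmetric part of what matched;
        // the text index i never moves backwards.
        while (k != 0 && text[i] != pattern[k])
            k = pi[k - 1];
        if (text[i] == pattern[k])
            ++k;
        if (k == static_cast<int>(pattern.size()))
            return static_cast<int>(i) - k + 1;
    }
    return -1;
}
```

For example, `find_first("abababc", "ababc")` reaches a mismatch after matching "abab", falls back to the symmetric part "ab" rather than restarting, and reports the match at index 2.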
IV. Test example
```cpp
#include <iostream>
#include <cstring>
#include <string>
using namespace std;

string sorg;       // original (text) string
string spat;       // pattern string
int prefix[10000]; // next array of the pattern (pattern length below 10000)
int result[20];    // match count per test case (at most 20 cases)

void init()
{
    sorg = "";
    spat = "";
    memset(prefix, 0, sizeof(int) * 10000);
    // memset(result, 0, sizeof(int) * 20);
}

// Fill next[] for the pattern temp. On return, next[j] is the position in the
// pattern to continue from after a mismatch at position j, and next[0] is -1.
void setprefix(const string &temp, int next[])
{
    int len = (int)temp.size();
    next[0] = 0;
    // First compute the symmetry (longest prefix that is also a suffix)
    // of every prefix temp[0..i].
    for (int i = 1; i < len; i++)
    {
        int k = next[i - 1];
        while (temp[k] != temp[i] && k != 0)
            k = next[k - 1];
        if (temp[k] == temp[i])
            next[i] = k + 1;
        else
            next[i] = 0;
    }
    // Shift the values right by one: after a mismatch at position j, only
    // temp[0..j-1] has matched, so the fallback is the symmetry of that prefix.
    for (int i = len - 1; i >= 1; i--)
        next[i] = next[i - 1];
    next[0] = -1; // sentinel: a mismatch at position 0 advances the text instead
}

// Count the occurrences of s2 in s1 (overlapping occurrences included).
int kmp(const string &s1, const string &s2)
{
    int number = 0;
    int i = 0;
    int j = 0;
    while (i < (int)s1.size() && j < (int)s2.size())
    {
        if (j == -1 || s1[i] == s2[j])
        {
            i++;
            j++;
        }
        else
            j = prefix[j];
        if (j == (int)s2.size())
        {
            i = i - j + 1; // restart one character after the start of this match
            j = 0;
            number++;
        }
    }
    return number;
}

int main()
{
    int t;
    cin >> t;
    memset(result, 0, sizeof(int) * 20);
    for (int i = 0; i < t; i++)
    {
        init();
        // cout << "input spat string:" << endl;
        cin >> spat;
        // cout << "input sorg string:" << endl;
        cin >> sorg;
        setprefix(spat, prefix);
        result[i] = kmp(sorg, spat);
    }
    for (int j = 0; j < t; j++)
        cout << result[j] << endl;
    return 0;
}
```
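When trying the program on your own inputs, it helps to have an independent count to compare against. The sketch below is one such reference, built on `std::string::find`. The name `count_with_find` is hypothetical, and it assumes overlapping occurrences should be counted, which is how I read the test program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference counter: number of occurrences of pattern in text,
// overlapping occurrences included, using the standard library search.
int count_with_find(const std::string &text, const std::string &pattern)
{
    assert(!pattern.empty());
    int count = 0;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos)
    {
        ++count;
        pos = text.find(pattern, pos + 1); // step by one so overlaps are counted
    }
    return count;
}
```

Patterns with repeated leading characters, such as "aab" searched in "aaab", are good test cases, because that is where a wrong next array tends to skip a real match.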