First, understanding the next array
1. By convention, next[0] = -1.
This is equivalent to assuming there is a wildcard "*" in front of the sub string that matches any character. In the code, it corresponds to the branch that handles t < 0.
2. next[j] can be understood in several equivalent ways:
1) next[j] is the length of the longest proper prefix of the string preceding sub[j] that is also a suffix of that string.
For example, with sub = "ABABAP",
next[5] = 3: the string preceding sub[5] is "ABABA", and "ABA" is both its prefix and its suffix.
2) When matching fails at sub[j], next[j] is the position that the sub-string pointer j falls back to.
3) next[j] is the index of the character right after the longest matching prefix (this is why computing the next array needs the step t = next[t]).
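The first interpretation can be checked directly with a brute-force sketch. It is not the linear-time construction used by KMP; it simply applies the definition, and the class and method names (NextByDefinition, nextByDefinition) are made up for illustration.

```java
import java.util.Arrays;

class NextByDefinition {
    // next[j] = length of the longest proper prefix of sub[0..j)
    // that is also a suffix of sub[0..j); next[0] = -1 by convention.
    static int[] nextByDefinition(String sub) {
        int[] next = new int[sub.length()];
        if (sub.isEmpty()) {
            return next;
        }
        next[0] = -1;
        for (int j = 1; j < sub.length(); j++) {
            String head = sub.substring(0, j); // the part before sub[j]
            int best = 0;
            // Try every proper prefix length k and keep the longest one
            // whose prefix equals the suffix of the same length.
            for (int k = 1; k < j; k++) {
                if (head.startsWith(head.substring(j - k))) {
                    best = k;
                }
            }
            next[j] = best;
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextByDefinition("ABABAP")));
        // prints [-1, 0, 0, 1, 2, 3]
    }
}
```

The last entry is the next[5] = 3 from the example: "ABA" is the longest prefix of "ABABA" that is also its suffix.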
It is not difficult to see that:
The larger next[j] is, the shorter the distance that the sub-string pointer j has to fall back when the match fails at base[i] and sub[j].
Conversely,
the smaller next[j] is, the longer the distance that j has to fall back when the match fails at base[i] and sub[j].
In the extreme case where next[j] is 0, the sub-string pointer j falls straight back to the start of the sub string.
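To make the relationship concrete, the fallback distance at position j can be written as j - next[j]. The sketch below computes it for the standard next array of "ABABAP"; the array literal was worked out by hand from the definition above, and the names FallbackDistance and fallbackDistances are hypothetical.

```java
import java.util.Arrays;

class FallbackDistance {
    // Distance the sub-string pointer falls back on a mismatch at j: j - next[j].
    static int[] fallbackDistances(int[] next) {
        int[] dist = new int[next.length];
        for (int j = 0; j < next.length; j++) {
            dist[j] = j - next[j];
        }
        return dist;
    }

    public static void main(String[] args) {
        int[] next = {-1, 0, 0, 1, 2, 3}; // standard next array of "ABABAP"
        System.out.println(Arrays.toString(fallbackDistances(next)));
        // prints [1, 1, 2, 2, 2, 2]
    }
}
```

For a fixed j, a larger next[j] gives a smaller j - next[j], which is the observation stated above.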
Second, understanding the main KMP algorithm
1. The base-string pointer i never moves backward during matching, which is why KMP is more efficient than brute-force matching.
2. When base[i] and sub[j] fail to match, the sub-string pointer j falls back to a smaller value, which is equivalent to sliding the sub string to the right.
j falls back to next[j].
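The two points above can be observed by recording every (i, j) pair the matcher visits. This is a minimal sketch using the standard next array; the class KmpTrace, its method names, and the test string "ABABABAP" are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class KmpTrace {
    // Standard next array with next[0] = -1.
    static int[] buildNext(String sub) {
        int[] next = new int[sub.length()];
        if (sub.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int t = -1, j = 0;
        while (j < sub.length() - 1) {
            if (t < 0 || sub.charAt(t) == sub.charAt(j)) {
                t++;
                j++;
                next[j] = t;
            } else {
                t = next[t];
            }
        }
        return next;
    }

    // Returns the match index (or -1) and records each (i, j) pair visited.
    static int match(String base, String sub, List<int[]> steps) {
        int[] next = buildNext(sub);
        int i = 0, j = 0;
        while (i < base.length() && j < sub.length()) {
            steps.add(new int[] {i, j});
            if (j < 0 || base.charAt(i) == sub.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j]; // only j moves back; i stays where it is
            }
        }
        return j == sub.length() ? i - j : -1;
    }

    public static void main(String[] args) {
        List<int[]> steps = new ArrayList<>();
        int pos = match("ABABABAP", "ABABAP", steps);
        for (int[] s : steps) {
            System.out.println("i=" + s[0] + " j=" + s[1]);
        }
        System.out.println("match at " + pos); // prints "match at 2"
    }
}
```

In the printed trace, i stays at 5 while j drops from 5 to 3 after the mismatch between 'B' and 'P', and i never decreases.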
Third, understanding the improved next array
The improved algorithm optimizes the values stored in the next array:
if (sub.charAt(t) != sub.charAt(j)) { next[j] = t; } else { next[j] = next[t]; }
Consider the following base (main) string and sub string:
String base = "AAAABCDE";
String sub = "AAAAAX";
The improved next array is [-1, -1, -1, -1, -1, 4].
When base[4] = 'B' != sub[4] = 'A', j = next[j] = -1 jumps straight to the sentinel "*" position in front of the sub string; the j < 0 branch then runs i++ and j++, skipping the layer-by-layer backtracking.
In principle, this moves the repeated jump j = next[j] of the sub-string pointer out of the main KMP algorithm and into the next array itself.
In the main algorithm, when base[i] != sub[j], j first falls back to next[j]. If sub[next[j]] == sub[j], it follows that sub[next[j]] == sub[j] != base[i], so this fallback accomplishes nothing and j has to fall back again, and so on. For this reason the next array is optimized up front to avoid such layer-by-layer backtracking, which reduces the number of while-loop iterations in the main algorithm.
The improved next array keeps the sub-string pointer j from falling back layer by layer and ensures that every fallback lands on a character different from the one that just failed.
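The saving can be measured by counting how often j = next[j] runs with each version of the array on the example strings above. The following is a sketch under that setup; the class NextComparison, the boolean flag, and the counter are illustrative additions, not part of the original listing.

```java
import java.util.Arrays;

class NextComparison {
    // Builds the standard next array, or the improved one when improved == true.
    static int[] buildNext(String sub, boolean improved) {
        int[] next = new int[sub.length()];
        if (sub.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int t = -1, j = 0;
        while (j < sub.length() - 1) {
            if (t < 0 || sub.charAt(t) == sub.charAt(j)) {
                t++;
                j++;
                // Improved rule: if the fallback position holds the same
                // character, reuse its fallback instead.
                next[j] = (improved && sub.charAt(t) == sub.charAt(j)) ? next[t] : t;
            } else {
                t = next[t];
            }
        }
        return next;
    }

    // Counts how many times j = next[j] executes during one match attempt.
    static int countBacktracks(String base, String sub, int[] next) {
        int i = 0, j = 0, backtracks = 0;
        while (i < base.length() && j < sub.length()) {
            if (j < 0 || base.charAt(i) == sub.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j];
                backtracks++;
            }
        }
        return backtracks;
    }

    public static void main(String[] args) {
        String base = "AAAABCDE";
        String sub = "AAAAAX";
        int[] standard = buildNext(sub, false);
        int[] improved = buildNext(sub, true);
        System.out.println(Arrays.toString(standard)); // prints [-1, 0, 1, 2, 3, 4]
        System.out.println(Arrays.toString(improved)); // prints [-1, -1, -1, -1, -1, 4]
        System.out.println(countBacktracks(base, sub, standard)); // prints 8
        System.out.println(countBacktracks(base, sub, improved)); // prints 4
    }
}
```

With the standard array, the mismatch at base[4] costs five fallbacks (4 to 3 to 2 to 1 to 0 to -1); with the improved array it costs one.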
Fourth, the Java implementation is as follows
package agstring;

public class KMP {
    // Standard next array: next[j] is the length of the longest proper prefix
    // of sub[0..j) that is also its suffix; next[0] = -1.
    public static int[] getNextAry(String sub) {
        int subLength = sub.length();
        int[] next = new int[subLength];
        if (subLength == 0) {
            return next; // nothing to compute for an empty sub string
        }
        int t = next[0] = -1, j = 0;
        while (j < subLength - 1) {
            if (t < 0 || sub.charAt(t) == sub.charAt(j)) {
                t++;
                j++;
                next[j] = t; // can be optimized
            } else {
                t = next[t];
            }
        }
        return next;
    }

    // Improved next array: skips fallback positions that hold the same
    // character as sub[j], since they would fail in the same way.
    public static int[] getNextAryExt(String sub) {
        int subLength = sub.length();
        int[] next = new int[subLength];
        if (subLength == 0) {
            return next; // nothing to compute for an empty sub string
        }
        int t = next[0] = -1, j = 0;
        while (j < subLength - 1) {
            if (t < 0 || sub.charAt(t) == sub.charAt(j)) {
                t++;
                j++;
                next[j] = sub.charAt(t) != sub.charAt(j) ? t : next[t];
            } else {
                t = next[t];
            }
        }
        return next;
    }

    /*
     * i is the base-string position pointer, j is the sub-string position pointer.
     * j < 0 means the sub-string pointer was at 0 and sub[0] != base[i].
     * If the match succeeds, j == subLength must hold.
     */
    public static int matchOfKMP(String base, String sub) {
        int baseLength = base.length();
        int subLength = sub.length();
        int i = 0, j = 0;
        int[] next = getNextAryExt(sub);
        while (i < baseLength && j < subLength) {
            if (j < 0 || base.charAt(i) == sub.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j];
            }
        }
        // Index of the first match in base, or -1 if there is none.
        int result = j == subLength ? i - j : -1;
        return result;
    }

    public static void main(String[] args) {
        try {
            String base = "Ababghababa";
            String sub = "ABABAP"; // another pattern to try: ababaaaba
            int result = matchOfKMP(base, sub);
            System.out.println(result);
        } catch (Exception e) {
            // TODO: handle exception
            e.printStackTrace();
        }
    }
}
KMP algorithm Practice and simple analysis