The naive pattern matching algorithm repeats many comparisons, so its time complexity is O(m*n), where m is the length of the main string and n is the length of the pattern string. Its advantage is that it is easy to understand. There is also a more efficient algorithm, called KMP, whose goal is to eliminate those redundant comparisons. It is harder to understand: it builds a next[] array that tells the search where in the pattern to resume after a mismatch, so the position in the main string never moves backward, and this gives linear time complexity O(m+n). A relatively good blog post on the subject is available online:
KMP string pattern matching explanation
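For comparison, the naive approach described above can be sketched roughly as follows. This is only an illustration: the name index_naive and its exact signature are assumptions, not taken from the original post. For every candidate start position it compares the pattern from its first character again, which is where the O(m*n) worst case comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): brute-force search.
 * Returns the index of the first occurrence of T in S at or after pos,
 * or -1 if there is none. */
int index_naive(const char S[], const char T[], int pos)
{
    assert(S != NULL && T != NULL && pos >= 0);
    size_t m = strlen(S), n = strlen(T);

    /* No room left for the pattern: report failure. */
    if ((size_t)pos > m || n > m - (size_t)pos) {
        return -1;
    }
    for (size_t i = (size_t)pos; i + n <= m; ++i) {
        size_t j = 0;
        /* Restart the comparison from T[0] at every start position i. */
        while (j < n && S[i + j] == T[j]) {
            ++j;
        }
        if (j == n) {
            return (int)i;
        }
    }
    return -1;
}
```

Calling index_naive("ababcabcacbab", "abcac", 0) should find the pattern at index 5. After each failed attempt the main string position effectively backs up to i + 1, and that backtracking is the repeated work KMP avoids.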
```c
#include <stdlib.h>
#include <string.h>

/* KMP pattern matching algorithm */
void get_next(const char T[], int next[])
{
    int j = 0, k = -1;

    next[0] = -1;
    while (T[j] != '\0') {
        if (k == -1 || T[j] == T[k]) {
            ++j; ++k;
            if (T[j] != T[k]) {
                next[j] = k;
            } else {
                next[j] = next[k];
            }
        } else {
            k = next[k];
        }
    }
}

/* pos must be a valid index into S */
int IndexKMP(const char S[], const char T[], int pos)
{
    /* index is the start of the current alignment (i - j), so it begins at pos */
    int index = pos, i = pos, j = 0, len = (int)strlen(T);
    /* get_next also writes next[len], so the array needs len + 1 entries */
    int *next = (int *)malloc((len + 1) * sizeof(int));

    if (next == NULL) {
        return -1;
    }
    get_next(T, next); // analyze the string T to fill next[]
    while (S[i] != '\0' && T[j] != '\0') {
        if (S[i] == T[j]) {
            ++i; ++j; // the two characters are equal, continue
        } else {
            index += j - next[j];
            // j falls back to the appropriate position, i stays unchanged
            if (next[j] != -1) {
                j = next[j];
            } else {
                j = 0;
                ++i;
            }
        }
    }
    free(next);
    if (T[j] == '\0') {
        return index; // match succeeds, return the start index in the main string
    } else {
        return -1;
    }
}
```
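To see what the next[] array actually holds, here is a small sketch that builds the table for a given pattern using the same rule as the next[] construction above. The helper name make_next and the choice to return a heap-allocated table are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): allocate and fill the
 * next[] table for pattern T. The table has strlen(T) + 1 entries because
 * the construction also writes the slot for the terminating '\0'.
 * Returns NULL if allocation fails; the caller frees the result. */
int *make_next(const char T[])
{
    assert(T != NULL);
    size_t len = strlen(T);
    int *next = malloc((len + 1) * sizeof *next);
    int j = 0, k = -1;

    if (next == NULL) {
        return NULL;
    }
    next[0] = -1;
    while (T[j] != '\0') {
        if (k == -1 || T[j] == T[k]) {
            ++j; ++k;
            /* If T[j] equals T[k], falling back to k would fail the same
             * way, so reuse the fallback already stored for k. */
            next[j] = (T[j] != T[k]) ? k : next[k];
        } else {
            k = next[k];
        }
    }
    return next;
}
```

For the pattern "abab" this produces the table -1, 0, -1, 0, 2. The entry next[2] is -1 because T[2] equals T[0]: a character of the main string that fails against T[2] would also fail against T[0], so the search moves on to the next character of the main string instead of retrying.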
To deepen your understanding of this algorithm, here are a few reference articles:
Blog Park: KMP algorithm for string matching
Wikipedia: Knuth-Morris-Pratt algorithm
CSDN: KMP algorithm in detail
Blog Park: KMP algorithm Learning and summary
KMP Pattern Matching algorithm