Efficient string matching for interviews (KMP and AC algorithms)

Source: Internet
Author: User
Problem statement: given a text T of length n and a pattern P of length m, find every offset s (0 <= s <= n-m) at which P occurs in T, that is, every s such that P is a suffix of T[1..s+m]. The work is done in two steps: preprocessing, then matching.
Algorithm              Preprocessing time   Matching time
Naive algorithm        0                    O((n-m+1)m)
RK algorithm           O(m)                 O((n-m+1)m)
Finite state machine   O(m|∑|)              O(n)
KMP                    O(m)                 O(n)
1. Naive string matching

    for s = 0 to n-m
        if P[1..m] == T[s+1..s+m]
            print "found at offset s"
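A minimal C++ sketch of the naive scan above, assuming 0-based offsets and std::string inputs. The name naive_match is hypothetical and not from the original, and treating an empty pattern as "no matches" is an assumption made here.

```cpp
#include <string>
#include <vector>

// Naive matching: try every offset s in 0..n-m and compare P with T[s..s+m-1].
// Offsets are 0-based here, unlike the 1-based pseudocode.
std::vector<int> naive_match(const std::string& t, const std::string& p) {
    std::vector<int> offsets;
    const int n = static_cast<int>(t.size());
    const int m = static_cast<int>(p.size());
    if (m == 0 || m > n) return offsets;  // assumption: empty pattern yields no matches
    for (int s = 0; s <= n - m; ++s) {
        int k = 0;
        while (k < m && p[k] == t[s + k]) ++k;  // at most m comparisons per offset
        if (k == m) offsets.push_back(s);       // all m characters matched
    }
    return offsets;
}
```

For example, naive_match("abcabcab", "abc") returns the offsets 0 and 3. The nested loops are where the O((n-m+1)m) matching time comes from.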

2. KMP. The difficult part is computing the next array, which records how prefixes of the pattern match inside the pattern itself. References:
http://blog.csdn.net/yearn520/article/details/6729426
http://www.cnblogs.com/c-cloud/p/3224788.html
http://blog.csdn.net/youxin2012/article/details/17083261
Characteristics:

When a mismatch occurs, KMP shifts the pattern forward by several characters at once and never moves backward in the text, so it avoids unnecessary backtracking and speeds up matching.

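1. The next array

The next array describes the self-similarity of the pattern: next[i] is the length of the longest proper prefix of the pattern's first i+1 characters that is also a suffix of them (the "maximum common prefix and suffix").

        a b c d a b d
next:   0 0 0 0 1 2 0

For "a" the value is 0. For "abcda" the longest common prefix and suffix is "a", of length 1; for "abcdab" it is "ab", of length 2.

Optimized computation:

    void get_next(const char str[], int n, int next[]) {
        int i, k = 0;  /* k: length of the longest common prefix/suffix so far */
        next[0] = 0;
        for (i = 1; i < n; i++) {
            /* on a mismatch, fall back to the next shorter candidate */
            while (k > 0 && str[i] != str[k])
                k = next[k - 1];
            if (str[i] == str[k])
                k++;
            next[i] = k;
        }
    }

Explanation: next[4] is the value for the first five characters "abcda", which is 1. If str[next[4]] equals str[5], then next[5] = next[4] + 1. Here str[next[4]] = str[1] = 'b' and str[5] = 'b', so next[5] = 2. When the two characters differ, the candidate length falls back to next[k-1] and the comparison is retried; setting next[i] to 0 immediately would give wrong values for patterns such as "aabaaab".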
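To check a next table by hand, it can help to compute it straight from the definition. The following is a deliberately slow brute-force sketch, not the optimized routine; the name next_by_definition is hypothetical and chosen here for illustration.

```cpp
#include <string>
#include <vector>

// Brute-force next table taken directly from the definition:
// next[i] = length of the longest proper prefix of p[0..i] that is also its suffix.
std::vector<int> next_by_definition(const std::string& p) {
    const int m = static_cast<int>(p.size());
    std::vector<int> next(m, 0);
    for (int i = 0; i < m; ++i) {
        for (int k = i; k >= 1; --k) {  // candidate lengths, longest first
            // compare prefix p[0..k-1] with suffix p[i-k+1..i]
            if (p.compare(0, k, p, i - k + 1, k) == 0) {
                next[i] = k;
                break;
            }
        }
    }
    return next;
}
```

For "abcdabd" this yields 0 0 0 0 1 2 0, the same values as the table above. For "aabaaab" it yields 0 1 0 1 2 2 3, a useful test case because a linear-time routine gets it right only if it falls back through shorter candidates after a mismatch.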
On a mismatch, move the pattern forward by (number of characters already matched) minus (the next value of the last matched character). This reuses what is already known: the search position in the text never moves back to characters that have already been compared, it only moves forward, and that is where the efficiency comes from.

For example, when a space in the text fails to match the final D of the pattern, the first six characters "ABCDAB" have already matched. The table shows that the last matched character, B, has a partial match value of 2, so the pattern moves forward by 6 - 2 = 4 positions. In the next comparison the two characters "AB" that are already known to match are not compared again.
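The shift arithmetic in this example can be written as a one-line helper. This is only an illustration of the rule; kmp_shift is a hypothetical name, and the table passed in is assumed to be the next array of the pattern.

```cpp
#include <vector>

// Shift rule: shift = (characters matched) - (next value of the last matched character).
// `matched` must be at least 1; with 0 matched characters the pattern simply moves by one.
int kmp_shift(const std::vector<int>& next, int matched) {
    return matched - next[matched - 1];
}
```

With the table 0 0 0 0 1 2 0 and six matched characters, kmp_shift returns 6 - 2 = 4.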


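KMP matching (src_len: length of the source string; dst_len: length of the pattern to be matched):

    while ((i < src_len) && (j < dst_len)) {
        if (src_string[i] == dst_string[j]) {
            i++;
            j++;
        } else {
            if (j == 0) {
                i++;                        /* nothing matched: advance in the source string */
            } else {
                m = j - dst_next[j - 1];    /* number of positions to shift the pattern */
                j = j - m;                  /* where to resume in the pattern */
            }
        }
    }

When the loop ends with j == dst_len, the pattern occurs in the source string at offset i - dst_len.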
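The loop above stops at the first match. As a sketch of how the same idea extends to reporting every occurrence, the following self-contained C++ version builds the next table and then scans the text once. The names build_next and kmp_search are hypothetical, and returning no matches for an empty pattern is an assumption.

```cpp
#include <string>
#include <vector>

// next[i]: length of the longest proper prefix of p[0..i] that is also its suffix.
std::vector<int> build_next(const std::string& p) {
    const int m = static_cast<int>(p.size());
    std::vector<int> next(m, 0);
    int k = 0;
    for (int i = 1; i < m; ++i) {
        while (k > 0 && p[i] != p[k]) k = next[k - 1];  // fall back on mismatch
        if (p[i] == p[k]) ++k;
        next[i] = k;
    }
    return next;
}

// Returns the 0-based offsets of all (possibly overlapping) occurrences of pat in text.
std::vector<int> kmp_search(const std::string& text, const std::string& pat) {
    std::vector<int> offsets;
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pat.size());
    if (m == 0 || m > n) return offsets;  // assumption: empty pattern yields no matches
    const std::vector<int> next = build_next(pat);
    int j = 0;  // number of pattern characters currently matched
    for (int i = 0; i < n; ++i) {  // i never moves backward
        while (j > 0 && text[i] != pat[j]) j = next[j - 1];
        if (text[i] == pat[j]) ++j;
        if (j == m) {
            offsets.push_back(i + 1 - m);
            j = next[j - 1];  // keep going to find later occurrences
        }
    }
    return offsets;
}
```

For example, kmp_search("BBC ABCDAB ABCDABCDABDE", "ABCDABD") returns the single offset 15. Setting j = next[j - 1] after a hit is the same shift rule applied to a full match, which is what allows overlapping occurrences to be found.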
3. AC multi-pattern matching algorithm

Consider the following example: given the 5 words say, she, shr, he and her, and the string yasherhs, how many of the words occur in the string?
Three steps: build the trie, add failure pointers to the trie to turn it into an AC automaton, then search the text with the automaton.
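The three steps can be sketched end to end as follows. This is a compact illustration under stated assumptions, not the article's own implementation: it stores nodes in a std::vector instead of using raw pointers, it assumes all input consists of the lowercase letters 'a' to 'z', and the names AhoCorasick, insert, build and count_matches are hypothetical.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

struct AhoCorasick {
    struct Node {
        std::array<int, 26> next;  // child index per letter, -1 if absent
        int fail = 0;              // failure link (index of another node)
        int count = 0;             // number of keywords ending at this node
        Node() { next.fill(-1); }
    };
    std::vector<Node> nodes;       // nodes[0] is the root
    AhoCorasick() : nodes(1) {}

    // Step 1: insert a keyword into the trie (assumes lowercase 'a'..'z').
    void insert(const std::string& word) {
        int p = 0;
        for (char ch : word) {
            const int c = ch - 'a';
            if (nodes[p].next[c] == -1) {
                nodes[p].next[c] = static_cast<int>(nodes.size());
                nodes.emplace_back();
            }
            p = nodes[p].next[c];
        }
        ++nodes[p].count;
    }

    // Step 2: breadth-first traversal that sets the failure links.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (nodes[0].next[c] != -1) q.push(nodes[0].next[c]);  // depth 1: fail = root
        while (!q.empty()) {
            const int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                const int v = nodes[u].next[c];
                if (v == -1) continue;
                int f = nodes[u].fail;
                while (f != 0 && nodes[f].next[c] == -1) f = nodes[f].fail;
                const int t = nodes[f].next[c];
                nodes[v].fail = (t != -1) ? t : 0;
                q.push(v);
            }
        }
    }

    // Step 3: scan the text; each inserted keyword is counted at most once.
    int count_matches(const std::string& text) const {
        std::vector<int> remaining(nodes.size());
        for (std::size_t i = 0; i < nodes.size(); ++i) remaining[i] = nodes[i].count;
        int found = 0, p = 0;
        for (char ch : text) {
            const int c = ch - 'a';
            while (p != 0 && nodes[p].next[c] == -1) p = nodes[p].fail;
            if (nodes[p].next[c] != -1) p = nodes[p].next[c];
            // walk the failure chain, stopping at nodes that were already counted
            for (int t = p; t != 0 && remaining[t] != -1; t = nodes[t].fail) {
                found += remaining[t];
                remaining[t] = -1;
            }
        }
        return found;
    }
};
```

Inserting say, she, shr, he and her, calling build(), and then calling count_matches("yasherhs") gives 3: the text contains she, he and her, but neither say nor shr. The failure link from the end of "she" to the end of "he" is what lets "he" be reported without rescanning the text.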
1. Building the trie

    #include <cstring>

    const int kind = 26;
    struct node {
        node *fail;          // failure pointer
        node *next[kind];    // trie children of this node (one per letter, 26 at most)
        int count;           // number of words that end at this node (0 if none)
        node() {             // constructor: initialize the fields
            fail = NULL;
            count = 0;
            memset(next, 0, sizeof(next));
        }
    } *q[500001];            // queue, used by the BFS that builds the failure pointers
    char keyword[51];        // the word being inserted
    char str[1000001];       // the text to be searched
    int head, tail;          // head and tail indices of the queue

    void Buildingtree(char *str, node *root) {
        node *p = root;
        int i = 0, index;
        while (str[i]) {
            index = str[i] - 'a';
            if (p->next[index] == NULL) p->next[index] = new node();
            p = p->next[index];
            i++;
        }
        p->count++;          // one more word ends at this node
    }
