String Pattern Matching Algorithm

Source: Internet
Author: User

Definition: Given a main string S and a substring T, substring location means finding a substring of S that is equal to T. The main string S is usually called the target string and the substring T is called the pattern string, so substring location is also called pattern matching.

Two algorithms are commonly used:

1. Brute-Force Algorithm

Idea: Record the position of the main-string pointer at the start of each attempt, then compare the main string with the pattern string character by character from that position. If every character is equal, return the recorded position; otherwise move the starting position forward by one and try again.

Pattern Matching Process

Figure omitted

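Disadvantages of the above algorithm: The main-string pointer backtracks. After a mismatch it returns to the start of the current attempt, and the starting position then moves forward by only one character, so characters that have already been compared are compared again.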
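
The brute-force idea and its backtracking step can be sketched as follows. This is only an illustration, not the original author's code; the class name BruteForceMatch and the method name indexOf are assumptions made for the example.

```java
final class BruteForceMatch {
    // Returns the index of the first occurrence of t in s, or -1 if there is none.
    static int indexOf(String s, String t) {
        int i = 0; // pointer into the main string
        int j = 0; // pointer into the pattern string
        while (i < s.length() && j < t.length()) {
            if (s.charAt(i) == t.charAt(j)) {
                i++;
                j++;
            } else {
                // Backtrack: return to one position after the start of this attempt.
                i = i - j + 1;
                j = 0;
            }
        }
        return j == t.length() ? i - t.length() : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("abcabcabd", "abcabd")); // prints 3
        System.out.println(indexOf("aaabaaaab", "aaaab"));  // prints 4
    }
}
```

The line i = i - j + 1 is the backtracking that the KMP algorithm is designed to remove.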

2. KMP Algorithm

Idea: Analyze the pattern string in advance to avoid unnecessary backtracking, and store the result in a next array.

Process: For the pattern string T = "abcabd", the next array stores the "partial match" information.

The first character 'a': next[0] = -1;

The second character 'b': next[1] = 0;

The third character 'c': the preceding string "b" is not equal to the start of the pattern T, so next[2] = 0;

The fourth character 'a': the preceding strings "bc" and "c" do not match the start of the pattern T, so next[3] = 0;

The fifth character 'b': of the preceding strings "bca", "ca", and "a", only "a" matches the start of the pattern T, so next[4] = 1;

The sixth character 'd': of the preceding strings "bcab", "cab", "ab", and "b", the longest one that matches the start of the pattern T is "ab", which has two characters, so next[5] = 2;

...
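
The hand calculation above can be checked with a short sketch that builds the plain next array (without the later improvement). The names NextArraySketch and buildNext are hypothetical and chosen only for this example.

```java
import java.util.Arrays;

final class NextArraySketch {
    // next[0] = -1; for j > 0, next[j] is the length of the longest string that is
    // both a proper prefix and a proper suffix of the first j characters of t.
    static int[] buildNext(String t) {
        int[] next = new int[t.length()];
        if (t.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int j = 0;  // position whose successor is being filled in
        int k = -1; // length of the current matching prefix, or -1
        while (j < t.length() - 1) {
            if (k == -1 || t.charAt(j) == t.charAt(k)) {
                j++;
                k++;
                next[j] = k;
            } else {
                k = next[k]; // fall back to a shorter matching prefix
            }
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buildNext("abcabd"))); // prints [-1, 0, 0, 0, 1, 2]
    }
}
```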

Finally, during matching, after each mismatch the main-string pointer stays where it is and matching continues from the mismatched character. Only the pattern-string pointer moves: it jumps to the position given by its next value, so the main string never backtracks.
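
A minimal sketch of this matching loop, using the plain next array, might look like the following. The class and method names (KmpSearchSketch, buildNext, indexOf) are assumptions for illustration; the final program in this article uses an improved array instead.

```java
final class KmpSearchSketch {
    // Plain next array: next[0] = -1, otherwise the longest prefix/suffix overlap.
    static int[] buildNext(String t) {
        int[] next = new int[t.length()];
        if (t.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int j = 0, k = -1;
        while (j < t.length() - 1) {
            if (k == -1 || t.charAt(j) == t.charAt(k)) {
                next[++j] = ++k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    // Returns the index of the first occurrence of t in s, or -1 if there is none.
    static int indexOf(String s, String t) {
        int[] next = buildNext(t);
        int i = 0; // main-string pointer: only ever moves forward
        int j = 0; // pattern pointer: jumps back through next on a mismatch
        while (i < s.length() && j < t.length()) {
            if (j == -1 || s.charAt(i) == t.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j];
            }
        }
        return j == t.length() ? i - t.length() : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("abcabcabd", "abcabd")); // prints 3
    }
}
```

In this example the mismatch at the sixth character moves the pattern pointer from 5 to next[5] = 2 while i stays at 5, so the already matched "ab" is not compared again.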

Matching Process

Figure omitted

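Remaining inefficiency: With S = "aaabaaaab" and T = "aaaab", the first mismatch occurs at the fourth character of the main string. The main-string pointer stays on that character while the pattern pointer follows the next array from 3 -> 2 -> 1 -> 0, and each of those three extra comparisons is bound to fail, because the same 'b' is compared with another 'a' every time.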
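
The repeated comparisons can be made visible with a small trace. The sketch below records every pattern position that is compared against one fixed main-string index; the names KmpTraceSketch and patternPositionsAt are hypothetical helpers introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class KmpTraceSketch {
    // Plain next array, as in the worked example for "abcabd".
    static int[] buildNext(String t) {
        int[] next = new int[t.length()];
        if (t.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int j = 0, k = -1;
        while (j < t.length() - 1) {
            if (k == -1 || t.charAt(j) == t.charAt(k)) {
                next[++j] = ++k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    // Runs the KMP loop and collects each pattern index j that is compared
    // with s.charAt(stuckAt).
    static List<Integer> patternPositionsAt(String s, String t, int stuckAt) {
        int[] next = buildNext(t);
        List<Integer> seen = new ArrayList<>();
        int i = 0, j = 0;
        while (i < s.length() && j < t.length()) {
            if (j != -1 && i == stuckAt) {
                seen.add(j);
            }
            if (j == -1 || s.charAt(i) == t.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j];
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Index 3 is the fourth character of the main string.
        System.out.println(patternPositionsAt("aaabaaaab", "aaaab", 3)); // prints [3, 2, 1, 0]
    }
}
```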

Process

Figure omitted

Improvement Method: In the same example, S = "aaabaaaab" and T = "aaaab", the 1st, 2nd, and 3rd characters of the pattern are equal to its 4th character, so once the 4th character has mismatched there is no need to compare them with the fourth character of the main string. Instead, the pattern slides to the right past that character in one step, and matching resumes with i = 4, j = 0.
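
As a rough illustration of the improvement, the sketch below builds the improved array (called nextval in the final code) for the same pattern. The class name NextValSketch and the method name buildNextVal are assumptions for this example.

```java
import java.util.Arrays;

final class NextValSketch {
    // Like the plain next array, except that when the character at the jump
    // target equals the character that just mismatched, the jump is
    // shortened to that target's own nextval.
    static int[] buildNextVal(String t) {
        int[] nextval = new int[t.length()];
        if (t.isEmpty()) {
            return nextval;
        }
        nextval[0] = -1;
        int j = 0, k = -1;
        while (j < t.length() - 1) {
            if (k == -1 || t.charAt(j) == t.charAt(k)) {
                j++;
                k++;
                nextval[j] = (t.charAt(j) != t.charAt(k)) ? k : nextval[k];
            } else {
                k = nextval[k];
            }
        }
        return nextval;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buildNextVal("aaaab"))); // prints [-1, -1, -1, -1, 3]
    }
}
```

With nextval[3] = -1, a mismatch at j = 3 sends the pattern pointer straight to -1, and the next step advances to i = 4, j = 0 without the three extra comparisons.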

Process

The final algorithm is as follows:

public class KMP {
    // Maximum supported pattern length; longer patterns overflow nextval.
    private final static int MAXSIZE = 100;
    private static int[] nextval = new int[MAXSIZE]; // improved next array

    /* Builds the nextval array of the pattern string s. */
    private static void getNext(String s) {
        int len = s.length();
        char[] ch = s.toCharArray();
        int j = 0, k = -1;
        nextval[0] = -1;
        while (j < len - 1) {
            if (k == -1 || ch[j] == ch[k]) {
                j++;
                k++;
                if (ch[j] != ch[k])
                    nextval[j] = k;
                else
                    nextval[j] = nextval[k]; // same character would fail again, so jump further
            } else {
                k = nextval[k];
            }
        }
    }

    /* Returns the position of the first match of t in s, or -1 if there is none. */
    private static int kmpIndex(String s, String t) {
        int i = 0, j = 0;
        int sl = s.length();
        char[] sc = s.toCharArray();
        int tl = t.length();
        char[] tc = t.toCharArray();
        while (i < sl && j < tl) {
            if (j == -1 || sc[i] == tc[j]) {
                i++;
                j++;
            } else {
                j = nextval[j]; // only the pattern pointer moves back
            }
        }
        if (j >= tl)
            return i - tl;
        else
            return -1;
    }

    public static void main(String[] args) {
        String s = "abcaabbabcabaacbacba";
        String t = "abcabaa";
        getNext(t);
        int i = kmpIndex(s, t);
        System.out.println(i); // prints 7
    }
}

Conclusion: Once the next array is understood, the algorithm is easy to follow: the pattern is analyzed in advance, and the main-string pointer never backtracks.
