String matching: the brute-force matching algorithm


Suppose we have a text string s and a pattern string p, and we want to find the position of p in s. How do we find it?

First, let us walk through the brute-force matching algorithm and the logic behind it:

With the brute-force approach, suppose the text string s has been matched up to position i and the pattern string p up to position j. Then:

    • ① If the current characters match (that is, s[i] == p[j]), then i++, j++, and matching continues with the next character;
    • ② If they do not match (that is, s[i] != p[j]), set i = i - (j - 1) and j = 0. In other words, on every failed match i backtracks and j is reset to 0.
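
The reset in rule ② can be read as "restart one position to the right of where the current alignment began": p[0] was lined up with s[i - j], so the next attempt starts at i - j + 1, which is the same value as i - (j - 1). A minimal sketch of just that arithmetic follows; the class and method names (BacktrackStep, resumeIndex) are hypothetical and only serve the illustration.

```java
// Hypothetical helper, not from the original article: isolates the mismatch rule.
final class BacktrackStep {
    /** Text index to resume from after a mismatch at text index i, pattern index j. */
    static int resumeIndex(int i, int j) {
        int alignmentStart = i - j;   // where p[0] was lined up in the text
        return alignmentStart + 1;    // same value as i - (j - 1)
    }

    public static void main(String[] args) {
        System.out.println(resumeIndex(0, 0));   // mismatch on the first character: prints 1
        System.out.println(resumeIndex(10, 6));  // mismatch after six matched characters: prints 5
    }
}
```

When j is 0 the formula simply advances i by one, so the same rule covers both "mismatch on the first character" and "mismatch after a partial match".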

For example, given the text string s: "BBC ABCDAB ABCDABCDABDE" and the pattern string p: "ABCDABD", matching p against s proceeds as follows:

1. s[0] is B and p[0] is A, so they do not match. Apply rule ②: "on a mismatch (that is, s[i] != p[j]), set i = i - (j - 1), j = 0". Next, s[1] is compared with p[0], which is equivalent to moving the pattern string one position to the right (i=1, j=0).

2. s[1] and p[0] still do not match, so rule ② is applied again and s[2] is compared with p[0] (i=2, j=0). The pattern string keeps moving one position to the right in this way (repeatedly applying "i = i - (j - 1), j = 0": i goes from 2 to 4 while j stays 0).

3. This continues until s[4] matches p[0] (i=4, j=0). Now rule ① applies instead: "if the current characters match (that is, s[i] == p[j]), then i++, j++". s[i] becomes s[5] and p[j] becomes p[1], so the next comparison is s[5] against p[1] (i=5, j=1).

4. s[5] matches p[1], so rule ① is applied again; s[6] and p[2] also match (i=6, j=2), and so on.

5. This continues until s[10], a space, is compared with p[6], the character D (i=10, j=6). They do not match, so rule ② is applied again: i = 10 - (6 - 1) = 5 and j = 0, which means the next comparison is s[5] against p[0].

6. At this point we can see that, under the brute-force approach, although the text and the pattern had already been matched up to s[9] and p[5], the mismatch between s[10] and p[6] sends the text back to s[5] and the pattern back to p[0], so matching restarts with s[5] against p[0]. The rest of the process follows the same logic until either a match is found or the text string has been fully traversed.
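
The walkthrough can be reproduced mechanically by logging every mismatch. The sketch below is an illustration only: BruteForceTrace and mismatches are hypothetical names, and each log entry holds {i, j, resumeI} for one application of rule ②.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing helper; names are illustrative, not from the article.
final class BruteForceTrace {
    /** Runs the brute-force match and records every mismatch as {i, j, resumeI}. */
    static List<int[]> mismatches(String text, String pattern) {
        List<int[]> log = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < text.length() && j < pattern.length()) {
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;                          // rule 1: advance both indices
                j++;
            } else {
                int resume = i - (j - 1);     // rule 2: backtrack i, reset j
                log.add(new int[] {i, j, resume});
                i = resume;
                j = 0;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        for (int[] m : mismatches("BBC ABCDAB ABCDABCDABDE", "ABCDABD")) {
            System.out.printf("mismatch at i=%d, j=%d -> resume at i=%d%n", m[0], m[1], m[2]);
        }
    }
}
```

For the example strings, the first four entries are the single-step shifts of steps 1 and 2, and the fifth entry is the backtrack described in step 5 (i=10, j=6, resume at i=5).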

Java implementation of brute-force string matching

    /**
     * Brute-force string matching.
     * Idea:
     *   ① If the current characters match (i.e. s[i] == p[j]), then i++, j++.
     *   ② On a mismatch (i.e. s[i] != p[j]), set i = i - (j - 1), j = 0.
     *      In other words, on every failed match i backtracks and j is reset to 0.
     * Time complexity is O(mn), where m and n are the lengths of the text string
     * and the pattern string respectively. No auxiliary table is needed.
     *
     * @param text    the text string
     * @param pattern the pattern string
     * @return the position of pattern in text, or -1 if it does not occur
     */
    public static int bruteForceSearchPatternInText(String text, String pattern) {
        int sLen = text.length();
        int pLen = pattern.length();
        char[] s = text.toCharArray();
        char[] p = pattern.toCharArray();

        // A pattern longer than the text cannot occur in it.
        if (sLen < pLen) {
            return -1;
        }

        int i = 0;
        int j = 0;
        while (i < sLen && j < pLen) {
            if (s[i] == p[j]) {
                // ① The current characters match: i++, j++.
                i = i + 1;
                j = j + 1;
            } else {
                // ② Mismatch: i = i - (j - 1), j = 0.
                i = i - (j - 1);
                j = 0;
            }
        }

        // On success, return the position of pattern p in text s; otherwise return -1.
        if (j == pLen) {
            return i - j;
        } else {
            return -1;
        }
    }
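
The O(mn) bound mentioned in the comment is approached when almost the whole pattern matches at every alignment, for example a text made of one repeated character and a pattern that differs only in its last character. The following sketch counts character comparisons to make that visible; ComparisonCounter and countComparisons are hypothetical names introduced for this illustration, and the loop is the same i/j scheme as above.

```java
import java.util.Arrays;

// Hypothetical instrumented variant; the class and method names are illustrative.
final class ComparisonCounter {
    /** Runs the brute-force loop and returns how many character comparisons it made. */
    static long countComparisons(String text, String pattern) {
        long comparisons = 0;
        int i = 0;
        int j = 0;
        while (i < text.length() && j < pattern.length()) {
            comparisons++;                    // one s[i] vs p[j] comparison
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
            } else {
                i = i - (j - 1);
                j = 0;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        char[] as = new char[1000];
        Arrays.fill(as, 'a');
        String text = new String(as);                  // 1000 copies of 'a'
        String nearMiss = text.substring(0, 99) + "b"; // 99 copies of 'a', then 'b'
        System.out.println(countComparisons(text, nearMiss)); // grows with m * n
        System.out.println(countComparisons(text, "b"));      // grows with m only
    }
}
```

On a small input the effect is easy to check by hand: for the text "aaaaaaaaaa" and the pattern "aaab", each of the seven full alignments costs four comparisons and the final partial alignment costs three, 31 in total.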

Concluding remarks: in step 6 of the walkthrough above, s[5] is bound to mismatch p[0]. Why? Because step 4 already established that s[5] = p[1] = B, while p[0] = A. Since p[1] != p[0], s[5] cannot equal p[0], so backtracking to s[5] inevitably produces a mismatch. Is there an algorithm that lets i stay where it is and moves only j? Yes: the KMP algorithm. It uses the information from the part that has already matched to keep i from backtracking, and adjusts j so that the pattern string moves as far as possible to a valid position.
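
To make the contrast concrete, here is a minimal sketch of the commonly taught failure-table formulation of KMP. It is not code from the original article, and the names KmpSketch, failureTable and search are assumptions made for this illustration. The point to notice is that i only ever moves forward, while j falls back through a table computed from the pattern alone.

```java
// A sketch of the usual failure-table formulation of KMP; not code from the article.
final class KmpSketch {
    /** fail[j] = length of the longest proper prefix of p[0..j] that is also a suffix of it. */
    static int[] failureTable(String p) {
        int[] fail = new int[p.length()];
        int k = 0;
        for (int j = 1; j < p.length(); j++) {
            while (k > 0 && p.charAt(j) != p.charAt(k)) {
                k = fail[k - 1];
            }
            if (p.charAt(j) == p.charAt(k)) {
                k++;
            }
            fail[j] = k;
        }
        return fail;
    }

    /** Returns the index of the first occurrence of p in s, or -1 if there is none. */
    static int search(String s, String p) {
        if (p.isEmpty()) {
            return 0;                                // same convention as String.indexOf
        }
        int[] fail = failureTable(p);
        int j = 0;
        for (int i = 0; i < s.length(); i++) {       // i never backtracks
            while (j > 0 && s.charAt(i) != p.charAt(j)) {
                j = fail[j - 1];                     // shift the pattern, keep i
            }
            if (s.charAt(i) == p.charAt(j)) {
                j++;
            }
            if (j == p.length()) {
                return i - j + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(search("BBC ABCDAB ABCDABCDABDE", "ABCDABD"));
    }
}
```

For the pattern "ABCDABD" the table is {0, 0, 0, 0, 1, 2, 0}, so after the mismatch at i=10, j=6 this sketch sets j to 2 and keeps i at 10 instead of sending i back to 5.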

Adapted from: http://blog.csdn.net/v_july_v/article/details/7041827
