Sunday String Matching algorithm

Source: Internet
Author: User

I came across this magical algorithm while browsing an ACM expert's blog.

KMP: the failure function is hard to understand, and the code is long.

BF (brute force): slow, very slow, extremely slow.

BM (Boyer-Moore): I can't write it ...

Then I saw the Sunday algorithm, and everything looked bright and refreshing.

The efficiency of a string-matching algorithm depends largely on how far the pattern moves when a mismatch occurs.

I won't say anything else.

The Sunday algorithm skips as many characters as possible when a mismatch occurs.

Assume a mismatch occurs at S[i] ≠ T[j], where 1≤i≤n and 1≤j≤m (n is the length of the main string S, m the length of the pattern string T). At this point the matched portion is u, and the length of u is assumed to be L, as shown in Figure 1. Obviously, S[L+i+1], the character of S just past the current alignment of T, must take part in the next round of matching, and T[m] must move at least to this position (that is, the pattern string T moves at least one character to the right).

Figure 1: A mismatch in the Sunday algorithm

There are two cases:

(1) S[L+i+1] does not appear in the pattern string T. In this case T[0] is moved to the position of the character just after S[L+i+1], as shown in Figure 2.

Figure 2: Case 1 of the Sunday algorithm's shift

(2) S[L+i+1] appears in the pattern string. Search for S[L+i+1] from the right end of the pattern string T, i.e. in the order T[m-1], T[m-2], ..., T[0]. When a character of T equal to S[L+i+1] is found, record its position k, 0≤k≤m-1, so that T[k]=S[L+i+1]. The pattern string T is then moved to the right by m-k characters, i.e. until T[k] and S[L+i+1] are aligned, as shown in Figure 3.

Figure 3: Case 2 of the Sunday algorithm's shift
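The two cases can be sketched in a few lines of Python (Python is used here only for illustration; it is not the language of the original post). The name sunday_shift is made up for this sketch, and it assumes next_char is the single text character just after the current window. Because str.rfind returns -1 when the character is absent, the expression m - k would already give m + 1 in case 1; the explicit branch is kept only to mirror the text.

```python
def sunday_shift(pattern: str, next_char: str) -> int:
    """Return how far to move the pattern, given the text character
    that sits immediately after the current window."""
    m = len(pattern)
    k = pattern.rfind(next_char)  # rightmost 0-based position, or -1
    if k == -1:
        # Case 1: the character is not in the pattern, so jump past it.
        return m + 1
    # Case 2: align pattern[k] with that character.
    return m - k


print(sunday_shift("search", "i"))  # 7: 'i' does not occur in "search"
print(sunday_shift("search", "r"))  # 3: 'r' is at 0-based index 3
```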

And so on: if the pattern matches completely, the match succeeds; otherwise the pattern keeps moving in the next round, until it reaches the right end of the main string S. The algorithm has a worst-case time complexity of O(n*m). For short pattern strings, the algorithm runs fast.
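To make the O(n*m) worst case concrete, here is a small Python sketch of the whole search that also counts character comparisons. The function name sunday_search and the comparison counter are additions for illustration, not part of the original post. It compares each window from left to right and precomputes the shifts in a dictionary.

```python
def sunday_search(text: str, pattern: str) -> tuple:
    """Return (index of first match or -1, number of character comparisons)."""
    n, m = len(text), len(pattern)
    # Shift table: the rightmost occurrence wins because later indices overwrite.
    shift = {ch: m - k for k, ch in enumerate(pattern)}
    comparisons = 0
    pos = 0
    while pos + m <= n:
        j = 0
        while j < m:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j += 1
        if j == m:
            return pos, comparisons
        if pos + m == n:
            break  # there is no character after the window
        # Characters missing from the table are not in the pattern: skip m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return -1, comparisons


print(sunday_search("substring searching algorithm", "search"))  # (10, 9)
print(sunday_search("a" * 20, "aaab"))                           # (-1, 36)
```

In the second call every window matches three characters before failing on the fourth, and the shift is only 2 each time, so 9 windows cost 4 comparisons each. That is the kind of input on which the running time grows like n*m.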

The Sunday algorithm is similar to the BM algorithm, except that when a match fails it looks at the character in the text immediately after the last character of the current window. If that character does not appear in the pattern string, skip past it directly, that is, shift = pattern length + 1; otherwise, as in the BM algorithm, shift = distance from the rightmost occurrence of that character in the pattern string to the end of the pattern + 1. (Excerpt from Baidu Encyclopedia.)

Let's look at an example.
Suppose we want to find "search" in "substring searching algorithm". At first the pattern is aligned with the left end of the text:

substring searching algorithm
search
 ^

The comparison finds a mismatch at the second character, so the pattern has to move to the right. But how far? This is where the various algorithms differ. The simplest way is to move one character position; KMP uses the information from the part already matched; BM compares in reverse and determines the shift from the part already matched. The method introduced here looks at the character immediately after the current window (the 'i' in the text above). Obviously, however far the pattern moves, this character will take part in the next comparison unless the pattern skips past it entirely; in other words, if the next alignment is to match, this character must occur in the pattern. So we can move the pattern so that its rightmost occurrence of this character lines up with it. Since there is no 'i' in "search", we can skip a large stretch and start the next comparison at the character after the 'i':

substring searching algorithm
       search
       ^

This time the first character does not match. The character after the window is 'r', which appears in the pattern at the third position from the end, so the pattern moves three positions to the right to line up the two 'r's:

substring searching algorithm
          search

And the match is complete.
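The walk-through can be replayed with a short Python sketch that records every alignment tried. The helper name sunday_trace and the tuple layout (window start, character after the window, shift) are choices made for this illustration only. For brevity it compares each window with a slice instead of character by character.

```python
def sunday_trace(text, pattern):
    """Return the alignments tried as (start, char_after_window, shift).
    A final entry with shift 0 means the pattern was found at that start."""
    n, m = len(text), len(pattern)
    steps = []
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            steps.append((pos, None, 0))
            break
        nxt = text[pos + m] if pos + m < n else None
        if nxt is None:
            steps.append((pos, None, None))  # text exhausted, no match
            break
        step = m - pattern.rfind(nxt)  # rfind gives -1 if absent -> m + 1
        steps.append((pos, nxt, step))
        pos += step
    return steps


for step in sunday_trace("substring searching algorithm", "search"):
    print(step)
# (0, 'i', 7)
# (7, 'r', 3)
# (10, None, 0)
```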

Another example:
Text string:    O U R S T R O N G X S E A R C H
Pattern string: S E A R C H

Here O and S are not the same. We look at the character of the text just after the window, which is O, and check where it occurs in the pattern string; it does not occur there. So we move the pattern string, aligning its first character with the character after that O:

Text string:    O U R S T R O N G X S E A R C H
Pattern string: _ _ _ _ _ _ _ S E A R C H

Continuing the comparison, N and S are not the same. The character after the window, R, does occur in the pattern string, so the pattern string is moved to align the two R's:

Text string:    O U R S T R O N G X S E A R C H
Pattern string: _ _ _ _ _ _ _ _ _ _ S E A R C H
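The underscore pictures can also be generated rather than drawn by hand. The following Python sketch (the name show_alignments is invented for this illustration) returns one padded pattern line per alignment, using the same shift rule: pattern length minus the rightmost position of the character after the window, which becomes length + 1 when that character is absent.

```python
def show_alignments(text, pattern):
    """Return one pattern line per alignment tried, padded with underscores."""
    n, m = len(text), len(pattern)
    lines = []
    pos = 0
    while pos + m <= n:
        # Put a space between characters, as in the hand-drawn example.
        lines.append(" ".join("_" * pos + pattern))
        if text[pos:pos + m] == pattern or pos + m == n:
            break
        pos += m - pattern.rfind(text[pos + m])
    return lines


text = "OURSTRONGXSEARCH"
print("Text string:    " + " ".join(text))
for line in show_alignments(text, "SEARCH"):
    print("Pattern string: " + line)
```

For this input the pattern lines start at offsets 0, 7 and 10, the last one being the match.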
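Finally, the code.

program SundayMatch;

var
  s, check: string;
  next: array[char] of longint;   { shift for every possible character }

{ Returns the 1-based position of the first occurrence of check in s, or -1. }
function Sunday(s, check: string): longint;
var
  len_s, len_c, i, j, pos: longint;
  c: char;
begin
  len_s := length(s);
  len_c := length(check);
  { A character that is not in the pattern lets us skip the window plus one. }
  for c := low(char) to high(char) do
    next[c] := len_c + 1;
  { Later (more rightward) occurrences overwrite earlier ones. }
  for i := 1 to len_c do
    next[check[i]] := len_c - i + 1;
  pos := 1;
  while pos <= len_s - len_c + 1 do
  begin
    { Compare the window with the pattern from left to right. }
    j := 1;
    while (j <= len_c) and (s[pos + j - 1] = check[j]) do
      inc(j);
    if j > len_c then
      exit(pos);                     { every character matched }
    if pos + len_c > len_s then
      break;                         { no character after the window }
    inc(pos, next[s[pos + len_c]]);  { shift by the character after the window }
  end;
  exit(-1);
end;

begin
  readln(s);
  readln(check);
  writeln(Sunday(s, check));
end.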