String Matching: The Sunday Algorithm, Which Outperforms the KMP and BM Algorithms

Source: Internet
Author: User
Tags: string, find

The first time I heard about the Sunday algorithm, I thought it sounded too good to be true. After reading an illustrated explanation, I found that it is easy to understand and is often more efficient than KMP and BM.

Sunday needs fewer moves!

So I tried writing one myself. It turned out to work well, so I am sharing it.

First, how it works:

The Sunday algorithm is a string matching algorithm proposed by Daniel M. Sunday in 1990 that is faster than the BM algorithm. Its core idea is that, during matching, the pattern string does not have to be compared from left to right or from right to left. When a mismatch is found, the algorithm skips as many characters as possible before the next round of matching, which improves matching efficiency.

Suppose a mismatch occurs at S[i] ≠ T[j], where 1 ≤ i ≤ N and 1 ≤ j ≤ M. At this point the part already matched is u, and the length of the string u is L, as shown in Figure 1. Clearly, S[L+i+1], the character of the main string immediately after the current alignment of T, must take part in the next round of matching, and T[M] must move at least as far as this position (that is, the pattern string T moves at least one character to the right).

Figure 1: A mismatch in the Sunday algorithm
There are two cases:
(1) S[L+i+1] does not appear in the pattern string T. In this case, T[0] is moved to the character position just after S[L+i+1], as shown in Figure 2.

Figure 2: Case 1 of the Sunday algorithm shift
(2) S[L+i+1] appears in the pattern string. Search for S[L+i+1] starting from the right end of the pattern string T, that is, in the order T[M-1], T[M-2], …, T[0]. If S[L+i+1] equals some character of T, record that position as k, 0 ≤ k ≤ M-1, so that T[k] = S[L+i+1] and k is the rightmost such position. The pattern string T is then moved M-k character positions to the right, that is, to the position where T[k] and S[L+i+1] are aligned, as shown in Figure 3.

Figure 3: Case 2 of the Sunday algorithm shift
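The two cases can be folded into a single lookup table. The following is only a minimal sketch of that idea, not the code from this article: the class name `SundayShiftSketch` and the methods `buildShiftTable` and `shiftFor` are hypothetical names chosen for illustration, and 0-based indexing is assumed.

```java
import java.util.HashMap;
import java.util.Map;

class SundayShiftSketch {
    // Maps each character of the pattern to the shift it implies:
    // m - k, where k is the rightmost (0-based) index of that character.
    static Map<Character, Integer> buildShiftTable(String pattern) {
        Map<Character, Integer> shift = new HashMap<>();
        int m = pattern.length();
        for (int k = 0; k < m; k++) {
            // A later (more rightward) occurrence overwrites an earlier one.
            shift.put(pattern.charAt(k), m - k);
        }
        return shift;
    }

    // Shift implied by the character that follows the current window.
    static int shiftFor(Map<Character, Integer> table, int m, char next) {
        // Case (1): the character is not in the pattern, so jump past it (m + 1).
        // Case (2): the character is in the pattern, so align its rightmost
        //           occurrence with it (m - k).
        return table.getOrDefault(next, m + 1);
    }

    public static void main(String[] args) {
        String pattern = "search";
        Map<Character, Integer> table = buildShiftTable(pattern);
        System.out.println(shiftFor(table, pattern.length(), 'z')); // prints 7
        System.out.println(shiftFor(table, pattern.length(), 'r')); // prints 3
    }
}
```

Building the table once costs time proportional to the pattern length, after which each shift is a single lookup instead of a scan of the pattern from the right.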
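The process repeats in this way. If every character of the pattern matches, the match succeeds. Otherwise the pattern is shifted for the next round, until its right end passes the end of the main string S. The worst-case time complexity of this algorithm is O(N × M), where N is the length of the main string and M is the length of the pattern string. The algorithm works particularly well for short pattern strings.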
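To make the loop and the worst case concrete, here is a small tracing sketch under the same assumptions as the description above (0-based indexing, shift decided by the character just after the window). The class `SundayTraceSketch` and its method `alignments` are hypothetical names, and the sample strings are illustrative only. It records every window start position that is tried and stops at the first match.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SundayTraceSketch {
    // Returns every window start position tried, in order.
    // If the pattern occurs in the text, the last entry is the first match.
    static List<Integer> alignments(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        List<Integer> tried = new ArrayList<>();
        if (m == 0) {
            return tried;
        }
        // Shift table: rightmost index k of each character gives shift m - k.
        Map<Character, Integer> shift = new HashMap<>();
        for (int k = 0; k < m; k++) {
            shift.put(pattern.charAt(k), m - k);
        }
        int start = 0;
        while (start + m <= n) {
            tried.add(start);
            if (text.startsWith(pattern, start)) {
                break; // first match found
            }
            if (start + m >= n) {
                break; // no character after the window, nothing left to try
            }
            // Character absent from the pattern: skip past it (m + 1).
            start += shift.getOrDefault(text.charAt(start + m), m + 1);
        }
        return tried;
    }

    public static void main(String[] args) {
        // Large skips: only three windows are examined.
        System.out.println(alignments("substring searching", "search")); // prints [0, 7, 10]
        // Unfavourable input: every shift is 1, so all N - M + 1 windows are tried.
        System.out.println(alignments("aaaaaaaa", "aaba")); // prints [0, 1, 2, 3, 4]
    }
}
```

In the second call the character after the window is always `a`, which is also the last character of the pattern, so the shift is always 1. Combined with up to M character comparisons per window, this is the kind of input that leads to the O(N × M) bound.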

Java code for the Sunday algorithm:

package math;

import java.util.ArrayList;

public class Sunday
{
    public Sunday() {
        //
    }

    // Returns the start index of every (non-overlapping) occurrence of sfind in str,
    // or null if either string is empty or str is shorter than sfind.
    public ArrayList<Integer> qfindchr(String str, String sfind) {
        int str_length = 0;
        int fin_length = 0;

        int find_count = 0;
        int start = 0;
        int movenum = 0;
        ArrayList<Integer> index = new ArrayList<Integer>();

        if (sfind.length() == 0 || str.length() == 0) {
            return null;
        }

        if (str.length() < sfind.length()) {
            return null;
        }

        str_length = str.length();
        fin_length = sfind.length();

        while (start + fin_length <= str_length)
        {
            movenum++;
            boolean isfind = false; // whether a match was found in this move
            String s_temp = str.substring(start, start + fin_length);
            if (sfind.equals(s_temp)) {
                index.add(start); // record the index of the first character of the match
                find_count++;
                start = start + fin_length; // continue after the match
                isfind = true;
            }
            if (isfind == false) // no match: compute the next start position
            {
                int forwardpos = qfindpos(str, sfind, start, fin_length);
                start = forwardpos;
            }
        }
        System.out.println("move_count = " + movenum); // the number of moves drops noticeably
        return index;
    }

    // Computes the next start position after a mismatch at pos.
    // Looks up the character just after the current window in the pattern,
    // searching from the right. If there is no character after the window,
    // str.length() is returned, which ends the search.
    public int qfindpos(String str, String find, int pos, int fin_length) {
        int returnpos = str.length();
        char[] schr = str.toCharArray();

        if ((pos + fin_length) < str.length())
        {
            char chrfind = schr[pos + fin_length]; // the character to look up
            if (fin_length >= 1)
            {
                if (find.lastIndexOf(chrfind) > -1) // find contains chrfind
                {
                    returnpos = pos + fin_length - find.lastIndexOf(chrfind);
                }
                else { // find does not contain chrfind
                    returnpos = pos + fin_length + 1;
                }
            }
        }
        return returnpos;
    }

    public static void main(String[] args) {
        // Sample text to search in
        String str = "return11chrfind1pos = chrfind1 POS + fin_length-returnpos = POS + fin_length-returnpos"
                + "= POS + fin_length-returnpos = POS + fin_length -returnpos = POS + fin_length-returnpos = POS "
                + " + fin_length-returnpos = POS + fin_length"
                + "-returnpos = POS + fin_length-find. lastindexof (chrfind ); returnpos = POS "
                + " + fin_length-returnpos = POS + fin_length "
                + "-returnpos = pos + fin_length-returnpos = POS "
                + " + fin_length-returnpos = POS + fin_length -returnpos = POS + fin_length-"
                + " Find. lastindexof (chrfind ); returnpos = POS + fin_length-returnpos = POS + "
                + " fin_length-returnpos = POS + fin_length-returnpos = POS + fin_length "
                + "-returnpos = POS + fin_length-returnpos = POS "
                + " + fin_length -returnpos = POS + fin_length-find. lastindexof (chrfind1) ";
        String find = "chrfind1";
        Sunday sunday = new Sunday();

        ArrayList<Integer> index = sunday.qfindchr(str, find);
        if (index == null) {
            System.out.println("condition not met");
            return;
        }
        System.out.println("Count = " + index.size());
        for (int i = 0; i < index.size(); i++) {
            System.out.print("index:" + index.get(i) + ",");
        }
    }
}
