Pattern Matching: BM Algorithm

Source: Internet
Author: User

■ Boyer-Moore (BM) algorithm

The Boyer-Moore algorithm, developed in 1977, is a string pattern-matching algorithm based on suffix matching: at each alignment the pattern string is compared against the text from right to left, while the pattern string itself is shifted along the text from left to right. To shift the pattern string faster, BM defines two rules: the bad character rule and the good suffix rule.

Bad character (mismatched character) rule

1. If the bad character C does not appear in the pattern string p, shift p so that it starts at the character immediately after C in the text.

2. If the bad character C does appear in the pattern string p, shift p so that the occurrence of C in p closest to the good suffix (that is, the nearest occurrence to the left of the mismatch position) is aligned with the bad character in the text. This variant is somewhat cumbersome to implement.
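The two cases above can be sketched directly in Java. This is a minimal illustration rather than a tuned implementation: the class name BadCharacterRule and the method shift are hypothetical, and the method scans the pattern on every call, whereas practical implementations usually precompute a lookup table.

```java
final class BadCharacterRule {
    /**
     * Returns the shift suggested by the bad character rule when the text
     * character `bad` mismatches pattern.charAt(j). Hypothetical helper,
     * written for clarity: it scans the pattern instead of using a table.
     */
    static int shift(String pattern, int j, char bad) {
        // Case 2: find the occurrence of `bad` nearest to the left of j.
        for (int k = j - 1; k >= 0; k--) {
            if (pattern.charAt(k) == bad) {
                return j - k; // aligns pattern[k] with the bad character
            }
        }
        // Case 1: `bad` does not occur to the left of j, so the pattern
        // can start at the character right after the bad character.
        return j + 1;
    }

    public static void main(String[] args) {
        // Pattern "EXAMPLE", mismatch at the last position (j = 6).
        System.out.println(shift("EXAMPLE", 6, 'S')); // prints 7 ('S' is absent)
        System.out.println(shift("EXAMPLE", 6, 'P')); // prints 2 ('P' is at index 4)
    }
}
```

The sketch treats a character that occurs only to the right of the mismatch position the same way as an absent character. That is still safe, because every smaller shift would place a different pattern character under the bad character.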

Good suffix (already-matched trailing substring) rule

1. If the pattern string contains another substring that matches the good suffix, shift the pattern string so that this substring is aligned with the good suffix. If more than one substring matches, choose the one closest to the good suffix.

2. If the pattern string contains no other substring that matches the good suffix, look for the longest prefix of the pattern string that equals a suffix of the good suffix, and shift the pattern string so that this prefix is aligned with that suffix of the good suffix.

In fact, cases 1 and 2 can both be seen as the pattern string containing another occurrence of a good suffix, because any suffix of the good suffix is itself a good suffix.

3. If the pattern string contains no other substring that matches the good suffix, and no prefix of the pattern string equals a suffix of the good suffix, shift the pattern string so that it starts at the character immediately after the good suffix.
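The three good suffix cases can be sketched as follows. This is a brute-force illustration under stated assumptions: the class name GoodSuffixRule and the method shift are hypothetical, the shift is computed by direct string comparison rather than from the precomputed tables real implementations use, and only the rule as described above is applied (no extra condition on the character preceding the matching substring).

```java
final class GoodSuffixRule {
    /**
     * Returns the shift suggested by the good suffix rule when the
     * mismatch happened at pattern index j, i.e. p.substring(j + 1)
     * already matched the text. Hypothetical brute-force helper.
     */
    static int shift(String p, int j) {
        int m = p.length();
        int len = m - 1 - j;           // length of the good suffix
        if (len == 0) {
            return 1;                  // nothing matched, the rule gives no help
        }
        String good = p.substring(j + 1);

        // Case 1: another occurrence of the good suffix, closest first.
        for (int k = j; k >= 0; k--) {
            if (p.startsWith(good, k)) {
                return j + 1 - k;
            }
        }
        // Case 2: longest pattern prefix equal to a suffix of the good suffix.
        for (int l = len - 1; l >= 1; l--) {
            if (p.startsWith(p.substring(m - l))) {
                return m - l;
            }
        }
        // Case 3: nothing reusable, move the pattern past the good suffix.
        return m;
    }

    public static void main(String[] args) {
        System.out.println(shift("ABXABYAB", 5)); // prints 3 (case 1, "AB" at index 3)
        System.out.println(shift("EXAMPLE", 4));  // prints 6 (case 2, prefix "E")
        System.out.println(shift("ABCD", 2));     // prints 4 (case 3)
    }
}
```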

Each of the two rules yields a distance by which the pattern string can safely be shifted to the right; the larger of the two distances is used as the actual shift.
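Putting the pieces together, a complete search loop might look like the sketch below. The names BoyerMooreSketch, indexOf, badCharShift and goodSuffixShift are hypothetical, and both shifts are recomputed by brute force at every mismatch for readability, so this shows how the maximum of the two rules drives the search rather than how a production implementation is organized.

```java
final class BoyerMooreSketch {
    // Bad character rule: nearest occurrence of `bad` to the left of j.
    static int badCharShift(String p, int j, char bad) {
        for (int k = j - 1; k >= 0; k--) {
            if (p.charAt(k) == bad) return j - k;
        }
        return j + 1;
    }

    // Good suffix rule, cases 1 to 3, computed by direct comparison.
    static int goodSuffixShift(String p, int j) {
        int m = p.length();
        int len = m - 1 - j;
        if (len == 0) return 1;
        String good = p.substring(j + 1);
        for (int k = j; k >= 0; k--) {
            if (p.startsWith(good, k)) return j + 1 - k;
        }
        for (int l = len - 1; l >= 1; l--) {
            if (p.startsWith(p.substring(m - l))) return m - l;
        }
        return m;
    }

    /** Returns the index of the first occurrence of p in text, or -1. */
    static int indexOf(String text, String p) {
        int n = text.length(), m = p.length();
        if (m == 0) return 0;
        int s = 0;                                  // current alignment
        while (s <= n - m) {
            int j = m - 1;                          // compare right to left
            while (j >= 0 && p.charAt(j) == text.charAt(s + j)) j--;
            if (j < 0) return s;                    // whole pattern matched
            // Take the larger of the two suggested shifts.
            s += Math.max(badCharShift(p, j, text.charAt(s + j)),
                          goodSuffixShift(p, j));
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("HERE IS A SIMPLE EXAMPLE", "EXAMPLE")); // prints 17
    }
}
```

In this example the third alignment mismatches at 'I' against 'A' after "MPLE" has matched: the bad character rule suggests a shift of 3 and the good suffix rule a shift of 6, so the pattern moves by 6.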

The worst-case time complexity of the algorithm is O(mn) and the best case is O(n/m), where n is the length of the text (main string) and m is the length of the pattern string.
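The gap between the two bounds can be observed by counting character comparisons. The sketch below is a simplified stand-in, not the full algorithm: it applies only the bad character rule, scans the whole text for every occurrence, and shifts by one position after a full match. The names ComparisonCounter, countComparisons and repeat are hypothetical.

```java
import java.util.Arrays;

final class ComparisonCounter {
    /** Counts character comparisons made while scanning the whole text. */
    static long countComparisons(String text, String p) {
        int n = text.length(), m = p.length();
        long comparisons = 0;
        int s = 0;
        while (s <= n - m) {
            int j = m - 1;
            while (j >= 0) {                 // compare right to left
                comparisons++;
                if (p.charAt(j) != text.charAt(s + j)) break;
                j--;
            }
            if (j < 0) {                     // full match: shift by one (simplification)
                s += 1;
                continue;
            }
            // Bad character rule only: nearest occurrence to the left of j.
            int k = j - 1;
            while (k >= 0 && p.charAt(k) != text.charAt(s + j)) k--;
            s += j - k;                      // equals j + 1 when k == -1
        }
        return comparisons;
    }

    /** Builds a string of n copies of c. */
    static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        String text = repeat('a', 1000);
        // Best case: no text character occurs in the pattern, about n/m comparisons.
        System.out.println(countComparisons(text, "bbbbb")); // prints 200
        // Worst case: every alignment matches fully, about m*n comparisons.
        System.out.println(countComparisons(text, "aaaaa")); // prints 4980
    }
}
```

With n = 1000 and m = 5, the first call inspects one character per alignment and jumps 5 positions each time (1000 / 5 = 200 comparisons), while the second performs 5 comparisons at each of the 996 alignments.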

A Java implementation is available at:

http://www.oschina.net/code/snippet_660460_48329

