String Matching Algorithm-BM algorithm

Last Update:2014-03-15 Source: Internet

Author: User


The BM (Boyer-Moore) algorithm is an improvement on the KMP algorithm and is reported to be 3 to 5 times faster than KMP in practice.

The BM algorithm mainly follows two principles: 1. Bad characters, 2. Good suffixes.

Let the main string be S, with length s_l, and let the pattern string be T, with length t_l.

1. Bad characters

If a character of the main string S does not occur anywhere in the pattern string T, the pattern string can be shifted right past that character; when the character is aligned with the last position of T, this is a shift of t_l positions. (No alignment that overlaps this character can match, because T contains no character that could be placed under it.)

If the character does occur in T, but not at the position currently aligned with it, move the pattern string so that the rightmost occurrence of that character in T is aligned with the character in S.

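Therefore, for a character x of the main string:

\[
\delta_1(x) =
\begin{cases}
t_l, & \text{if } x \neq T[j] \text{ for all } 1 \le j \le t_l \text{ (x does not occur in the pattern string)} \\
t_l - \max\{\, k \mid T[k] = x,\ 1 \le k \le t_l \,\}, & \text{otherwise (k is the rightmost position of x in the pattern string)}
\end{cases}
\]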

Code:

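void bminitocc()  // build the bad-character table
{
    int i;
    // -1 means the character does not occur in the pattern string
    // (assumes occ[] has one entry per possible character value, here 256)
    for (i = 0; i < 256; i++)
        occ[i] = -1;
    // record the rightmost position of every character of t
    for (i = 0; i < t_length; i++)
        occ[(unsigned char)t[i]] = i;
}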
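The table and the shift function delta1 can be tied together in a small self-contained sketch. The names `buildOcc` and `delta1` and the 256-entry table (one slot per byte value) are assumptions made for illustration, not part of the original listing. Positions are 0-based here, so the 1-based formula t_l - k becomes t_l - 1 - occ[x].

```cpp
#include <array>
#include <cassert>
#include <string>

// occ[c] is the rightmost 0-based position of byte c in the pattern,
// or -1 if c does not occur in the pattern at all.
std::array<int, 256> buildOcc(const std::string& t) {
    std::array<int, 256> occ;
    occ.fill(-1);
    for (int i = 0; i < static_cast<int>(t.size()); ++i)
        occ[static_cast<unsigned char>(t[i])] = i;  // later positions overwrite earlier ones
    return occ;
}

// delta1(x) for a pattern of length t_l, using the 0-based table above.
// Returns t_l when x is absent from the pattern (occ is -1).
int delta1(const std::array<int, 256>& occ, int t_l, char x) {
    assert(t_l >= 1);
    return t_l - 1 - occ[static_cast<unsigned char>(x)];
}
```

For the pattern "abcab" (t_l = 5), this gives delta1('a') = 1, delta1('b') = 0, delta1('c') = 2, and delta1('x') = 5 for a character that is not in the pattern.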

2. Good suffixes

Sometimes the bad-character rule fails. Aligning the rightmost occurrence of the character in the pattern string with the corresponding character of the main string may lead to a negative shift. Moving by one position is always feasible, but in this case it is better to derive the largest possible shift distance from the structure of the pattern string. This is called the good-suffix heuristic.
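The negative shift can be reproduced with a short sketch. The function names `badCharShift` and `safeShift` are hypothetical, and the sketch looks up the rightmost occurrence with `std::string::rfind` instead of a precomputed table, purely to keep the example short. The index `j` is the 0-based pattern position where the mismatch occurred.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Raw bad-character shift for a mismatch at pattern index j against the
// text character c. The result is negative when the rightmost occurrence
// of c in the pattern lies to the right of j.
int badCharShift(const std::string& t, int j, char c) {
    assert(j >= 0 && j < static_cast<int>(t.size()));
    const std::string::size_type pos = t.rfind(c);
    const int rightmost = (pos == std::string::npos) ? -1 : static_cast<int>(pos);
    return j - rightmost;
}

// Fallback when only the bad-character rule is available:
// never move the pattern left, and always move at least one position.
int safeShift(const std::string& t, int j, char c) {
    return std::max(1, badCharShift(t, j, c));
}
```

With the pattern "abcab", a mismatch at index 1 against the text character 'a' gives a raw shift of 1 - 3 = -2, so `safeShift` falls back to 1. This is the situation in which the good-suffix rule can usually offer a larger shift.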

There are two cases for a good suffix:

1. A substring in the middle of T is equal to the part already compared (the matched suffix).

2. A prefix of T is equal to a suffix of the part already compared.

When both cases apply, we take the smaller of the two shift distances, because we must make sure that no possible alignment is skipped.
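Both cases, and the rule of taking the smallest distance, can be checked with a brute-force sketch. This is not the efficient preprocessing of the BM algorithm, only a direct restatement of the definition. The name `goodSuffixShift` is hypothetical, `j` is the 0-based index of the mismatch (-1 for a complete match), and the sketch additionally requires the character that moves under the mismatch to differ from T[j], which is the usual strengthened form of the rule.

```cpp
#include <cassert>
#include <string>

// Smallest shift s >= 1 that is consistent with the already matched
// suffix t[j+1 .. m-1], where the mismatch happened at index j.
int goodSuffixShift(const std::string& t, int j) {
    const int m = static_cast<int>(t.size());
    assert(j >= -1 && j < m);
    for (int s = 1; s < m; ++s) {
        bool ok = true;
        // Every matched position k must meet an equal character after the
        // shift, or fall off the left end of the shifted pattern (case 2).
        for (int k = j + 1; k < m && ok; ++k)
            if (k - s >= 0 && t[k - s] != t[k]) ok = false;
        // The character that lands under the mismatch must differ from
        // t[j]; otherwise the same mismatch would simply repeat.
        if (ok && j >= 0 && j - s >= 0 && t[j - s] == t[j]) ok = false;
        if (ok) return s;  // candidates are tried in increasing order
    }
    return m;  // nothing of the matched suffix can be reused
}
```

For the pattern "abbabab", a mismatch at index 3 (matched suffix "bab") gives a shift of 2, because "bab" occurs again in the middle of the pattern (case 1). A mismatch at index 4 (matched suffix "ab") gives a shift of 5, because only the prefix "ab" lines up with the matched part (case 2).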

Question 1: Suppose that, at a mismatch, a prefix of T is equal to a suffix of the compared part (the second case), and the whole compared part also occurs at another position inside T (the first case). The shift distance of the first case is then the shorter one. Therefore, if both cases apply, we only need the distance from the first case, because it is always shorter than the distance from the second case.

Question 2: When a suffix of the compared part also appears at the beginning of the pattern string, how do we know when the second case applies?

The array f[] stores the good-suffix position for each position of the pattern string.

void BMPreProcess1()  // store the shift for every position that has a good suffix
{
    int i = t_length, j = t_length + 1;
    f[i] = j;
    while (i > 0)  // i > 0, not i >= 0, so that t[i - 1] stays in bounds
    {
        while (j <= t_length && t[i - 1] != t[j - 1])
        {
            if (next[j] == -1)      // -1 marks an entry that has not been set yet
                next[j] = j - i;    // a good suffix exists: record the right shift
            j = f[j];
        }
        i--;
        j--;
        f[i] = j;
    }
    // cout <
}
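To show how the two tables work together, here is a sketch of a complete search. It is an assumption-laden reconstruction, not the article's full listing: the second preprocessing step (for the second good-suffix case) does not survive in the text above and is filled in from the standard formulation of the algorithm. The names `bmSearch`, `shift`, and `hits` are illustrative, and 0 rather than -1 marks an unset shift entry.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Returns every 0-based position at which pattern t occurs in text s.
std::vector<int> bmSearch(const std::string& s, const std::string& t) {
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());
    std::vector<int> hits;
    if (m == 0 || m > n) return hits;

    // Bad-character table: rightmost position of each byte, or -1.
    std::array<int, 256> occ;
    occ.fill(-1);
    for (int i = 0; i < m; ++i) occ[static_cast<unsigned char>(t[i])] = i;

    // Good-suffix preprocessing, first case (same loop as the listing above).
    std::vector<int> f(m + 1), shift(m + 1, 0);
    int i = m, j = m + 1;
    f[i] = j;
    while (i > 0) {
        while (j <= m && t[i - 1] != t[j - 1]) {
            if (shift[j] == 0) shift[j] = j - i;
            j = f[j];
        }
        --i; --j;
        f[i] = j;
    }
    // Second case (assumed): fill the remaining entries from the widest
    // part of the pattern that is both a prefix and a suffix.
    j = f[0];
    for (i = 0; i <= m; ++i) {
        if (shift[i] == 0) shift[i] = j;
        if (i == j) j = f[j];
    }

    // Search: compare right to left, then take the larger of the two shifts.
    int pos = 0;
    while (pos <= n - m) {
        int k = m - 1;
        while (k >= 0 && t[k] == s[pos + k]) --k;
        if (k < 0) {
            hits.push_back(pos);
            pos += shift[0];
        } else {
            const int bad = k - occ[static_cast<unsigned char>(s[pos + k])];
            pos += std::max(shift[k + 1], bad);  // shift[k + 1] >= 1, so pos always advances
        }
    }
    return hits;
}
```

Taking the maximum of the two shifts is what makes a negative bad-character value harmless: the good-suffix entry is always at least 1. For example, searching for "abbabab" in "abbabbabab" reports the single match at position 3.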

