■ Boyer-Moore (BM) algorithm
The Boyer-Moore algorithm, developed in 1977, is a string matching algorithm based on suffix matching: the pattern string is compared with the parent string from right to left, while the pattern string itself is shifted from left to right. To shift the pattern string faster, BM defines two rules: the bad character rule and the good suffix rule.
Bad character (mismatched character) rule
1. If the bad character C does not appear in the pattern string p, move p directly past C, so that p starts at the character after C.
2. If the bad character C appears in the pattern string p, take the occurrence of C in p that is closest to the good suffix (of course, this is a bit cumbersome to implement) and align it with the bad character in the parent string.
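The two bad character cases can be sketched in Java roughly as follows. This is only a minimal illustration, not the linked implementation: the class name `BadCharRule`, the method `badCharShift`, and the parameter `j` (the index in the pattern string where the mismatch occurred) are assumptions made for this example. It scans the pattern string directly instead of using a precomputed table, which is slower but follows the rule as stated.

```java
public class BadCharRule {
    /**
     * Shift suggested by the bad character rule (hypothetical helper).
     * p   : the pattern string
     * bad : the character of the parent string that caused the mismatch
     * j   : index in p at which the mismatch happened
     */
    static int badCharShift(String p, char bad, int j) {
        // Case 2: look to the left of the mismatch for the occurrence of
        // the bad character that is closest to the good suffix.
        for (int k = j - 1; k >= 0; k--) {
            if (p.charAt(k) == bad) {
                return j - k; // aligns p[k] with the bad character
            }
        }
        // Case 1: no usable occurrence, so move the whole pattern string
        // past the bad character.
        return j + 1;
    }
}
```

For the pattern string "EXAMPLE", a mismatch at the last position (j = 6) against 'S' gives a shift of 7, while a mismatch at the same position against 'P' gives a shift of 2.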
Good suffix (the already matched trailing part) rule
1. The pattern string contains another substring that matches the good suffix. Move the pattern string so that this substring aligns with the good suffix; if more than one substring matches, choose the one closest to the good suffix.
2. The pattern string contains no substring that matches the whole good suffix. Look for the longest prefix of the pattern string that equals a suffix of the good suffix, and align that prefix with the matching end of the good suffix.
In fact, cases 1 and 2 can both be seen as the pattern string containing a good suffix somewhere else (a suffix of the good suffix is itself a good suffix).
3. The pattern string contains no substring that matches the good suffix, and no prefix of the pattern string equals a suffix of the good suffix. In this case, move the pattern string directly past the good suffix, so that it starts at the character after the good suffix.
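The three good suffix cases can be folded into a single brute-force check, sketched below in Java. The names `GoodSuffixRule` and `goodSuffixShift` are assumptions for this example, and the code deliberately trades speed for readability: real implementations usually precompute a shift table in linear time. The idea is to try shifts from small to large and accept the first one under which every character of the good suffix either lines up with an equal character of the pattern string or falls off its left end.

```java
public class GoodSuffixRule {
    /**
     * Shift suggested by the good suffix rule (hypothetical helper).
     * p : the pattern string
     * j : index in p at which the mismatch happened, so the good suffix
     *     is p[j+1 .. m-1]
     */
    static int goodSuffixShift(String p, int j) {
        int m = p.length();
        for (int s = 1; s < m; s++) {
            boolean fits = true;
            for (int i = j + 1; i < m && fits; i++) {
                int k = i - s; // pattern index that lands under p[i] after shifting by s
                if (k >= 0 && p.charAt(k) != p.charAt(i)) {
                    fits = false;
                }
            }
            // Small s (up to j + 1) corresponds to case 1: a full copy of the
            // good suffix inside the pattern string. Larger s corresponds to
            // case 2: only a prefix matches a suffix of the good suffix.
            if (fits) {
                return s;
            }
        }
        // Case 3: nothing matches, move past the good suffix entirely.
        return m;
    }
}
```

For the pattern string "EXAMPLE" with good suffix "MPLE" (j = 2), only the prefix "E" matches the end of the good suffix, so the shift is 6. For "ABCAB" with good suffix "AB" (j = 2), the earlier "AB" gives a shift of 3.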
Each of the two rules yields a distance by which the pattern string can be shifted to the right; the larger of the two is the distance we actually move.
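Putting the two rules together, a search loop might look like the sketch below. It is an illustration under assumptions rather than the linked code: `BoyerMooreSketch`, `search`, and both helper methods are names invented here, and the helpers compute each shift by brute force instead of from precomputed tables.

```java
import java.util.ArrayList;
import java.util.List;

public class BoyerMooreSketch {
    // Bad character rule: align the nearest occurrence of the bad character
    // to the left of index j, or move past the bad character if there is none.
    static int badCharShift(String p, char bad, int j) {
        for (int k = j - 1; k >= 0; k--) {
            if (p.charAt(k) == bad) return j - k;
        }
        return j + 1;
    }

    // Good suffix rule: smallest shift under which p[j+1..] still agrees
    // with the part of the pattern string that lands under it.
    static int goodSuffixShift(String p, int j) {
        int m = p.length();
        for (int s = 1; s < m; s++) {
            boolean fits = true;
            for (int i = j + 1; i < m && fits; i++) {
                int k = i - s;
                if (k >= 0 && p.charAt(k) != p.charAt(i)) fits = false;
            }
            if (fits) return s;
        }
        return m;
    }

    // Returns the start index of every occurrence of p in text.
    static List<Integer> search(String text, String p) {
        List<Integer> hits = new ArrayList<>();
        int n = text.length(), m = p.length();
        if (m == 0 || m > n) return hits;
        int start = 0; // current position of the pattern string in the text
        while (start <= n - m) {
            int j = m - 1;
            // Compare from right to left.
            while (j >= 0 && p.charAt(j) == text.charAt(start + j)) j--;
            if (j < 0) {
                hits.add(start);
                // Full match: treat the whole pattern string as the good suffix.
                start += goodSuffixShift(p, -1);
            } else {
                // Move by the larger of the two suggested shifts.
                start += Math.max(badCharShift(p, text.charAt(start + j), j),
                                  goodSuffixShift(p, j));
            }
        }
        return hits;
    }
}
```

With this sketch, searching "HERE IS A SIMPLE EXAMPLE" for "EXAMPLE" visits only the alignments 0, 7, 9, 15 and 17 before reporting the match at index 17.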
The worst-case time complexity of the algorithm is O(mn) and the best case is O(n/m), where n is the length of the parent string and m is the length of the pattern string.
The code is implemented in Java:
http://www.oschina.net/code/snippet_660460_48329
Pattern Matching-BM algorithm