Algorithm learning: string matching algorithms (BF, KMP, BM)

Last Update:2014-03-13 Source: Internet

Author: User


The content of this article comes from a book and is not original, apart from some of my own understanding and descriptions. It is intended only as notes.

First, the definition of the problem. String matching here means matching a contiguous substring, not a common subsequence. For example, asdfgge matches dfg because asdfgge contains dfg as a contiguous run of characters. The problem is to determine whether string A (the target string) contains string B (the pattern string) as a substring.

I. BF algorithm (brute force solution)

Simply compare the pattern string with the target string character by character. On a mismatch, go back to the first character of the pattern string, shift the pattern one position to the right along the target string, and compare the entire pattern string again.
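As a minimal sketch of the brute-force idea, assuming we want the index of the first match (or -1 if there is none); the function name `bf_search` is only an illustrative choice:

```python
def bf_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every alignment of the pattern against the text, left to right.
    for start in range(n - m + 1):
        j = 0
        # Compare character by character from the start of the pattern.
        while j < m and text[start + j] == pattern[j]:
            j += 1
        if j == m:
            # Every pattern character matched at this alignment.
            return start
        # On a mismatch, fall through: the next alignment starts over at j = 0.
    return -1


print(bf_search("asdfgge", "dfg"))  # 2
```

In the worst case every alignment rescans almost the whole pattern, so the running time is proportional to len(text) * len(pattern).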

II. KMP algorithm

Compared with BF, KMP is optimized by defining a failure function, so that the comparisons already made can be reused instead of repeated. Here, the failure function determines how far the pattern string shifts when a character fails to match.

The theoretical basis of the KMP algorithm is that the structure of the pattern string is known before any comparison starts, so the failure function can be computed in advance from the pattern string alone.
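A minimal sketch of this idea follows. It assumes the common textbook formulation, which may differ in detail from the book's: `fail[i]` stores the length of the longest proper prefix of `pattern[:i + 1]` that is also its suffix, and after a mismatch with `j` characters matched, the pattern effectively shifts by `j - fail[j - 1]`. The names `build_failure` and `kmp_search` are illustrative.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it. Computed from the pattern alone."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter prefix
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = build_failure(pattern)
    j = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, reuse what is already known instead of rescanning text.
        while j > 0 and ch != pattern[j]:
            j = fail[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            return i - j + 1
    return -1


print(build_failure("abcabd"))              # [0, 0, 0, 1, 2, 0]
print(kmp_search("abcabcabd", "abcabd"))    # 3
```

The text index `i` never moves backwards, which is where the saving over BF comes from.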

III. BM algorithm

In practice, the BM algorithm is often more efficient than KMP. As in the usual matching algorithms, the pattern string still moves from left to right, but within each alignment the characters are compared from right to left. The theoretical basis of the algorithm is as follows:

A. Suppose a suffix u of the pattern string has already matched, and the mismatch occurs between the pattern character a (immediately to the left of u) and the target character b. If the rest of the pattern string contains another occurrence of u whose left neighbor is not a, shift the pattern string so that this occurrence is aligned with the u in the target string, and save the resulting offset.

B. If the rest of the pattern string contains no occurrence of u whose left neighbor is not a, align the longest prefix v of the pattern string that is also a suffix of u with the end of u in the target string, and save the resulting offset.

Exactly one of cases A and B applies at a time.

C. Unlike A and B, which work from the matched substring u, this case works from the mismatched target character b. Check whether the pattern string contains b. If it does, align that b (its rightmost occurrence) with the b in the target string, and save the resulting offset.

D. If the pattern string does not contain b at all, no alignment that covers this b can match, so move the pattern string directly to the position just after b, and save the resulting offset.

Similarly, exactly one of cases C and D applies at a time.

Finally, take max(A or B, C or D) as the final offset. Shift the pattern string by that amount and start a new round of comparison. The following are some of my questions.

1. Why take the maximum offset? Each rule guarantees that no match can start inside the range it skips. Every alignment skipped by the larger offset has therefore been ruled out by at least one rule, so taking the maximum never misses a match.

2. Do these four cases cover every situation? Honestly, I don't know either. According to the book, the proof turns out to be quite troublesome. For my purposes, understanding the idea behind the approach is enough.
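The four cases and the final max step can be sketched as below. This is one common way to implement them, not necessarily the book's: cases A and B are precomputed together into a `shift` table, and cases C and D come from a table of rightmost character positions. The names `bad_char_table`, `good_suffix_table` and `bm_search` are illustrative.

```python
def bad_char_table(pattern):
    """Cases C and D: rightmost index of each character in the pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def good_suffix_table(pattern):
    """Cases A and B: shift[j] is the offset to use when the suffix
    pattern[j:] has matched and the mismatch is at pattern[j - 1]."""
    m = len(pattern)
    shift = [0] * (m + 1)
    border = [0] * (m + 1)  # border[i]: start of the widest border of pattern[i:]
    # Case A: another occurrence of u preceded by a different character.
    i, j = m, m + 1
    border[i] = j
    while i > 0:
        while j <= m and pattern[i - 1] != pattern[j - 1]:
            if shift[j] == 0:
                shift[j] = j - i
            j = border[j]
        i -= 1
        j -= 1
        border[i] = j
    # Case B: fall back to the longest pattern prefix that is a suffix of u.
    j = border[0]
    for i in range(m + 1):
        if shift[i] == 0:
            shift[i] = j
        if i == j:
            j = border[j]
    return shift


def bm_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    last = bad_char_table(pattern)
    shift = good_suffix_table(pattern)
    s = 0  # current alignment: pattern[0] sits under text[s]
    while s <= n - m:
        j = m - 1
        # The pattern moves left to right, but comparison runs right to left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            return s
        # C or D: if the character is absent, last.get gives -1, so the
        # pattern jumps to just past the mismatched text character.
        bad = j - last.get(text[s + j], -1)
        good = shift[j + 1]  # A or B
        s += max(good, bad)  # good is always >= 1, so the search advances
    return -1


print(bm_search("asdfgge", "dfg"))  # 2
```

Note that the bad-character offset alone can be zero or negative (when the rightmost occurrence lies to the right of the mismatch), which is another practical reason for taking the maximum of the two offsets.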

O la ~~~

When reprinting, please credit the source: http://blog.csdn.net/u011638883/article/details/20650119

Thank you !!
