String matching: the brute-force algorithm and the KMP algorithm

Source: Internet
Author: User

Disclaimer: this article draws on Nanyi's blog post about the KMP algorithm for string matching. All of the pictures referenced here come from that post.

I ran into a string matching problem in a recent written exam, drew a blank, and could not write the algorithm. So let's analyze it properly now.

This article covers the principles of the brute-force algorithm and the KMP algorithm, the differences between their code implementations, and the general idea behind a good algorithm.

===========================================================================

Principles:

Brute-force algorithm:

1.


We call the long string the text string, named strText, and the short string the target string, named strTarget.

The first character 'B' of the text string "BBC ABCDAB ABCDABCDABDE" is compared with the first character 'A' of the target string "ABCDABD".

The comparison does not produce a match. Throughout the process we assume that the red dashed line, which marks the comparison position, is fixed. Therefore, the text string moves one character to the left.

2.


The character 'B' again does not match 'A', so the text string keeps moving left.

3.


This continues until the first match appears. The program then moves both the text string and the target string to the left, and records the position in the text string of the element being compared at this moment (that is, the character 'A').

4.

The comparison continues. The characters match again, so both strings keep moving.

5.


At this point a mismatch occurs and the comparison is reset: the text string resumes at the character 'B' that follows the recorded position of 'A', the target string index starts again from 0, and the comparison continues.

6.


That is the general principle of the brute-force algorithm.

KMP algorithm:

7.


A basic fact is that when the space does not match 'D', you already know that the preceding six characters are "ABCDAB". The idea of the KMP algorithm is to make use of this known information: instead of moving the search position back to characters that have already been compared, keep moving it forward, which improves efficiency.

8.


How is that done? A partial match table can be computed for the target string. For "ABCDABD" the partial match values are A: 0, B: 0, C: 0, D: 0, A: 1, B: 2, D: 0. How this table is produced is explained later; for now we simply use it.

9.


When the space fails to match 'D', the first six characters "ABCDAB" have already been matched. The table shows that the partial match value for the last matched character 'B' is 2, so the number of positions to move forward follows from this formula:

Number of positions to move = number of matched characters - corresponding partial match value

6 - 2 = 4, so move forward 4 characters.
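The formula can be written as a small C helper. This is only a sketch: the name kmp_shift and the hard-coded table for "ABCDABD" are illustrative assumptions, not part of the original article. When nothing has matched yet, the table does not apply and the helper simply moves one character.

```c
#include <assert.h>

/* Partial match table for "ABCDABD", one value per character. */
static const int ABCDABD_TABLE[7] = {0, 0, 0, 0, 1, 2, 0};

/* Hypothetical helper: how many positions the target string moves forward
   after a mismatch, given how many characters had already matched. */
int kmp_shift(const int table[], int matched)
{
    assert(table != 0 && matched >= 0);
    if (matched == 0)
        return 1;                          /* nothing matched: move one character */
    return matched - table[matched - 1];   /* matched - partial match value */
}
```

With this table, kmp_shift(ABCDABD_TABLE, 6) gives 4 and kmp_shift(ABCDABD_TABLE, 2) gives 2, the same moves as in the walkthrough.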

10.


After this move, the space does not match the character 'C'. Two characters ("AB") have been matched and the partial match value of the last one is 0, so look up the partial match table again: 2 - 0 = 2, so move forward 2 characters.

11.


After this move, the space does not match the character 'A', the first character of the target string. No characters have been matched, so there is nothing to look up in the partial match table and the target string simply moves forward 1 character.

12.


Comparing one character at a time up to 'D', we find that the character 'C' of the text string does not match the character 'D' of the target string. Six characters have been matched, so look up the table: 6 - 2 = 4, so move forward 4 characters.

13.


After this move, comparing one character at a time shows that every character matches. The target string has been found!

14.

About the partial match table


Here is how the partial match table is produced.
First, you need to understand two concepts: prefix and suffix.

"Prefix" means all the leading substrings of a string that leave out its last character;

"Suffix" means all the trailing substrings of a string that leave out its first character, as shown in the picture. For example, the prefixes of "ABCD" are "A", "AB", "ABC", and its suffixes are "BCD", "CD", "D".

15.


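The partial match value is the length of the longest element shared by the prefixes and the suffixes. Take "ABCDABD" as an example:

"A": no prefixes or suffixes, so the value is 0
"AB": prefix "A", suffix "B", nothing in common, value 0
"ABC": nothing in common, value 0
"ABCD": nothing in common, value 0
"ABCDA": common element "A", value 1
"ABCDAB": common element "AB", value 2
"ABCDABD": nothing in common, value 0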
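As a minimal sketch, the definition can be applied directly in C: for each leading part of the target string, try the longest candidate length first and check whether the prefix of that length equals the suffix of that length. The names partial_match_value and naive_table are illustrative assumptions, and this direct approach is meant to mirror the definition, not to be fast.

```c
#include <assert.h>
#include <string.h>

/* Partial match value of the first len characters of s, straight from the
   definition: the largest k < len for which the prefix of length k equals
   the suffix of length k. Returns 0 when there is no such k. */
int partial_match_value(const char *s, int len)
{
    assert(s != NULL && len >= 0);
    for (int k = len - 1; k > 0; k--) {
        if (memcmp(s, s + len - k, (size_t)k) == 0)
            return k;
    }
    return 0;
}

/* Fill table[i] with the partial match value of s[0..i].
   The caller provides an array with at least strlen(s) entries. */
void naive_table(const char *s, int table[])
{
    assert(s != NULL && table != NULL);
    int n = (int)strlen(s);
    for (int i = 0; i < n; i++)
        table[i] = partial_match_value(s, i + 1);
}
```

Calling naive_table("ABCDABD", table) with a seven-element array fills it with 0, 0, 0, 0, 1, 2, 0.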


16.


The essence of a "partial match" is that the head and the tail of a string sometimes repeat. For example, "ABCDAB" contains "AB" twice, so its partial match value is 2 (the length of "AB").

When the target string moves, shifting the first "AB" forward by 4 positions (string length - partial match value) brings it to the position of the second "AB".
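The table does not have to be computed from scratch for every position. A common approach, sketched here under the assumption that the caller supplies the output array, reuses the values already computed: when the next character fails to extend the current repeated head, fall back to the next shorter repeated head recorded in the table. The name build_table is illustrative and not from the original article.

```c
#include <assert.h>
#include <string.h>

/* Build the partial match table for pattern in a single pass.
   table must have room for at least strlen(pattern) entries. */
void build_table(const char *pattern, int table[])
{
    assert(pattern != NULL && table != NULL);
    int n = (int)strlen(pattern);
    int k = 0;   /* length of the longest head that is also a tail so far */

    if (n > 0)
        table[0] = 0;   /* a single character has no prefix or suffix */

    for (int i = 1; i < n; i++) {
        /* pattern[i] cannot extend the current head: try a shorter one */
        while (k > 0 && pattern[i] != pattern[k])
            k = table[k - 1];
        if (pattern[i] == pattern[k])
            k++;
        table[i] = k;
    }
}
```

For "ABCDABD" this produces the same values as the definition: 0, 0, 0, 0, 1, 2, 0.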

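That is enough analysis. As the saying goes, "Talk is cheap, show me the code." Let's write some code!

Code implementation:

Brute-force algorithm:

#include <stdio.h>
#include <string.h>

/* Brute-force search: prints whether strSearch occurs in strText. */
void sViolence(const char strText[], const char strSearch[])
{
    int lengthOfStrText, lengthOfStrSearch;
    int i, j, ii;

    lengthOfStrText = (int)strlen(strText);
    lengthOfStrSearch = (int)strlen(strSearch);

    /* i:  current position in the text string
       j:  current position in the target string
       ii: recorded position in the text where the current attempt started
       The bound on i stops the scan at the end of the text; without it the
       loop would read past the text when the target is absent. */
    for (i = 0, j = 0, ii = 0; i < lengthOfStrText && j < lengthOfStrSearch;)
    {
        if (strText[i] == strSearch[j])
        {
            j++;
            i++;
        }
        else
        {
            i = ii;     /* go back to the recorded position */
            i++;        /* and start from the next character */
            ii = i;     /* record the new starting position */
            j = 0;      /* restart the target string from its first character */
        }
    }

    if (j == lengthOfStrSearch)
        printf("existence!");
    else
        printf("No existence!");
}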

The variable ii in the program is the recorded position. Once you understand how the algorithm runs, it is clear that it does a lot of repeated comparison work.

Time complexity analysis: at first glance this program looks very fast, since it has only one for loop. In fact it is slow. This for loop is unusual: when it ends depends not only on the loop condition but also on how the loop body updates i, j, and ii. Let lengthOfStrText = m and lengthOfStrSearch = n. Every mismatch sends i back to just after the recorded position, so each of the m starting positions can cost up to n comparisons. The time complexity of this algorithm is therefore about T(n) = O(m * n), a level we would rather avoid.
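To see where the m * n comes from, the comparisons can simply be counted. The sketch below is a hypothetical counting variant of the brute-force search (the name brute_force_comparisons is an assumption, and unlike the listing above it stops once the target no longer fits in the remaining text). For a text of m 'A' characters and a target of n - 1 'A' characters followed by 'B', every starting position costs n comparisons, for (m - n + 1) * n in total.

```c
#include <assert.h>
#include <string.h>

/* Count the character comparisons a brute-force search performs
   before it finds the target or runs out of starting positions. */
long brute_force_comparisons(const char *text, const char *target)
{
    assert(text != NULL && target != NULL);
    int m = (int)strlen(text);
    int n = (int)strlen(target);
    long count = 0;

    for (int start = 0; start + n <= m; start++) {
        int j = 0;
        while (j < n) {
            count++;                          /* one character comparison */
            if (text[start + j] != target[j])
                break;                        /* mismatch: try the next start */
            j++;
        }
        if (j == n)
            break;                            /* full match found */
    }
    return count;
}
```

For example, brute_force_comparisons("AAAAAAAA", "AAAB") returns 20, which is (8 - 4 + 1) * 4.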

KMP algorithm:
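A minimal sketch of a KMP search in C, built on the partial match table described earlier. The function name kmp_search, the choice to return an index, and the handling of an empty target are assumptions made for illustration, not the original author's listing. The position in the text never moves back; on a mismatch only the position in the target string falls back, which is equivalent to moving the target forward by (matched characters - partial match value).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the index of the first occurrence of target in text, or -1. */
int kmp_search(const char *text, const char *target)
{
    assert(text != NULL && target != NULL);
    int m = (int)strlen(text);
    int n = (int)strlen(target);
    if (n == 0)
        return 0;   /* assumption: the empty target matches at position 0 */

    int *table = malloc((size_t)n * sizeof *table);
    if (table == NULL)
        return -1;  /* allocation failure is treated as "not found" in this sketch */

    /* Build the partial match table for the target string. */
    table[0] = 0;
    for (int i = 1, k = 0; i < n; i++) {
        while (k > 0 && target[i] != target[k])
            k = table[k - 1];
        if (target[i] == target[k])
            k++;
        table[i] = k;
    }

    /* Scan the text once; j is the number of characters matched so far. */
    int result = -1;
    for (int i = 0, j = 0; i < m; i++) {
        while (j > 0 && text[i] != target[j])
            j = table[j - 1];   /* keep the part that is known to match */
        if (text[i] == target[j])
            j++;
        if (j == n) {
            result = i - n + 1; /* start index of the match */
            break;
        }
    }

    free(table);
    return result;
}
```

On the example from the walkthrough, kmp_search("BBC ABCDAB ABCDABCDABDE", "ABCDABD") returns 15. Building the table takes O(n) and the scan takes O(m), so the whole search is O(m + n) instead of O(m * n).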
