After reading this, we can see that the KMP algorithm is messy ~~

Source: Internet
Author: User

http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm

http://www.ics.uci.edu/~eppstein/161/960227.html

T is the main string ababaaabaaababaa, whose length the rest of this text takes to be 18 (the string as printed here has only 16 characters); P is the pattern string ababaa, of length 6.

Let's first look at the core code of the ordinary (naive) string matching algorithm:

    for (i = 0; T[i] != '\0'; i++) {
        for (j = 0; T[i + j] != '\0' && P[j] != '\0' && T[i + j] == P[j]; j++)
            ;                   // note the semicolon: the loop body is empty
        if (P[j] == '\0') {
            report(i);          // found a match starting at position i
        }
    }

The outer loop increments the main string index i once per iteration, 18 times in total, all the way to the end of T. Inside each iteration, the pattern index j advances a variable number of times, 6 in the worst case. The worst case of the algorithm is therefore about 18 * 6 = 108 character comparisons.
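As a rough, self-contained sketch of the loop above: the function name naive_search and the out/max_out parameters are made up for this illustration (the original simply calls report(i)), so treat it as one possible packaging rather than the source's code.

```c
#include <assert.h>
#include <stddef.h>

/* Naive search. Stores the start index of each occurrence of p in t
 * into out[] (at most max_out entries) and returns how many were stored.
 * The name and the out/max_out interface are assumptions of this sketch. */
size_t naive_search(const char *t, const char *p, size_t *out, size_t max_out)
{
    size_t count = 0;
    assert(t != NULL && p != NULL && out != NULL);
    for (size_t i = 0; t[i] != '\0'; i++) {
        size_t j = 0;
        /* advance j while both strings continue and the characters agree */
        while (t[i + j] != '\0' && p[j] != '\0' && t[i + j] == p[j])
            j++;
        if (p[j] == '\0' && count < max_out)
            out[count++] = i;   /* the whole pattern matched at position i */
    }
    return count;
}
```

Called with the strings from this text as printed (ababaaabaaababaa and ababaa), it should store the positions 0 and 10.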

KMP algorithm:

First, let's look at several concepts:

Prefix:

Let x = x_0 ... x_{k-1} be a string of length k. A prefix of x is a substring u with u = x_0 ... x_{b-1}, where b ∈ {0, ..., k}, i.e. x starts with u.

Suffix:

A suffix of x is a substring u with u = x_{k-b} ... x_{k-1}, where b ∈ {0, ..., k}, i.e. x ends with u.

Proper prefix and proper suffix:

A prefix u of x or a suffix u of x is called a proper prefix or suffix, respectively, if u ≠ x, i.e. if its length b is less than k.

Border (boundary string):

A border of x is a substring r with r = x_0 ... x_{b-1} and r = x_{k-b} ... x_{k-1}, where b ∈ {0, ..., k-1}.

Note that a string counts as a border of x only if it is both a proper prefix and a proper suffix of x! Put simply, a border is a string that appears both at the very beginning and at the very end of x.


In the original figure (not reproduced here), both r and s are borders of x!
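To make the definition concrete, here is a small brute-force sketch. The helper names has_border and widest_border are invented for this illustration and do not appear in the original. It simply compares the first b characters of x with its last b characters.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if x has a border of width b, i.e. if b < strlen(x) and the
 * first b characters of x equal its last b characters.
 * (Hypothetical helper, not part of the original code.) */
int has_border(const char *x, size_t b)
{
    size_t k;
    assert(x != NULL);
    k = strlen(x);
    if (b >= k)
        return 0;                   /* a border must be proper */
    return memcmp(x, x + (k - b), b) == 0;
}

/* Width of the widest border of x, or -1 if x is empty (no border at all). */
int widest_border(const char *x)
{
    size_t k = strlen(x);
    for (size_t b = k; b-- > 0; )   /* tries widths k-1, k-2, ..., 0 */
        if (has_border(x, b))
            return (int)b;
    return -1;
}
```

For x = abacab the borders are the empty string and ab, so has_border reports widths 0 and 2 and widest_border gives 2. For the pattern ababaa used in this text the widest border is a, of width 1.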

Extending a border:

Let x be a string and a a symbol. A border r of x can be extended by a, if ra is a border of xa.

In the original figure (not reproduced here), the border r of x is extended by a to the border ra of xa.
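The extension test comes down to a single character comparison, which the following sketch tries to show. The function name border_extends is made up here, and the function assumes the caller already knows that x has a border of width b.

```c
#include <assert.h>
#include <string.h>

/* Assumes x has a border r of width b (so b < strlen(x)).
 * Returns 1 if r can be extended by a, i.e. if ra is a border of xa.
 * (Hypothetical helper for illustration only.) */
int border_extends(const char *x, size_t b, char a)
{
    assert(x != NULL && b < strlen(x));
    /* ra is a prefix of xa exactly when x[b] == a. It is then also a
     * suffix of xa, because r is already a suffix of x. */
    return x[b] == a;
}
```

For example, x = abab has the border ab (width 2). Appending a gives ababa with border aba, so the border is extended. Appending b gives ababb, and abb is not a prefix of it, so the border cannot be extended by b.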

In the code that follows, the array b[] holds the border values computed for the pattern string: b[i] is the length of the widest border of the prefix of length i of the pattern, and b[0] is set to -1.

Well, let's start from the beginning. One day, I don't know exactly when, Grandpa Knuth (高德纳) complained that the ordinary string matching algorithm was far too inefficient and wanted to improve it. How can it be improved?

In essence, the idea is to reduce the total number of iterations of the inner and outer loops of the ordinary string matching algorithm, and this is where the border comes in!

In the following code, the outer loop index i is still incremented 18 times.

What changes is the number of inner loop iterations. On a mismatch, the pattern index j falls back to a border value instead of the comparison restarting from the beginning of the pattern. This border value is what is usually called the next value in write-ups on the Internet!

Then, how can we find this border value? Skip to the next section.

    void kmpSearch()
    {
        int i = 0, j = 0;           // i indexes the main string, j indexes the pattern
        for (; i < n;)              // n is the length of the main string (18 in this text); note how similar this is to the ordinary algorithm :-)
        {
            for (; j >= 0 && T[i] != P[j];)
                j = b[j];           // this is what improves the efficiency
            i++; j++;
            if (j == m)             // m is the length of the pattern (6 in this text)
            {
                report(i - j);      // found; i - j is the start position of the pattern in the main string
                j = b[j];
            }
        }
    }
 

The core of KMP: the preprocessing algorithm

Once you have read the following passage, the border values should make sense.

The preprocessing algorithm comprises a loop with a variable j assuming these values (the border widths b[i], b[b[i]], and so on, in decreasing order). A border of width j can be extended by p_i, if p_j = p_i. If not, the next-widest border is examined by setting j = b[j]. The loop terminates at the latest if no border can be extended (j = -1).

    void kmpPreprocess()            // compute the border value for each prefix of the pattern
    {
        int i = 0, j = -1;          // i indexes the pattern; j is the border value of the prefix of length i
        b[i] = j;                   // b[] is a global array
        while (i < m)
        {
            while (j >= 0 && P[i] != P[j])
                j = b[j];           // the core step: fall back to the next-widest border; working through an example on paper alongside the explanation above helps
            i++; j++;
            b[i] = j;
        }
    }
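Putting the two routines together, a self-contained version might look like the sketch below. The snake_case names, the MAX_M bound, the parameters, and the out/max_out result buffer are all assumptions made so that the example can run on its own; the original uses global T, P, m and a report() function instead.

```c
#include <assert.h>
#include <string.h>

#define MAX_M 64                    /* assumed upper bound on the pattern length */

static int b[MAX_M + 1];            /* b[i]: widest border width of the prefix of length i */

/* Fill b[0..m] for the pattern p of length m. */
void kmp_preprocess(const char *p, int m)
{
    int i = 0, j = -1;
    assert(m >= 0 && m <= MAX_M);
    b[i] = j;
    while (i < m) {
        while (j >= 0 && p[i] != p[j])
            j = b[j];               /* fall back to the next-widest border */
        i++; j++;
        b[i] = j;
    }
}

/* Store the start positions of p in t into out[] (at most max_out of them)
 * and return how many were stored. Requires a non-empty pattern. */
int kmp_search(const char *t, const char *p, int *out, int max_out)
{
    int n = (int)strlen(t), m = (int)strlen(p);
    int i = 0, j = 0, count = 0;
    assert(m >= 1);
    kmp_preprocess(p, m);
    while (i < n) {
        while (j >= 0 && t[i] != p[j])
            j = b[j];               /* shift the pattern using the border table */
        i++; j++;
        if (j == m) {
            if (count < max_out)
                out[count++] = i - j;   /* start of the match in t */
            j = b[j];               /* keep going to find further matches */
        }
    }
    return count;
}
```

For the pattern ababaa the table comes out as b = -1, 0, 0, 1, 2, 3, 1, and searching the main string as printed in this text (ababaaabaaababaa) should store the positions 0 and 10.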

 
 
 
