The next array of the KMP algorithm, explained in detail

Source: Internet
Author: User

If you repost this article, please credit the source and include the relevant link.

There are already plenty of blog posts that explain the KMP algorithm, so I won't spend time writing yet another one. Instead, I recommend the post I started with:
Http://www.cnblogs.com/yjiyjige/p/3263858.html
Its author explains the KMP algorithm with detailed diagrams, which makes it a very good introduction.
----------------------------------------------------------------------------------------------

The next array is the most important part of KMP and also the hardest to work out. In this article I derive it step by step from my own understanding, so that after reading it you know not only what it is but also why.

If you don't know what KMP is, please read the post linked above first to see what problem it solves.
Now let's look at how KMP's next array is computed.
Put simply: suppose there are two strings, the text to be searched, strtext, and the keyword to look for, strkey. We want to find out whether strtext contains strkey, using i for the index of the current character in strtext and j for the index of the current character in strkey.
In a brute-force search, when strtext[i] and strkey[j] fail to match, both i and j are rolled back: j returns to 0 and i returns to i - j + 1, the character after the one where this attempt started, and matching starts over from there.
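The brute-force rollback described above can be sketched in Java roughly as follows. This is only a minimal illustration: the class name BruteForceSearch and the method name indexOf are made up for this example, while i, j, strtext and strkey follow the text.

```java
class BruteForceSearch {
    // Returns the index of the first occurrence of strkey in strtext, or -1.
    static int indexOf(String strtext, String strkey) {
        int i = 0; // current position in strtext
        int j = 0; // current position in strkey
        while (i < strtext.length() && j < strkey.length()) {
            if (strtext.charAt(i) == strkey.charAt(j)) {
                i++;
                j++;
            } else {
                i = i - j + 1; // roll i back to one past the start of this attempt
                j = 0;         // restart the keyword from its first character
            }
        }
        // j reached the end of strkey only if every character matched
        return j == strkey.length() ? i - j : -1;
    }
}
```

Note how i moves backward on every mismatch; this is exactly the work KMP avoids.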
KMP guarantees that i never moves backward; only j falls back, which is what makes matching more efficient. It decides where j should fall back to by exploiting a property of the part of strkey that matched successfully before the mismatch at j: how long a prefix of that substring is equal to a suffix of it.
So the next array records, for each position j, the length of the longest prefix of strkey[0..j-1] that is also a suffix of it (the whole substring itself does not count), and that length is the position j falls back to on a mismatch.
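To make that description concrete, here is a deliberately slow sketch that computes the array straight from the definition, assuming the convention used later in this article that next[0] is -1. The names NextByDefinition and nextByDefinition are hypothetical; the code is meant for checking small examples, not as the KMP construction itself.

```java
class NextByDefinition {
    // next[j] = length of the longest proper prefix of strkey[0..j-1]
    // that is also a suffix of it; next[0] is -1 by convention.
    static int[] nextByDefinition(String strkey) {
        int n = strkey.length();
        int[] next = new int[n];
        for (int j = 0; j < n; j++) {
            if (j == 0) {
                next[0] = -1;
                continue;
            }
            String matched = strkey.substring(0, j); // the part matched before position j
            int best = 0;
            // Try the longest candidate first; "proper" means shorter than the whole substring.
            for (int len = j - 1; len >= 1; len--) {
                String suffix = matched.substring(j - len);
                if (matched.startsWith(suffix)) {
                    best = len;
                    break;
                }
            }
            next[j] = best;
        }
        return next;
    }
}
```

For the made-up keyword "ABAB" this gives -1, 0, 0, 1: when position 3 mismatches, "ABA" has already matched, and its longest prefix that is also a suffix is "A", so j falls back to 1.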

I know the paragraphs above are hard to follow, so let's look at a colored diagram:

The diagram shows strkey, the keyword to look for. Suppose we have an empty next array; our job is to fill it with values.
To work out the values, we borrow the three steps of mathematical induction (or is it dynamic programming?):
1. The initial state
2. Assume entry j and all entries before it have already been filled in
3. Deduce how to fill in entry j+1

We'll come back to the initial state later. For now, simply assume that entry j and everything before it are done. In other words, reading off the diagram, we know the following:
next[j] == k;
next[k] == the index of the green block;
next[the index of the green block] == the index of the yellow block;
A note: the color blocks in the diagram are all drawn the same size, but ignore the size; each color block is just one element of the array.

Let's look at the next diagram to extract more information:

1. From "next[j] == k" we get substring A1 == substring A2 (by the definition of the next array: the prefix equals the suffix).

2. From "next[k] == the index of the green block" we get substring B1 == substring B2.

3. From "next[the index of the green block] == the index of the yellow block" we get substring C1 == substring C2.

4. From 1 and 2 (A1 == A2, B1 == B2) we get B1 == B2 == B3.

5. From 2 and 3 (B1 == B2, C1 == C2) we get C1 == C2 == C3.

6. From B2 == B3 we get C3 == C4 == C1 == C2.

This is very simple geometric reasoning that becomes clear on a close look. In the diagram I use lines of the same color to mark identical subarrays so they are easy to spot.

Next, starting from these conditions, we deduce what to put in next[j+1], the position to fall back to if position j+1 mismatches.

next[j+1] is the length of the longest prefix of the substring strkey[0..j] that is also a suffix of it.

#: (this "#:" is a marker that will be referred to later) We know A1 == A2. If we append one more character to each of A1 and A2, are they still equal? There are two cases:

(1) If strkey[k] == strkey[j], then clearly next[j+1] is simply k+1.

In code: next[++j] = ++k;

(2) If strkey[k] != strkey[j], then we have to fall back on the next longest known matching pair after A1 and A2, namely B1 and B3.

If we append one more character to each of B1 and B3, are they still equal?

Since next[k] == the index of the green block, we first set k = next[k], which moves k to the green block, and then apply the logic at the "#:" marker again, recursively.

Because the next entries before position j+1 are assumed to be known already, and k gets strictly smaller with each step, the recursion always terminates and yields the value of next[j+1].

The only thing still missing is the initial conditions:

next[0] = -1, k = -1, j = 0

One more special case is k == -1, where the recursion cannot continue. In that case next[j+1] should be 0, meaning j falls back to the start of the keyword.

That is, next[j+1] = 0; which, since k is -1 here, can also be written as next[++j] = ++k;
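Before collapsing everything into a loop, the "#:" logic can be written literally as a recursive function. This is only a sketch to mirror the derivation: the class RecursiveNext and the helper extend are hypothetical names, and the empty-keyword guard is an addition of this example.

```java
class RecursiveNext {
    // The "#:" step: k is a candidate such that the first k characters of strkey
    // equal the k characters ending just before index j. Returns next[j + 1].
    static int extend(char[] strkey, int[] next, int j, int k) {
        if (k == -1 || strkey[j] == strkey[k]) {
            return k + 1;                        // case (1), or the special case k == -1
        }
        return extend(strkey, next, j, next[k]); // case (2): retry with the shorter pair
    }

    static int[] getNext(String ps) {
        char[] strkey = ps.toCharArray();
        int[] next = new int[strkey.length];
        if (strkey.length == 0) {
            return next;                         // guard added for this sketch
        }
        next[0] = -1;                            // initial condition
        for (int j = 0; j < strkey.length - 1; j++) {
            // Entries 0..j are known; start from the longest candidate, next[j].
            next[j + 1] = extend(strkey, next, j, next[j]);
        }
        return next;
    }
}
```

The iterative version avoids the explicit recursion by keeping k in a variable across loop iterations, but the two cases are the same.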

public static int[] getNext(String ps) {
    char[] strkey = ps.toCharArray();
    int[] next = new int[strkey.length];

    // Initial conditions
    int j = 0;
    int k = -1;
    next[0] = -1;

    // From the known first j entries, derive entry j+1
    while (j < strkey.length - 1)
    {
        if (k == -1 || strkey[j] == strkey[k])
        {
            next[++j] = ++k;
        }
        else
        {
            k = next[k];
        }
    }

    return next;
}

With the derivation in mind, this code should now be easy to follow.
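For completeness, here is a sketch of how such a next array might be used during matching, so that i never moves backward. The class name KmpSearch and the method indexOf are assumptions of this example, as is the choice to return 0 for an empty keyword; getNext repeats the logic shown above so the block stands on its own.

```java
class KmpSearch {
    // Same construction as in the article, repeated here to keep the example self-contained.
    static int[] getNext(String ps) {
        char[] strkey = ps.toCharArray();
        int[] next = new int[strkey.length];
        int j = 0;
        int k = -1;
        next[0] = -1;
        while (j < strkey.length - 1) {
            if (k == -1 || strkey[j] == strkey[k]) {
                next[++j] = ++k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    // Returns the index of the first occurrence of strkey in strtext, or -1.
    static int indexOf(String strtext, String strkey) {
        if (strkey.isEmpty()) {
            return 0; // assumption of this sketch: the empty keyword matches at 0
        }
        int[] next = getNext(strkey);
        int i = 0; // position in strtext, never decreases
        int j = 0; // position in strkey, falls back through next[]
        while (i < strtext.length() && j < strkey.length()) {
            if (j == -1 || strtext.charAt(i) == strkey.charAt(j)) {
                i++;
                j++;         // j == -1 becomes 0: restart the keyword at the next text character
            } else {
                j = next[j]; // only j falls back
            }
        }
        return j == strkey.length() ? i - j : -1;
    }
}
```

The j == -1 branch is where next[0] = -1 pays off: it means even the first keyword character failed, so i has to advance.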

Optimization:

Attentive readers may have noticed this statement above:

(1) If strkey[k] == strkey[j], then clearly next[j+1] is simply k+1. In code: next[++j] = ++k;

But next[j+1] is only consulted when position j+1 has mismatched. If the character at the fallback position (the ++k above) is equal to the character at j+1, the position before falling back, the comparison is bound to fail again. So in that case we have to keep falling back.

public static int[] getNext(String ps) {
    char[] strkey = ps.toCharArray();
    int[] next = new int[strkey.length];

    // Initial conditions
    int j = 0;
    int k = -1;
    next[0] = -1;

    // From the known first j entries, derive entry j+1
    while (j < strkey.length - 1)
    {
        if (k == -1 || strkey[j] == strkey[k])
        {
            // If strkey[j + 1] == strkey[k + 1], falling back would mismatch again,
            // so keep falling back
            if (strkey[j + 1] == strkey[k + 1])
            {
                next[++j] = next[++k];
            }
            else
            {
                next[++j] = ++k;
            }
        }
        else
        {
            k = next[k];
        }
    }

    return next;
}
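To see what the optimization changes, the following sketch puts both versions side by side and prints their tables for a made-up keyword, "ABAB". The class name NextComparison and the method names getNextPlain and getNextOptimized are hypothetical.

```java
import java.util.Arrays;

class NextComparison {
    // Unoptimized table, as derived first in the article.
    static int[] getNextPlain(String ps) {
        char[] strkey = ps.toCharArray();
        int[] next = new int[strkey.length];
        int j = 0;
        int k = -1;
        next[0] = -1;
        while (j < strkey.length - 1) {
            if (k == -1 || strkey[j] == strkey[k]) {
                next[++j] = ++k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    // Optimized table: skip fallback positions that hold the same character.
    static int[] getNextOptimized(String ps) {
        char[] strkey = ps.toCharArray();
        int[] next = new int[strkey.length];
        int j = 0;
        int k = -1;
        next[0] = -1;
        while (j < strkey.length - 1) {
            if (k == -1 || strkey[j] == strkey[k]) {
                if (strkey[j + 1] == strkey[k + 1]) {
                    next[++j] = next[++k]; // same character: reuse its fallback
                } else {
                    next[++j] = ++k;
                }
            } else {
                k = next[k];
            }
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(getNextPlain("ABAB")));     // [-1, 0, 0, 1]
        System.out.println(Arrays.toString(getNextOptimized("ABAB"))); // [-1, 0, -1, 0]
    }
}
```

In the plain table, a mismatch at index 2 ('A') falls back to index 0, which is also 'A' and must fail again; the optimized table stores -1 there instead and skips that wasted comparison.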

That covers everything about computing KMP's next array. If you find mistakes in this article, please point them out so I can improve it.

----------------------------------------------------------------------------------------------------------

Finally, about interviews: you are given a string and asked to write out its next array by hand. How do you do it?

①: For each left substring, work out the length of its longest matching prefix and suffix; these values form the initial next array

②: A mismatch at the first position means i has to move forward, so the first entry is set to -1

③: p[3] == A, next[3] == 0, and p[0] == A, so p[3] == p[0] (after falling back the comparison would mismatch again, so we have to keep falling back); optimize next[3] to next[0], i.e. -1

④: Likewise, optimize next[10] to next[0], i.e. -1

⑤: Optimize the entries for p[14], p[15] and p[16] in the same way
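The hand method can also be checked mechanically. The sketch below assumes that the "left substring" for position i means p[0..i-1], and that the optimization in steps ③ to ⑤ is applied from left to right; the class name NextByHand, the method name nextByHand and the sample pattern "ABCABD" are made up for illustration and are not taken from the interview example.

```java
import java.util.Arrays;

class NextByHand {
    static int[] nextByHand(String p) {
        int n = p.length();
        int[] next = new int[n];
        if (n == 0) {
            return next;
        }
        // Step 1: for each left substring p[0..i-1], the length of its longest
        // prefix that is also a suffix (shorter than the substring itself).
        for (int i = 1; i < n; i++) {
            String left = p.substring(0, i);
            for (int len = i - 1; len >= 1; len--) {
                if (left.startsWith(left.substring(i - len))) {
                    next[i] = len;
                    break;
                }
            }
        }
        // Step 2: a mismatch at the first position moves i, so mark it with -1.
        next[0] = -1;
        // Steps 3 to 5: if the fallback position holds the same character,
        // falling back there would mismatch again, so take its fallback instead.
        for (int i = 1; i < n; i++) {
            if (p.charAt(i) == p.charAt(next[i])) {
                next[i] = next[next[i]];
            }
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextByHand("ABCABD"))); // [-1, 0, 0, -1, 0, 2]
    }
}
```

Going left to right matters in the last loop: next[next[i]] refers to an earlier index, which has already been optimized by the time it is read.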
