[Algorithm] String matching: the Z algorithm

Source: Internet
Author: User

For a long time, whenever I had to match a single pattern string against a text, I only used KMP. Later I saw the Z algorithm on CF, where many people use it. After studying it, I found the Z algorithm quite elegant as well. An earlier blog post also solved a string matching problem with the Z algorithm.

The Z algorithm is described below.

Let's start with what the Z algorithm computes.

Given an input string s, the Z algorithm computes, for every suffix of s, the length of the longest common prefix (LCP) of that suffix and s itself. This length for the suffix starting at position i is written z[i].

Next, the details of the Z algorithm.

Let n denote the length of the string s.
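To make the definition concrete, here is a minimal brute-force sketch that computes the same array directly from the definition. The function name naive_z and the use of std::string and std::vector are choices made for this illustration, not part of the original post; it runs in O(n^2) in the worst case and is only meant as a reference to check a faster implementation against.

```cpp
#include <string>
#include <vector>

// Brute-force reference (hypothetical helper, not from the original post):
// z[i] = length of the longest common prefix of s and the suffix s[i..n-1].
std::vector<int> naive_z(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    for (int i = 0; i < n; ++i) {
        // Extend the match one character at a time until a mismatch or the end.
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) ++z[i];
    }
    return z;
}
```

For example, for s = "aabxaab" this gives z = 7, 1, 0, 0, 3, 1, 0: the suffix starting at position 4 is "aab", which shares its first 3 characters with s.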

The Z algorithm maintains a pair of values, left and right, abbreviated L and R. They satisfy the condition that s[L,R] is a prefix of s, that is, s[L,R] equals s[0,R-L], with R as far to the right as any such substring found so far reaches. When i is 1, comparing s[0,n-1] with s[1,n-1] by brute force gives the initial L and R, and also z[1], the LCP of suffix(1) and s itself.

Assume we have processed positions up to i-1, so we have the current L and R as well as the values z[1] to z[i-1]. We now need to compute z[i] together with the new L and R.

1. Suppose i>R. This means that no substring found so far that is also a prefix of s ends at position i or later; otherwise R would not be less than i. In this case L and R must be recomputed: set L=R=i, compare s with suffix(i) by brute force while extending R, and obtain z[i]=R-i+1=R-L+1.

2. Otherwise i<=R. Let k=i-L. Then we can assert z[i]>=min(z[k],R-i+1). By the meaning of L and R, s[L,R] is a prefix of s, and i sits at offset k from L, so s[i,R] equals s[k,R-L] and everything already known about position k carries over to position i up to R.

If z[k]<R-i+1, then z[i] must equal z[k]: s[k,k+z[k]-1] is a prefix of s, the same characters appear at s[i,i+z[k]-1] inside s[i,R], and the mismatch that ends the match at k is also copied inside the window. In this case L and R do not change.

If z[k]>=R-i+1, then by the meaning of R we have s[R+1]!=s[R-L+1], so any part of z[k] beyond R-i+1 characters says nothing about what follows R and cannot be used. From z[k] we can only assert that z[i] is at least R-i+1. So set L=i, continue comparing from position R+1 to find the new R, and obtain z[i]=R-L+1.

In an actual implementation, the two sub-cases of the second case can be handled uniformly.

Here is a C++ implementation:
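One common way to merge the cases is sketched below. It is an illustration rather than the author's code: the name z_function is assumed, and it keeps a half-open window [l, r) instead of the inclusive L and R used in the text, so r here corresponds to R+1. The value z[i] is first seeded with min(z[i-l], r-i) when i lies inside the window, then extended by direct comparison; when z[k] is strictly smaller than the remaining window, the extension loop stops immediately.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch of the Z algorithm with the cases merged (assumed name z_function).
// Invariant: s[l..r) equals s[0..r-l), and r is the largest right end seen so far.
std::vector<int> z_function(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> z(n, 0);
    if (n == 0) return z;
    z[0] = n;  // the whole string matches itself
    int l = 0, r = 0;
    for (int i = 1; i < n; ++i) {
        // Inside the window: reuse the value at the mirrored position k = i - l,
        // capped by the number of characters left in the window.
        if (i < r) z[i] = std::min(z[i - l], r - i);
        // Extend by direct comparison (a no-op when z[i - l] < r - i).
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) ++z[i];
        // The match reaches further right than before: move the window.
        if (i + z[i] > r) {
            l = i;
            r = i + z[i];
        }
    }
    return z;
}
```

Because r never decreases and every successful comparison in the loop advances r, the total work is linear in n.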

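    #include <cstring>

    // z is assumed to be a global int array with room for at least n entries.
    void Z(char *s, int n = 0) {
        n = (n == 0) ? strlen(s) : n;
        z[0] = n;
        int l = 0, r = 0;               // s[l..r] is a prefix of s (r inclusive)
        for (int i = 1; i < n; i++) {
            if (i > r) {                // case 1: i is outside the window
                l = i, r = i;
                while (r < n && s[r - i] == s[r]) r++;
                z[i] = r - l;
                r--;                    // back to the last matched position
            }
            else {                      // case 2: i is inside the window
                int k = i - l;
                if (z[k] < r - i + 1) z[i] = z[k];
                else {
                    l = i;              // keep matching beyond the old r
                    while (r < n && s[r - i] == s[r]) r++;
                    z[i] = r - l;
                    r--;
                }
            }
        }
    }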

Solving single-pattern string matching with the Z algorithm is very simple. Let S be the text string and T the pattern string. Construct the new string P=T+'#'+S, where '#' stands for a character that occurs in neither string, and compute its Z-array. Then scan forward from the position where S begins in P: if z[i]=length(T), there is a match starting at that position. Of course, you can also leave out the '#'; the test then has to use >= instead of =.
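The matching procedure can be sketched as follows. The function name find_occurrences and its sep parameter are made up for this example, and the sketch assumes the separator character occurs in neither the text nor the pattern. It computes the Z-array of pattern + sep + text inline and reports each text position where the Z value equals the pattern length.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns all 0-based positions in text where pattern occurs.
// Assumes sep appears in neither text nor pattern.
std::vector<int> find_occurrences(const std::string& text,
                                  const std::string& pattern,
                                  char sep = '#') {
    std::vector<int> result;
    if (pattern.empty()) return result;
    const std::string p = pattern + sep + text;
    const int n = static_cast<int>(p.size());
    const int m = static_cast<int>(pattern.size());
    std::vector<int> z(n, 0);  // z[0] is never read here, so it is left at 0
    int l = 0, r = 0;          // half-open window: p[l..r) equals p[0..r-l)
    for (int i = 1; i < n; ++i) {
        if (i < r) z[i] = std::min(z[i - l], r - i);
        while (i + z[i] < n && p[z[i]] == p[i + z[i]]) ++z[i];
        if (i + z[i] > r) {
            l = i;
            r = i + z[i];
        }
    }
    // Positions m+1 .. n-1 of p correspond to positions 0 .. of the text.
    for (int i = m + 1; i < n; ++i) {
        if (z[i] == m) result.push_back(i - m - 1);
    }
    return result;
}
```

For text "abababa" and pattern "aba", this reports the overlapping matches at positions 0, 2 and 4. The separator guarantees that no Z value can exceed the pattern length, which is why an equality test suffices.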
