Learning notes: suffix arrays

Source: Internet
Author: User

Study material: the IOI2009 National Training Team paper "Suffix Array"

The paper is fairly clear, but its code comes with no explanation, so I also found a commented version of the code online that explains it very well.

Address: http://www.cnblogs.com/Lyush/p/3233573.html

Here is the code template:

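The doubling algorithm sorts the suffixes by their first 1, 2, 4, ... characters. Each round reuses the ranks from the previous round as two sort keys and orders them with a radix sort, so the whole construction runs in O(n log n).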
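Before reading the optimized template, it may help to see the doubling idea on its own. The sketch below is not the paper's code: it swaps the radix sort for `std::sort`, which is simpler but costs O(n log^2 n), and it works on a plain `std::string` without an appended 0. The function name `suffix_array_doubling` and the variable names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Illustrative helper (hypothetical name, not from the paper):
// prefix doubling with std::sort instead of radix sort.
std::vector<int> suffix_array_doubling(const std::string &s) {
    int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rk(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);
    // Round 0: the rank of a suffix is just its first character.
    for (int i = 0; i < n; i++) rk[i] = static_cast<unsigned char>(s[i]);

    for (int len = 1;; len *= 2) {
        // Second key of suffix i: rank of suffix i+len, or -1 if that
        // suffix is empty (an empty half sorts before everything).
        auto key2 = [&](int i) { return i + len < n ? rk[i + len] : -1; };
        auto by_keys = [&](int a, int b) {
            if (rk[a] != rk[b]) return rk[a] < rk[b];
            return key2(a) < key2(b);
        };
        std::sort(sa.begin(), sa.end(), by_keys);

        // Re-rank: suffixes with equal key pairs share a rank.
        if (n > 0) tmp[sa[0]] = 0;
        for (int i = 1; i < n; i++)
            tmp[sa[i]] = tmp[sa[i - 1]] + (by_keys(sa[i - 1], sa[i]) ? 1 : 0);
        rk = tmp;

        // Stop once every suffix has a distinct rank.
        if (n == 0 || rk[sa[n - 1]] == n - 1) break;
    }
    return sa;
}
```

For example, `suffix_array_doubling("banana")` orders the suffixes as a, ana, anana, banana, na, nana. The template replaces each `std::sort` call with two passes of counting sort, which is where the O(n log n) bound comes from.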

#include <utility>  // std::swap

const int MAXN = 10010;
int wa[MAXN], wb[MAXN], wv[MAXN], ws1[MAXN];

// Two suffixes tie in this round only if both of their keys are equal.
int cmp(int *r, int a, int b, int l) {
    return r[a] == r[b] && r[a + l] == r[b + l];
}

// r[0..n-1] is the string with a 0 appended as r[n-1]; every value in r
// must be smaller than m, and m must not exceed MAXN.
void da(int *r, int *sa, int n, int m) {
    int i, j, p, *x = wa, *y = wb;

    // The next four lines are a radix (counting) sort on the first character.
    for (i = 0; i < m; i++) ws1[i] = 0;            // clear the character counters
    for (i = 0; i < n; i++) ws1[x[i] = r[i]]++;    // count each character
    for (i = 1; i < m; i++) ws1[i] += ws1[i - 1];  // prefix sums: smaller characters push larger ones back
    // ws1[x[i]] is the number of characters in 0..x[i]; subtracting one
    // leaves the suffix's own rank, counted from 0. sa[k] = i means suffix i
    // is ranked k. Equal characters are ordered by position for now; the
    // second key settles them in later rounds.
    for (i = n - 1; i >= 0; i--) sa[--ws1[x[i]]] = i;

    // Doubling: a string of length 2j is split into two halves of length j
    // whose ranks are already known, so each round sorts by two keys.
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        // Suffixes starting at n-j or later have an empty second half, so
        // their second key is minimal and they go first.
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        // Suffix sa[i] is the second key of suffix sa[i]-j; suffixes with
        // sa[i] < j are nobody's second key.
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        for (i = 0; i < n; i++) wv[i] = x[y[i]];   // first key of each position in y
        // Radix sort by the first key, keeping the second-key order among ties.
        for (i = 0; i < m; i++) ws1[i] = 0;
        for (i = 0; i < n; i++) ws1[wv[i]]++;
        for (i = 1; i < m; i++) ws1[i] += ws1[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws1[wv[i]]] = y[i];
        // Re-rank from the sorted sa. Setting m = p in the loop header shrinks
        // the character set to the number of distinct ranks (a constant-factor
        // optimization).
        for (std::swap(x, y), p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
    return;
}

int rank[MAXN], height[MAXN];

// n here is the original length of the string, not counting the appended 0.
void calheight(int *r, int *sa, int n) {
    int i, j, k = 0;
    // Invert sa to get rank; sa[0] is always the appended 0, so start at 1.
    for (i = 1; i <= n; i++) rank[sa[i]] = i;
    // The suffix starting at i inherits at least k-1 matching characters
    // from the suffix starting at i-1.
    for (i = 0; i < n; height[rank[i++]] = k)
        // Brute-force matching, yet the total work of this function is still O(n).
        for (k ? k-- : 0, j = sa[rank[i] - 1]; r[i + k] == r[j + k]; k++);
    return;
}

int main() { return 0; }


Template without comments:


#include <utility>

const int MAXN = 10010;
int wa[MAXN], wb[MAXN], wv[MAXN], ws1[MAXN];

int cmp(int *r, int a, int b, int l) {
    return r[a] == r[b] && r[a + l] == r[b + l];
}

void da(int *r, int *sa, int n, int m) {
    int i, j, p, *x = wa, *y = wb;
    for (i = 0; i < m; i++) ws1[i] = 0;
    for (i = 0; i < n; i++) ws1[x[i] = r[i]]++;
    for (i = 1; i < m; i++) ws1[i] += ws1[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--ws1[x[i]]] = i;
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        for (i = 0; i < m; i++) ws1[i] = 0;
        for (i = 0; i < n; i++) ws1[wv[i]]++;
        for (i = 1; i < m; i++) ws1[i] += ws1[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws1[wv[i]]] = y[i];
        for (std::swap(x, y), p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
    return;
}

int rank[MAXN], height[MAXN];

void calheight(int *r, int *sa, int n) {
    int i, j, k = 0;
    for (i = 1; i <= n; i++) rank[sa[i]] = i;
    for (i = 0; i < n; height[rank[i++]] = k)
        for (k ? k-- : 0, j = sa[rank[i] - 1]; r[i + k] == r[j + k]; k++);
    return;
}

int main() { return 0; }
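When adapting the template, a slow reference implementation is handy for checking what sa, rank and height should contain on small inputs. The sketch below is an assumption for testing purposes, not part of the paper: the names `SuffixInfo` and `naive_suffix_info` are hypothetical, it sorts the suffixes directly, and it uses 0-based positions with no appended 0. Assuming the template is called with the 0 appended (so its sa[0] is the appended 0), its sa[k+1] and height[k+1] should correspond to sa[k] and height[k] here, and its rank values should be larger by one.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical reference structure for small-input checks.
struct SuffixInfo {
    std::vector<int> sa;      // sa[k]     = start of the k-th smallest suffix
    std::vector<int> rank;    // rank[i]   = position of suffix i within sa
    std::vector<int> height;  // height[k] = LCP of suffixes sa[k-1] and sa[k]; height[0] = 0
};

// Brute force: compare whole suffixes while sorting, then match
// neighbours character by character. Only meant for short strings.
SuffixInfo naive_suffix_info(const std::string &s) {
    int n = static_cast<int>(s.size());
    SuffixInfo info;
    info.sa.resize(n);
    info.rank.resize(n);
    info.height.assign(n, 0);

    std::iota(info.sa.begin(), info.sa.end(), 0);
    std::sort(info.sa.begin(), info.sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // rank is the inverse permutation of sa.
    for (int k = 0; k < n; k++) info.rank[info.sa[k]] = k;

    // height[k]: longest common prefix of two adjacent sorted suffixes.
    for (int k = 1; k < n; k++) {
        int a = info.sa[k - 1], b = info.sa[k], len = 0;
        while (a + len < n && b + len < n && s[a + len] == s[b + len]) len++;
        info.height[k] = len;
    }
    return info;
}
```

For "banana" this gives sa = 5 3 1 0 4 2 and height = 0 1 3 0 0 2: for instance, the sorted neighbours "ana" and "anana" share the three characters "ana".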

