"Suffix array" notes: the suffix array template, continued

Source: Internet
Author: User

#include <stdio.h>
#include <string.h>

#define N 100005  // assumed capacity (the source does not show it); needs input length + 1

int wa[N], wb[N], ws[N], wv[N];
int Rank[N], Height[N];

// Not shown in this part of the notes; the usual definition is assumed here.
// True when the blocks of length 2l starting at a and b carry the same rank pair.
// The n passed to GetSa is 1 larger than the input length: one sentinel character
// is appended by hand so that r[a + l] and r[b + l] never go out of bounds.
int cmp(int *r, int a, int b, int l)
{
    return r[a] == r[b] && r[a + l] == r[b + l];
}

void GetSa(int *r, int *sa, int n, int m)
{
    int i, j, p, *x = wa, *y = wb, *t;

    // Empty the buckets, then radix sort once on the first character.
    for (i = 0; i < m; i++) ws[i] = 0;
    for (i = 0; i < n; i++) ws[x[i] = r[i]]++;
    for (i = 1; i < m; i++) ws[i] += ws[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--ws[x[i]]] = i;

    // Doubling: each round sorts by the first 2j characters.
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        // Suffixes n-j .. n-1 have nothing at offset j, so their second key
        // defaults to 0 and they come first when sorting by the second key.
        // The order must also stay stable.
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        // sa[i] answers "which suffix is at rank i". The order of the second
        // keys is the order left by the previous round, so walking sa and
        // collecting sa[i] - j lists the suffixes in second-key order.
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        // x is the temporary rank array; y holds the suffix indices that still
        // have to be sorted by first key. wv maps each one to its first key.
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        // Empty the buckets, then radix sort once on the first key.
        for (i = 0; i < m; i++) ws[i] = 0;
        for (i = 0; i < n; i++) ws[wv[i]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws[wv[i]]] = y[i];
        // Swap x and y, then relabel the temporary ranks from the new sa.
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
}

void GetHeight(int *r, int *sa, int n)
{
    int i, j, k = 0;

    // Derive Rank from sa. sa[0] is the sentinel suffix, so real suffixes
    // have ranks 1..n.
    for (i = 1; i <= n; i++) Rank[sa[i]] = i;

    // Height[i] = LCP(suffix sa[i-1], suffix sa[i]).
    // Let h[i] = Height[Rank[i]], S(i) = suffix i, and P(i) = the suffix ranked
    // just before S(i), so h[i] = LCP(S(i), P(i)).
    // Property used to save time: h[i] >= h[i-1] - 1.
    // If h[i-1] >= 1, S(i-1) and P(i-1) share their first letter. Dropping it
    // from S(i-1) gives S(i); dropping it from P(i-1) gives a suffix Q that is
    // smaller than S(i) and shares h[i-1] - 1 characters with it. P(i) lies
    // between Q and S(i) in sorted order, so it shares at least as many. The
    // characters after that need not match, hence only a lower bound.
    // If h[i-1] = 0 the bound is trivial, and counting from 0 is always correct.
    for (i = 0; i < n; i++) {
        if (k > 0) k--;
        j = sa[Rank[i] - 1];
        while (r[i + k] == r[j + k]) k++;
        Height[Rank[i]] = k;
    }
}

char Str[N];
int r[N], Sa[N];

int main(void)
{
    int i, n;

    if (scanf("%s", Str) != 1) return 1;
    n = (int)strlen(Str);
    // GetSa takes int *, so the characters are copied into an int array first.
    for (i = 0; i < n; i++) r[i] = (unsigned char)Str[i];
    r[n] = 0;  // appended sentinel, smaller than every real character
    // Note the n + 1 here: the sentinel was added so comparisons can tell
    // suffixes apart. The last argument (exclusive upper bound on character
    // values) is unreadable in the source; 256 is an assumption.
    GetSa(r, Sa, n + 1, 256);
    GetHeight(r, Sa, n);
    return 0;
}
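
When adapting a template like the one above, it can help to cross-check its output against a brute-force construction on small inputs. The following is a minimal sketch of such a check, not part of the original notes: it sorts suffix start positions with `qsort` and computes each height value by direct character comparison. The names `naive_sa`, `naive_height`, `cmp_suffix`, and `g_text` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical file-scope pointer so the qsort comparator can see the text. */
static const char *g_text;

/* Order two suffix start positions by comparing the suffixes themselves. */
static int cmp_suffix(const void *a, const void *b)
{
    return strcmp(g_text + *(const int *)a, g_text + *(const int *)b);
}

/* Fill sa[0..n-1] with suffix start positions in lexicographic order.
 * Brute force: O(n^2 log n) character comparisons in the worst case. */
void naive_sa(const char *s, int n, int *sa)
{
    int i;

    assert(s != NULL && sa != NULL && n >= 0);
    for (i = 0; i < n; i++) sa[i] = i;
    g_text = s;
    qsort(sa, (size_t)n, sizeof sa[0], cmp_suffix);
}

/* height[i] = length of the longest common prefix of suffixes sa[i-1] and
 * sa[i]; height[0] is set to 0 because there is no previous suffix. */
void naive_height(const char *s, int n, const int *sa, int *height)
{
    int i, k;

    assert(s != NULL && sa != NULL && height != NULL);
    if (n > 0) height[0] = 0;
    for (i = 1; i < n; i++) {
        k = 0;
        /* Stop at the terminating '\0' of the earlier suffix or at a mismatch. */
        while (s[sa[i - 1] + k] != '\0' && s[sa[i - 1] + k] == s[sa[i] + k]) k++;
        height[i] = k;
    }
}
```

For the string "banana" this yields sa = 5 3 1 0 4 2 and height = 0 1 3 0 0 2. Note the index shift when comparing with the template: because the template sorts the appended sentinel as well, its sa[0] is the sentinel suffix, so its sa[i + 1] and Height[i + 1] should correspond to sa[i] and height[i] in this sketch.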
