#include <stdio.h>
#include <string.h>

// N is the array capacity (not shown here); it must be at least n
// (string length plus the sentinel) and at least m.
int wa[N], wb[N], ws[N], wv[N];
int Rank[N], Height[N];

// cmp does not appear in the source. This is the definition usually paired
// with this template: two suffixes get the same rank only if both their
// first and second keys match.
int cmp(int *y, int a, int b, int j)
{
    return y[a] == y[b] && y[a + j] == y[b + j];
}

// The n passed here is one more than the input length: a sentinel character
// is appended manually so that cmp never reads out of bounds.
// Character values are assumed to lie in [0, m).
void GetSa(char *r, int *sa, int n, int m)
{
    int i, j, p, *x = wa, *y = wb, *t;

    // Clear the buckets.
    for (i = 0; i < m; i++) ws[i] = 0;
    // One pass of radix sort on the first character.
    for (i = 0; i < n; i++) ws[x[i] = r[i]]++;
    for (i = 1; i < m; i++) ws[i] += ws[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--ws[x[i]]] = i;

    // Doubling.
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        // Suffixes starting in [n - j, n) have no character at offset j, so
        // their second key defaults to 0 and they come first when sorting by
        // the second key. Stability must be preserved as well.
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        // sa answers "which suffix is at rank i". The order of the second
        // keys is exactly the order sa holds from the previous round, so
        // walking sa and collecting sa[i] - j lists the remaining suffixes in
        // second-key order, with their first keys still unsorted.
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        // x is the temporary rank array; y holds the suffix indices waiting
        // to be sorted by first key. wv maps each of them to its first-key rank.
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        // Clear the buckets.
        for (i = 0; i < m; i++) ws[i] = 0;
        // One pass of radix sort on the first key.
        for (i = 0; i < n; i++) ws[wv[i]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws[wv[i]]] = y[i];
        // Swap x and y, then re-label the temporary ranks from the new sa.
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
}

void GetHeight(char *r, int *sa, int n)
{
    int i, j, k = 0;

    // Compute Rank from sa.
    for (i = 1; i <= n; i++) Rank[sa[i]] = i;

    // Use a property of h[i] = Height[Rank[i]] to cut the computation time.
    // Let S(i) be the suffix of the original string starting at i, and P(i)
    // the suffix ranked immediately before it, i.e. the one starting at
    // sa[Rank[i] - 1]. Then
    //   h[i]     = LCP(S(i), P(i))
    //   h[i - 1] = LCP(S(i - 1), P(i - 1))
    // S(i) is S(i - 1) with its first letter removed, so it is one shorter.
    // If h[i - 1] >= 1, then S(i - 1) and P(i - 1) share their first letter,
    // so P(i - 1) is non-empty and removing its first letter gives a suffix Q.
    // Q sorts before S(i) and agrees with it on the first h[i - 1] - 1
    // letters; what follows those letters need not match. P(i) is either Q
    // or lies between Q and S(i) in sorted order, so it shares at least as
    // many letters with S(i). Hence h[i] >= h[i - 1] - 1.
    // If h[i - 1] = 0, starting the comparison for h[i] from 0 is always correct.
    // Note: Height[i] = LCP(string[sa[i - 1]..], string[sa[i]..]).
    for (i = 0; i < n; Height[Rank[i++]] = k)
        for (k ? k-- : 0, j = sa[Rank[i] - 1]; r[i + k] == r[j + k]; k++)
            ;
}

char Str[N];
int SA[N];

int main()
{
    scanf("%s", Str);
    int n = strlen(Str);
    Str[n] = 0;
    // Pass n + 1 to GetSa: the trailing sentinel character takes part in the
    // comparison. The fourth argument, m, must exceed every character value;
    // its value is missing in the source.
    GetSa(Str, SA, n + 1, -);
    GetHeight(Str, SA, n);
    return 0;
}
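The template above can be exercised end to end with a small self-contained sketch. The names build_sa, build_height, suffix_array, rnk, hgt, MAXN and ALPHA are hypothetical. The alphabet bound of 128 assumes 7-bit ASCII input and is not taken from the source, and cmp is written in the form usually paired with this template. The string's own terminating 0 serves as the sentinel.

```c
#include <assert.h>
#include <string.h>

#define MAXN 1024  // hypothetical capacity; must be >= n + 1 and >= ALPHA
#define ALPHA 128  // assumed alphabet bound (7-bit ASCII), not from the source

int wa[MAXN], wb[MAXN], ws[MAXN], wv[MAXN];
int rnk[MAXN], hgt[MAXN];

// Same rank only if both the first and the second key match.
int cmp(const int *y, int a, int b, int j)
{
    return y[a] == y[b] && y[a + j] == y[b + j];
}

// Doubling construction over r[0..n-1]; r[n-1] must be a unique, minimal sentinel.
void build_sa(const char *r, int *sa, int n, int m)
{
    int i, j, p, *x = wa, *y = wb, *t;
    for (i = 0; i < m; i++) ws[i] = 0;
    for (i = 0; i < n; i++) ws[x[i] = (unsigned char)r[i]]++;
    for (i = 1; i < m; i++) ws[i] += ws[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--ws[x[i]]] = i;
    for (j = 1, p = 1; p < n; j *= 2, m = p) {
        // Second-key order: suffixes with no character at offset j come first.
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        // Stable counting sort of y by first-key rank.
        for (i = 0; i < n; i++) wv[i] = x[y[i]];
        for (i = 0; i < m; i++) ws[i] = 0;
        for (i = 0; i < n; i++) ws[wv[i]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws[wv[i]]] = y[i];
        // Swap the rank buffers and assign new ranks; p counts distinct ranks.
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0, i = 1; i < n; i++)
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
    }
}

// hgt[k] = LCP of the suffixes at sa[k - 1] and sa[k], for k = 1..n.
// Here n is the string length without the sentinel; sa has n + 1 entries.
void build_height(const char *r, const int *sa, int n)
{
    int i, j, k = 0;
    for (i = 1; i <= n; i++) rnk[sa[i]] = i;
    for (i = 0; i < n; hgt[rnk[i++]] = k)
        for (k ? k-- : 0, j = sa[rnk[i] - 1]; r[i + k] == r[j + k]; k++)
            ;
}

// Hypothetical wrapper: fills sa[0..len] and hgt[1..len], returns len.
int suffix_array(const char *s, int *sa)
{
    int i, n = (int)strlen(s);
    assert(n + 1 <= MAXN);
    for (i = 0; i < n; i++) assert((unsigned char)s[i] < ALPHA);
    build_sa(s, sa, n + 1, ALPHA);  // n + 1: the terminating 0 is the sentinel
    build_height(s, sa, n);
    return n;
}
```

For the input "banana", suffix_array fills sa with 6 5 3 1 0 4 2 (the sentinel suffix sorts first) and hgt[1..6] with 0 1 3 0 0 2.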
"Suffix array": notes on the suffix array template, continued
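The height loop in the template relies on the bound h[i] >= h[i - 1] - 1, where h[i] is the LCP of suffix i and the suffix ranked just before it. The bound can be checked by brute force on small inputs with a sketch like the following. The names check_height_bound, by_suffix, lcp and g_text are hypothetical, and instead of an explicit sentinel the sketch relies on strcmp treating the end of a string as smaller than any character, which gives the same order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char *g_text;  // text whose suffixes the qsort comparator orders

// Orders two start positions by the suffixes they denote.
static int by_suffix(const void *a, const void *b)
{
    return strcmp(g_text + *(const int *)a, g_text + *(const int *)b);
}

// Length of the longest common prefix of two C strings.
static int lcp(const char *a, const char *b)
{
    int k = 0;
    while (a[k] != '\0' && a[k] == b[k]) k++;
    return k;
}

// Returns 1 if h[i] >= h[i - 1] - 1 holds for every i, 0 if it fails,
// and -1 on allocation failure. Everything is computed naively.
int check_height_bound(const char *s)
{
    assert(s != NULL);
    int n = (int)strlen(s), i, ok = 1;
    if (n == 0) return 1;

    int *sa = malloc((size_t)n * sizeof *sa);
    int *rk = malloc((size_t)n * sizeof *rk);
    int *h = malloc((size_t)n * sizeof *h);
    if (!sa || !rk || !h) { free(sa); free(rk); free(h); return -1; }

    // Naive suffix array: sort the start positions by full suffix comparison.
    for (i = 0; i < n; i++) sa[i] = i;
    g_text = s;
    qsort(sa, (size_t)n, sizeof *sa, by_suffix);
    for (i = 0; i < n; i++) rk[sa[i]] = i;

    // h[i] = LCP of suffix i and the suffix ranked just before it (0 if none).
    for (i = 0; i < n; i++)
        h[i] = rk[i] == 0 ? 0 : lcp(s + i, s + sa[rk[i] - 1]);
    for (i = 1; i < n; i++)
        if (h[i] < h[i - 1] - 1) ok = 0;

    free(sa); free(rk); free(h);
    return ok;
}
```

Because the bound is a property of suffix ordering itself, check_height_bound should return 1 for any input; a 0 would point to a mistake in the checker rather than in the template.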