SPOJ Problem Set (classical) 694. Distinct Substrings. Problem code: DISUBSTR
Given a string, we need to find the total number of its distinct substrings.
Input
T - the number of test cases. T <= 20.
Each test case consists of one string, whose length is <= 1000.
Output
For each test case output one number saying the number of distinct substrings.
Example
Sample input:
2
CCCCC
ABABA
Sample output:
5
9
Explanation for the test case with string ABABA:
Len = 1: A, B
Len = 2: AB, BA
Len = 3: ABA, BAB
Len = 4: ABAB, BABA
Len = 5: ABABA
Thus, total number of distinct substrings is 9.
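As a sanity check on the sample, the count can be reproduced by brute force. This is only a minimal sketch for small inputs, not the intended solution; the function name countDistinctBrute is made up for illustration.

```cpp
#include <set>
#include <string>

// Brute-force reference (hypothetical helper, not part of the original solution):
// insert every non-empty substring into a set and return the set size.
// There are O(n^2) substrings of up to n characters each, so this is only
// suitable for checking small cases by hand.
long long countDistinctBrute(const std::string& s) {
    std::set<std::string> seen;
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t len = 1; i + len <= s.size(); ++len) {
            seen.insert(s.substr(i, len));
        }
    }
    return static_cast<long long>(seen.size());
}
```

Calling countDistinctBrute("CCCCC") and countDistinctBrute("ABABA") should reproduce the two sample answers, 5 and 9.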
1. Count distinct substrings through prefixes: a string of length i has exactly i non-empty prefixes.
2. Every substring is a prefix of some suffix, so the problem is equivalent to counting the distinct prefixes over all suffixes.
Now add the suffixes one at a time in suffix-array order, sa[1], sa[2], ..., and observe:
suffix(sa[i]) is a string of length n - sa[i], so it has n - sa[i] prefixes. Of these, LCP[i-1] (the length of the longest common prefix with the previous suffix in sorted order) have already been counted, so this suffix contributes n - sa[i] - LCP[i-1] new prefixes.
Summing these contributions over all suffixes gives the answer.
Note that in my code LCP[i] is the length of the longest common prefix of suffix(sa[i]) and suffix(sa[i+1]).
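The counting rule can be sketched on its own, separately from the fast suffix-array construction. The version below is an illustration under simplifying assumptions: it sorts the suffixes by direct comparison (much slower than prefix doubling), uses a 0-based sa without the empty suffix, and the name countDistinctBySuffixes is made up.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: applies "suffix length minus LCP with the previous
// suffix" to a naively sorted list of suffix start positions.
long long countDistinctBySuffixes(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);  // start positions 0..n-1

    // Naive suffix sort: compare the suffixes character by character.
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    long long total = 0;
    for (int i = 0; i < n; ++i) {
        // Longest common prefix with the previous suffix in sorted order.
        int common = 0;
        if (i > 0) {
            const int a = sa[i - 1], b = sa[i];
            while (a + common < n && b + common < n &&
                   s[a + common] == s[b + common]) {
                ++common;
            }
        }
        // n - sa[i] prefixes in total, of which `common` were already counted.
        total += (n - sa[i]) - common;
    }
    return total;
}
```

For ABABA the sorted suffixes are A, ABA, ABABA, BA, BABA, which contribute 1, 2, 2, 2 and 2 new prefixes, for a total of 9.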
#include <cstdio>
#include <iostream>
#include <string>
#include <algorithm>
#include <cmath>
#include <cstring>
using namespace std;
#define maxn 1011

int n, k;        // n = strlen(s); k is the current doubling length
int rnk[maxn];   // rank array; named rnk because "rank" clashes with std::rank
int tmp[maxn];
char s[maxn];
int LCP[maxn], sa[maxn];

/* Compare suffixes i and j by the pair (rnk[i], rnk[i + k]) */
bool cmpsa(int i, int j) {
    if (rnk[i] != rnk[j]) return rnk[i] < rnk[j];
    /* The first k characters are equal, so compare the next k.
       A suffix that runs past the end gets rank -1 and sorts first. */
    int ri = i + k <= n ? rnk[i + k] : -1;
    int rj = j + k <= n ? rnk[j + k] : -1;
    return ri < rj;
}

/* Build sa by prefix doubling */
void consa() {
    /* n = strlen(s) must already be set.
       Initialization: 1. rnk[i] only has to reflect relative order, so the
       character code can be used directly; 2. sa[0..n] holds 0..n, where
       position n is the empty suffix. */
    for (int i = 0; i <= n; i++) {
        sa[i] = i;
        rnk[i] = i < n ? s[i] : -1;
    }
    /* Use the order by the first k characters to sort by the first 2 * k.
       k is global because cmpsa reads it; the loop must start at 1
       because 0 * 2 = 0. */
    for (k = 1; k <= n; k *= 2) {
        sort(sa, sa + n + 1, cmpsa);
        tmp[sa[0]] = 0; /* tmp temporarily holds the new ranks */
        for (int i = 1; i <= n; i++) {
            /* rnk still describes the order by the first k characters, so
               cmpsa tells whether sa[i-1] and sa[i] differ within 2 * k. */
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmpsa(sa[i - 1], sa[i]) ? 1 : 0);
        }
        for (int i = 0; i <= n; i++) {
            rnk[i] = tmp[i];
        }
    }
}

/* LCP[i] = longest common prefix of suffix(sa[i]) and suffix(sa[i + 1]) */
void construct_lcp() {
    // n = strlen(s) must already be set
    for (int i = 0; i <= n; i++) rnk[sa[i]] = i;
    int h = 0;
    LCP[0] = 0;
    for (int i = 0; i < n; i++) {
        int j = sa[rnk[i] - 1]; // suffix just before suffix i in sorted order
        if (h > 0) h--;
        for (; j + h < n && i + h < n; h++) {
            if (s[j + h] != s[i + h]) break;
        }
        LCP[rnk[i] - 1] = h;
    }
}

int main() {
    int t, ans;
    scanf("%d", &t);
    while (t--) {
        ans = 0;
        scanf("%s", s);
        n = strlen(s);
        consa();
        construct_lcp();
        for (int i = 1; i <= n; i++) {
            ans += n - sa[i] - LCP[i - 1];
        }
        printf("%d\n", ans);
    }
    return 0;
}