Suffix Array Construction with the Doubling Algorithm and the DC3 Algorithm

Source: Internet
Author: User

[Cpp]
/*******************************************************************************
 * Data structure: suffix array (Suffix_Array)
 *
 * Substring: the substring r[i..j] of a string r, with i <= j, is the string
 *   formed by r[i], r[i+1], ..., r[j] in order.
 * Suffix: a suffix is the special substring that runs from some position i to
 *   the end of the string. The suffix of r starting at character i is written
 *   Suffix(i), that is, Suffix(i) = r[i..len(r)].
 * Suffix array SA: stores the result of sorting all suffixes of the string.
 *   SA[i] is the starting position of the suffix ranked i-th among all
 *   suffixes.
 * Rank array: rank[i] is the rank of Suffix(i) when all suffixes are sorted in
 *   ascending order. The suffix array answers "which suffix is ranked i-th",
 *   the rank array answers "what is the rank of suffix i", so the two arrays
 *   are inverses of each other.
 *
 * (1) Doubling algorithm: for every position, sort the substrings of length
 *   2^k that start there and record their rank values. k starts at 0 and grows
 *   by 1 each round. Once 2^k is at least n, the length-2^k substring starting
 *   at each position is equivalent to the suffix starting there. By then all
 *   of these substrings compare as different, so no two rank values are equal
 *   and the ranks are final. Each round reuses the ranks of the substrings of
 *   length 2^(k-1) from the previous round: a substring of length 2^k is
 *   represented by the ranks of its two halves of length 2^(k-1), used as a
 *   two-part key, and a radix sort on these keys yields the ranks for
 *   length 2^k.
 * (2) DC3 algorithm:
 *   1. Split the suffixes into two parts (start positions with i % 3 != 0 and
 *      with i % 3 == 0) and sort the first part.
 *   2. Sort the second part using the result of step 1.
 *   3. Merge the results of steps 1 and 2 to obtain the order of all suffixes.
 *
 * Time complexity: the doubling algorithm runs in O(n log n), the DC3
 *   algorithm in O(n).
 *   In terms of constant factors, DC3 has a larger constant than the doubling
 *   algorithm.
 * Space complexity: both algorithms use O(n) space. The arrays needed by the
 *   doubling algorithm total 6n entries, those needed by DC3 total 10n.
 *
 * RMQ (Range Minimum/Maximum Query): given a sequence A of length n, answer a
 *   number of queries RMQ(A, i, j) (i, j <= n), each returning the index of
 *   the minimum (or maximum) value of A within the range [i, j]. In other
 *   words, RMQ is the problem of finding the extreme value in an interval.
 * LCA (Lowest Common Ancestor): for two nodes u and v of a rooted tree T,
 *   LCA(T, u, v) is the node x that is an ancestor of both u and v and whose
 *   depth is as large as possible. Equivalently, treat T as an undirected
 *   acyclic graph; LCA(T, u, v) is then the node of minimum depth on the
 *   shortest path from u to v.
 * Standard RMQ algorithm: the problem is first reduced to LCA and then to a
 *   restricted RMQ. A Cartesian tree is built from the original sequence,
 *   which reduces the problem to LCA in linear time. LCA in turn reduces in
 *   linear time to a restricted RMQ, in which any two adjacent numbers of the
 *   sequence differ by exactly +1 or -1. Restricted RMQ has an online solution
 *   with O(n) preprocessing and O(1) per query, so the whole algorithm is
 *   O(n)-O(1).
 *
 * Height array: height[i] is the length of the longest common prefix of
 *   suffix(sa[i-1]) and suffix(sa[i]), that is, of two suffixes that are
 *   adjacent in sorted order.
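 *   For two positions j and k with rank[j] < rank[k], the longest common
 *   prefix of suffix(j) and suffix(k) is the minimum of
 *   height[rank[j]+1], height[rank[j]+2], ..., height[rank[k]].
 ******************************************************************************/
#include <iostream>
#include <cstring>
#include <cstdlib>
#include <cstdio>
#include <climits>
#include <algorithm>

// Note: no "using namespace std;" here, because the global array `rank`
// below would become ambiguous with std::rank.

const int N = 100010;

// Work arrays shared by da() and dc3(); ws_ holds the counting-sort buckets.
int wa[N], wb[N], wv[N], ws_[N];

/******************************* Doubling algorithm *******************************/
// As the loops are written, the caller is expected to append a sentinel 0 that
// is smaller than every real character and to include it in n, i.e. for a
// string of length len: r[len] = 0; da(r, sa, len + 1, m); with all r[i] < m.

// True if the two length-2l substrings starting at a and b have equal keys.
int cmp(int *r, int a, int b, int l)
{
    return r[a] == r[b] && r[a + l] == r[b + l];
}

void da(int *r, int *sa, int n, int m)
{
    int *x = wa, *y = wb, *t;
    // Round 0: counting sort on single characters.
    for (int i = 0; i < m; i++) ws_[i] = 0;
    for (int i = 0; i < n; i++) ws_[x[i] = r[i]]++;
    for (int i = 1; i < m; i++) ws_[i] += ws_[i - 1];
    for (int i = n - 1; i >= 0; i--) sa[--ws_[x[i]]] = i;
    // j is the current half length; p counts the distinct ranks found so far.
    for (int j = 1, p = 1; p < n; j *= 2, m = p)
    {
        // Sort by second key: suffixes whose second half is empty come first.
        p = 0;
        for (int i = n - j; i < n; i++) y[p++] = i;
        for (int i = 0; i < n; i++)
        {
            if (sa[i] >= j) y[p++] = sa[i] - j;
        }
        // Stable counting sort by first key.
        for (int i = 0; i < n; i++) wv[i] = x[y[i]];
        for (int i = 0; i < m; i++) ws_[i] = 0;
        for (int i = 0; i < n; i++) ws_[wv[i]]++;
        for (int i = 1; i < m; i++) ws_[i] += ws_[i - 1];
        for (int i = n - 1; i >= 0; i--) sa[--ws_[wv[i]]] = y[i];
        // Recompute ranks for length 2j; x becomes the new rank array.
        t = x, x = y, y = t, p = 1, x[sa[0]] = 0;
        for (int i = 1; i < n; i++)
        {
            x[sa[i]] = cmp(y, sa[i - 1], sa[i], j) ? p - 1 : p++;
        }
    }
}

/********************************* DC3 algorithm **********************************/
// dc3() recurses on r + n and sa + n, so the r and sa arrays passed in must be
// allocated with about three times the length of the input.

// F maps an original position (i % 3 != 0) to its index in the reduced string;
// G maps an index of the reduced string back to the original position.
#define F(x) ((x) / 3 + ((x) % 3 == 1 ? 0 : tb))
#define G(x) ((x) < tb ? (x) * 3 + 1 : ((x) - tb) * 3 + 2)

// True if the three-character blocks starting at a and b are equal.
int c0(int *r, int a, int b)
{
    return r[a] == r[b] && r[a + 1] == r[b + 1] && r[a + 2] == r[b + 2];
}

// Merge comparison between a suffix with a % 3 == 0 and a suffix b with
// b % 3 == k; wv holds the ranks of the already sorted suffixes.
int c12(int k, int *r, int a, int b)
{
    if (k == 2)
        return r[a] < r[b] || (r[a] == r[b] && c12(1, r, a + 1, b + 1));
    else
        return r[a] < r[b] || (r[a] == r[b] && wv[a + 1] < wv[b + 1]);
}

// Stable counting sort of the positions in a[] by key r[a[i]]; result in b[].
void sort(int *r, int *a, int *b, int n, int m)
{
    for (int i = 0; i < n; i++) wv[i] = r[a[i]];
    for (int i = 0; i < m; i++) ws_[i] = 0;
    for (int i = 0; i < n; i++) ws_[wv[i]]++;
    for (int i = 1; i < m; i++) ws_[i] += ws_[i - 1];
    for (int i = n - 1; i >= 0; i--) b[--ws_[wv[i]]] = a[i];
}

void dc3(int *r, int *sa, int n, int m)
{
    int *rn = r + n, *san = sa + n, ta = 0, tb = (n + 1) / 3, tbc = 0, p;
    r[n] = r[n + 1] = 0;
    // Step 1: collect the suffixes with i % 3 != 0 and radix sort them by
    // their first three characters.
    for (int i = 0; i < n; i++)
    {
        if (i % 3 != 0) wa[tbc++] = i;
    }
    sort(r + 2, wa, wb, tbc, m);
    sort(r + 1, wb, wa, tbc, m);
    sort(r, wa, wb, tbc, m);
    p = 1, rn[F(wb[0])] = 0;
    for (int i = 1; i < tbc; i++)
    {
        rn[F(wb[i])] = c0(r, wb[i - 1], wb[i]) ? p - 1 : p++;
    }
    // If some blocks are equal, recurse on the reduced string.
    if (p < tbc) dc3(rn, san, tbc, p);
    else for (int i = 0; i < tbc; i++) san[rn[i]] = i;
    // Step 2: sort the suffixes with i % 3 == 0 using the result of step 1.
    for (int i = 0; i < tbc; i++)
    {
        if (san[i] < tb) wb[ta++] = san[i] * 3;
    }
    if (n % 3 == 1) wb[ta++] = n - 1;
    sort(r, wb, wa, ta, m);
    for (int i = 0; i < tbc; i++) wv[wb[i] = G(san[i])] = i;
    // Step 3: merge the two sorted groups.
    int i, j;
    for (i = 0, j = 0, p = 0; i < ta && j < tbc; p++)
    {
        sa[p] = c12(wb[j] % 3, r, wa[i], wb[j]) ? wa[i++] : wb[j++];
    }
    for (; i < ta; p++) sa[p] = wa[i++];
    for (; j < tbc; p++) sa[p] = wb[j++];
}

/************************** rank and height arrays *************************/
// Here n is the string length without the sentinel; sa[0] is the sentinel
// suffix, so the real suffixes occupy sa[1..n].
int rank[N], height[N];

void calheight(int *r, int *sa, int n)
{
    int i, j, k = 0;
    for (i = 1; i <= n; i++) rank[sa[i]] = i;
    // height[rank[i]] >= height[rank[i-1]] - 1, so k only drops by one per step.
    for (i = 0; i < n; height[rank[i++]] = k)
    {
        for (k ? k-- : 0, j = sa[rank[i] - 1]; r[i + k] == r[j + k]; k++);
    }
}

/************************ RMQ (sparse table) and LCP ***********************/
// Note: the code below uses a sparse table (O(n log n) preprocessing, O(1)
// per query), not the O(n)-O(1) reduction described in the header comment.
// Before calling initRMQ(n), fill RMQ[1..n] with the values to be queried
// (for lcp(), RMQ[i] = height[i]).
int RMQ[N];
int mm[N];          // mm[i] = floor(log2(i))
int best[20][N];    // best[i][j] = index of the minimum in [j, j + 2^i - 1]

void initRMQ(int n)
{
    int i, j, a, b;
    for (mm[0] = -1, i = 1; i <= n; i++)
        mm[i] = ((i & (i - 1)) == 0) ? mm[i - 1] + 1 : mm[i - 1];
    for (i = 1; i <= n; i++) best[0][i] = i;
    for (i = 1; i <= mm[n]; i++)
        for (j = 1; j <= n + 1 - (1 << i); j++)
        {
            a = best[i - 1][j];
            b = best[i - 1][j + (1 << (i - 1))];
            if (RMQ[a] < RMQ[b]) best[i][j] = a;
            else best[i][j] = b;
        }
}

// Index of the minimum of RMQ[a..b], with a <= b.
int askRMQ(int a, int b)
{
    int t;
    t = mm[b - a + 1];
    b -= (1 << t) - 1;
    a = best[t][a];
    b = best[t][b];
    return RMQ[a] < RMQ[b] ? a : b;
}

// Longest common prefix of suffix(a) and suffix(b), for a != b.
int lcp(int a, int b)
{
    int t;
    a = rank[a];
    b = rank[b];
    if (a > b) { t = a; a = b; b = t; }
    return height[askRMQ(a + 1, b)];
}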
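
To make the definitions of the suffix array, the height array and the LCP query easier to check by hand, here is a minimal sketch that is independent of the listing above. It is an assumption-laden illustration, not the author's method: the function names `buildSuffixArray`, `buildHeight` and `lcpOfSuffixes` are hypothetical, the doubling step uses `std::sort` on pairs of ranks instead of radix sort (so it runs in O(n log^2 n) rather than O(n log n)), ranks are 0-based with no sentinel character, and the LCP query scans the height range linearly instead of using a sparse table.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Prefix doubling with std::sort: in each round, suffixes are ordered by the
// pair (rank of first k characters, rank of next k characters).
std::vector<int> buildSuffixArray(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rnk(n), tmp(n);
    if (n == 0) return sa;
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rnk[i] = static_cast<unsigned char>(s[i]);
    for (int k = 1;; k *= 2) {
        // Second key is -1 when the second half runs past the end of s.
        auto key = [&](int i) {
            return std::make_pair(rnk[i], i + k < n ? rnk[i + k] : -1);
        };
        std::sort(sa.begin(), sa.end(),
                  [&](int a, int b) { return key(a) < key(b); });
        // Assign new ranks; equal keys share a rank.
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (key(sa[i - 1]) < key(sa[i]) ? 1 : 0);
        rnk = tmp;
        if (rnk[sa[n - 1]] == n - 1) break;  // all ranks distinct: done
    }
    return sa;
}

// height[i] = length of the longest common prefix of the suffixes ranked
// i-1 and i (height[0] = 0). Processes suffixes in text order so that the
// match length drops by at most one between consecutive positions.
std::vector<int> buildHeight(const std::string& s, const std::vector<int>& sa) {
    const int n = static_cast<int>(s.size());
    std::vector<int> rnk(n), height(n, 0);
    for (int i = 0; i < n; ++i) rnk[sa[i]] = i;
    for (int i = 0, k = 0; i < n; ++i) {
        if (rnk[i] == 0) { k = 0; continue; }
        const int j = sa[rnk[i] - 1];
        while (i + k < n && j + k < n && s[i + k] == s[j + k]) ++k;
        height[rnk[i]] = k;
        if (k > 0) --k;
    }
    return height;
}

// LCP of the suffixes starting at a and b (a != b): the minimum of
// height[rank[a]+1 .. rank[b]] once the ranks are ordered.
int lcpOfSuffixes(const std::vector<int>& sa, const std::vector<int>& height,
                  int a, int b) {
    assert(a != b);
    const int n = static_cast<int>(sa.size());
    std::vector<int> rnk(n);
    for (int i = 0; i < n; ++i) rnk[sa[i]] = i;
    const int lo = std::min(rnk[a], rnk[b]);
    const int hi = std::max(rnk[a], rnk[b]);
    return *std::min_element(height.begin() + lo + 1, height.begin() + hi + 1);
}
```

For the string "banana" this sketch gives the suffix array 5 3 1 0 4 2 and the height array 0 1 3 0 0 2; the suffixes starting at positions 1 ("anana") and 3 ("ana") then have an LCP of 3. A small reference like this can be useful for cross-checking the output of the faster radix-sort versions on short inputs.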
