Suffix Array Learning notes

Source: Internet
Author: User

I got completely wrecked by suffix arrays ... I started by looking at suffix trees, which seemed like a lot of trouble, then moved on to suffix arrays. The theory looked easy enough to understand, but I could not implement it.

Which fully proves how dumb I am: it took me a whole day to understand the implementation. Or is that because I never learned radix sort?

(Update: then I spent another day on the height array ...)


Take a good look at this picture ... Very important.

Sorting the suffixes directly with a comparison sort takes O(n^2 log n) time, because each string comparison is O(n).
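For reference, the brute-force approach can be sketched as follows. This is only a baseline for checking other implementations on small inputs; the function name naive_suffix_array and the 0-based indexing are my own choices, not something from the original solution.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array: sort start positions by comparing suffixes directly.
// Each comparison may scan O(n) characters, hence O(n^2 log n) worst case.
std::vector<int> naive_suffix_array(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);  // 0, 1, ..., n-1
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        // Compares the suffix starting at a with the suffix starting at b.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}
```

For "banana" this yields the start positions 5 3 1 0 4 2 (0-based).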

Here we introduce the prefix doubling method for building the suffix array.

In round i we sort, for every starting position, the substring made of the next 2^i characters (see the picture above). A plain comparison sort costs O(n log n) per round, but the structure of suffixes lets us radix sort in O(n) per round, which gives O(n log n) overall.
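Before bringing in radix sort, the doubling idea on its own can be sketched with std::sort, which gives O(n log^2 n). This is a simplified illustration under my own naming (doubling_suffix_array, rnk, tmp), not the code used for the submission.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Prefix doubling with a comparison sort: O(n log^2 n) overall.
// rnk[i] is the rank of the block of length len that starts at i.
std::vector<int> doubling_suffix_array(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rnk(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);
    // +1 keeps every real rank above 0, which marks "no second half".
    for (int i = 0; i < n; i++) rnk[i] = static_cast<unsigned char>(s[i]) + 1;
    for (int len = 1;; len *= 2) {
        // Key of position i: (rank of first half, rank of second half or 0).
        auto key = [&](int i) {
            return std::make_pair(rnk[i], i + len < n ? rnk[i + len] : 0);
        };
        std::sort(sa.begin(), sa.end(),
                  [&](int a, int b) { return key(a) < key(b); });
        // Re-rank: equal keys share a rank, ranks start at 1.
        if (n > 0) tmp[sa[0]] = 1;
        for (int i = 1; i < n; i++)
            tmp[sa[i]] = tmp[sa[i - 1]] + (key(sa[i - 1]) < key(sa[i]) ? 1 : 0);
        rnk = tmp;
        if (n == 0 || rnk[sa[n - 1]] == n) break;  // all ranks distinct: done
    }
    return sa;
}
```

The loop stops as soon as every suffix has its own rank, which happens after at most about log2(n) rounds.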

The first round sorts single characters, so it is pulled out and handled on its own with a plain bucket sort. From the second round on, the rank from the previous round is the first keyword, and the previous rank of the second half of the string is the second keyword (picture it in your head). This is where the real radix sort starts.

First build a sequence of suffixes in increasing order of the second keyword. A suffix with no second half counts as having second keyword 0, so those go first; the rest can be read straight off the previous SA array (see the code). Then count the first keywords into the buckets. Now comes the most critical step: go through the suffixes in decreasing order of the second keyword and fill each bucket from the back, because among suffixes with the same first keyword, the one with the larger second keyword is the larger one. That gives the new SA. Finally, to maintain the new rank, walk the new SA in order and hand out ranks as in discretization, giving equal ranks to equal keyword pairs (see the code).
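A single round of the radix sort described above can be isolated like this. It is a sketch with a vector-based interface of my own (radix_round, cnt, out are illustrative names); it assumes sa is already sorted by the first d characters and that rnk holds the matching ranks in the range 1..n.

```cpp
#include <vector>

// One doubling round: given sa sorted by the first d characters and the
// matching 1-based ranks rnk, return sa sorted by the first 2d characters.
std::vector<int> radix_round(const std::vector<int>& sa,
                             const std::vector<int>& rnk, int d) {
    int n = static_cast<int>(sa.size());
    std::vector<int> sec, cnt(n + 1, 0), out(n);
    sec.reserve(n);
    // Second keyword ascending: suffixes with no second half come first...
    for (int i = n - d; i < n; i++)
        if (i >= 0) sec.push_back(i);
    // ...then suffix sa[i]-d, whose second half is suffix sa[i], in sa order.
    for (int i = 0; i < n; i++)
        if (sa[i] >= d) sec.push_back(sa[i] - d);
    // Count first keywords and turn the counts into bucket end positions.
    for (int i = 0; i < n; i++) cnt[rnk[i]]++;
    for (int r = 1; r <= n; r++) cnt[r] += cnt[r - 1];
    // Walk sec backwards so that, inside one bucket, larger second
    // keywords land in later slots.
    for (int i = n - 1; i >= 0; i--) out[--cnt[rnk[sec[i]]]] = sec[i];
    return out;
}
```

For "banana" with d = 1, the ranks by first character are 2 1 3 1 3 1 and one valid order by first character is 1 3 5 0 2 4; the round turns that into 5 1 3 0 2 4, which is the order by the first two characters.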

To get the full power out of the suffix array we also need a height array: height[i] is the length of the longest common prefix of the suffix with rank i and the suffix with rank i-1, with height[1]=0. It has the property height[rank[i]] >= height[rank[i-1]]-1; see the paper for the proof. So if we compute height[rank[i]] for i from 1 to n in that order, the matching length only needs to drop by at most 1 between steps and the total work is O(n). The longest common prefix of any two suffixes is min(height[j]) over every j greater than the smaller of their two ranks and at most the larger one, so an RMQ answers it directly.
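The height computation and the RMQ query can be sketched together as below. Note the assumptions: this version is 0-based (height[0] = 0 and height[r] compares the suffixes at sa[r-1] and sa[r]), it uses a sparse table for the RMQ, and the struct name LcpIndex and its members are illustrative, not from the original code.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

struct LcpIndex {
    int n;
    std::vector<int> rnk;                 // rnk[i] = position of suffix i in sa
    std::vector<std::vector<int>> table;  // table[k][r] = min height[r .. r+2^k-1]

    LcpIndex(const std::string& s, const std::vector<int>& sa)
        : n(static_cast<int>(s.size())), rnk(n) {
        for (int r = 0; r < n; r++) rnk[sa[r]] = r;
        std::vector<int> height(n, 0);
        // Visit suffixes in text order; k drops by at most 1 per step.
        for (int i = 0, k = 0; i < n; i++) {
            if (rnk[i] == 0) { k = 0; continue; }
            int j = sa[rnk[i] - 1];  // suffix ranked just before suffix i
            while (i + k < n && j + k < n && s[i + k] == s[j + k]) k++;
            height[rnk[i]] = k;
            if (k > 0) k--;
        }
        // Sparse table over height for O(1) range-minimum queries.
        table.push_back(height);
        for (int k = 1; (1 << k) <= n; k++) {
            std::vector<int> cur(n - (1 << k) + 1);
            for (int r = 0; r + (1 << k) <= n; r++)
                cur[r] = std::min(table[k - 1][r],
                                  table[k - 1][r + (1 << (k - 1))]);
            table.push_back(cur);
        }
    }

    // Longest common prefix of the suffixes starting at i and j.
    int lcp(int i, int j) const {
        if (i == j) return n - i;
        int lo = rnk[i], hi = rnk[j];
        if (lo > hi) std::swap(lo, hi);
        lo++;  // the answer is the minimum of height[lo+1 .. hi]
        int k = 0;
        while ((1 << (k + 1)) <= hi - lo + 1) k++;
        return std::min(table[k][lo], table[k][hi - (1 << k) + 1]);
    }
};
```

With s = "banana" and sa = 5 3 1 0 4 2, lcp(1, 3) compares "anana" with "ana" and returns 3. Building the table costs O(n log n) time and memory; each query is then O(1).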

Too much of this can only be understood by reading the code, so that is all I can say. T_T

UOJ 35:

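#include <cstdio>
#include <cstring>
#include <algorithm>

const int N = 200005;  // r2 is read up to index 2n-1, so n must be <= N/2
int n, rank[N], sa[N], r2[N], buc[N], sec[N], h[N];
char s[N];

// Same block of length 2d? Compare both halves by their previous ranks.
bool cmp(int x, int y, int d) { return r2[x] == r2[y] && r2[x + d] == r2[y + d]; }

void getsa() {
    int i, p = 0, d = 1, m = std::max(n, 256);  // m bounds every bucket key
    // Round 0: bucket sort by single character.
    for (i = 0; i <= m; i++) buc[i] = 0;
    for (i = 0; i < n; i++) buc[rank[i] = (unsigned char)s[i]]++;
    for (i = 1; i <= m; i++) buc[i] += buc[i - 1];
    for (i = n - 1; i >= 0; i--) sa[--buc[rank[i]]] = i;
    while (p < n) {
        for (i = 0; i < n; i++) r2[i] = rank[i];
        for (i = 0; i <= m; i++) buc[i] = 0;
        // sec: suffixes by increasing second keyword (missing half first).
        for (p = 0, i = n - d; i < n; i++) sec[p++] = i;
        for (i = 0; i < n; i++) if (sa[i] >= d) sec[p++] = sa[i] - d;
        // Bucket by first keyword, filling each bucket from the back.
        for (i = 0; i < n; i++) buc[rank[sec[i]]]++;
        for (i = 1; i <= m; i++) buc[i] += buc[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--buc[rank[sec[i]]]] = sec[i];
        // New 1-based ranks; equal blocks share a rank.
        for (rank[sa[0]] = 1, i = 1, p = 1; i < n; i++)
            rank[sa[i]] = cmp(sa[i], sa[i - 1], d) ? p : ++p;
        d *= 2;
    }
}

void geth() {
    int i, j, k;
    for (i = 0, k = 0; i < n; i++) {
        if (rank[i] == 1) { h[1] = 0; continue; }
        j = sa[rank[i] - 2];  // suffix ranked just before suffix i
        // Bounds checks replace the '$' padding of the first version.
        while (i + k < n && j + k < n && s[i + k] == s[j + k]) k++;
        h[rank[i]] = k;
        if (k) k--;
    }
}

int main() {
    if (std::scanf("%s", s) != 1) return 0;
    n = std::strlen(s);
    getsa();
    geth();
    for (int i = 0; i < n; i++) std::printf("%d ", sa[i] + 1);
    std::printf("\n");
    for (int i = 2; i <= n; i++) std::printf("%d ", h[i]);
    return 0;
}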

