"Fifth day of training • suffix array" wow haha ~

Last Update:2017-04-06 Source: Internet

Author: User



I skipped blogging for a few days (although we were making up missed lessons), mostly out of laziness. In those lessons we learned network flow; I will write a problem summary for it later. Network flow is mostly about modeling, and I am working through it slowly.

Suffix arrays are really hard to get into. When the teacher explained them I zoned out for 2 hours, and at the end I was still completely lost. I understood the idea, but had no clue what the code meant.

Then one night I did not hand in my phone, sat on my dorm bed chewing through the code from 23:00 to 00:30, and finally understood it.

----------------------------------------------------------------------------Split Line

Hey, enough rambling, get to the point!

All right, fine.

The idea for building a suffix array goes roughly like this (doubling method):

PS: When sorting, suffixes with equal keywords keep the order they already had; in the first pass that means the order of their starting positions in the original string. I was not sure of the reason at first, and guessed that putting later ones first might work too as long as both sorts agree. The point is that each counting sort must be stable: the sort on the first keyword has to preserve the order produced by the sort on the second keyword.
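To make the tie-breaking rule concrete, here is a minimal sketch of a stable counting sort in C++. The function name `stable_counting_sort` and the use of `std::vector` are my own choices for illustration, not part of the code below; the three loops mirror the ones that appear in the attached routine.

```cpp
#include <vector>

// Stable counting sort of the indices 0..n-1 by key[i], where 0 <= key[i] < m.
// Returns the indices in sorted order; indices with equal keys keep input order.
std::vector<int> stable_counting_sort(const std::vector<int>& key, int m) {
    int n = static_cast<int>(key.size());
    std::vector<int> cnt(m, 0), order(n);
    for (int i = 0; i < n; i++) cnt[key[i]]++;         // size of each bucket
    for (int i = 1; i < m; i++) cnt[i] += cnt[i - 1];  // prefix sums: end of each bucket
    // Walking backwards and filling each bucket from its end keeps ties in input order.
    for (int i = n - 1; i >= 0; i--) order[--cnt[key[i]]] = i;
    return order;
}
```

For the keys 1 0 1 0 this returns the indices 1 3 0 2: both zeros come first, and within each key the earlier index stays in front.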

1. Sort the individual characters to produce the first SA array (radix sort; the code is a bit fiddly, but I will not complain further).

2. Enumerate the length k with for (int k=1;k<=n;k<<=1). This is the doubling step: sort by the second keyword (read directly from the SA array of the previous round), then sort by the first keyword (radix sort with buckets). The result is the SA array for prefixes of length 2k.

That's it. Simple, isn't it??

Note that the code contains two optimizations:

1. As soon as all positions have different ranks, i.e. there are n distinct ranks, break out of the loop. Further rounds would change nothing, because the first keyword alone already decides the order.

2. Each time the x array (the rank array) is recomputed, it only needs to tell apart the distinct ranks of the previous round; equal elements simply share one value. This lets us shrink m, the number of buckets, to the number of distinct ranks.
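The two-keyword idea can be sketched without any radix sort. The version below swaps in `std::sort` with a comparator on the pair (rank of i, rank of i+k), so it runs in O(n log^2 n) instead of O(n log n); it is meant only to show what each round computes. The name `suffix_array_doubling` and the use of -1 for an empty second keyword are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Doubling with a comparison sort: each round orders suffixes by their first 2k characters.
std::vector<int> suffix_array_doubling(const std::string& s) {
    int n = static_cast<int>(s.size());
    std::vector<int> sa(n), rk(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; i++) rk[i] = static_cast<unsigned char>(s[i]);
    for (int k = 1;; k <<= 1) {
        // Second keyword: rank of the suffix starting k later, or -1 if it is empty.
        auto second = [&](int i) { return i + k < n ? rk[i + k] : -1; };
        auto cmp = [&](int a, int b) {
            if (rk[a] != rk[b]) return rk[a] < rk[b];
            return second(a) < second(b);
        };
        std::sort(sa.begin(), sa.end(), cmp);
        // Recompute ranks: equal (first, second) pairs share a rank.
        if (n > 0) tmp[sa[0]] = 0;
        for (int i = 1; i < n; i++)
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
        rk = tmp;
        // Stop once all ranks are distinct (k >= n is only a safety net).
        if (n == 0 || rk[sa[n - 1]] == n - 1 || k >= n) break;
    }
    return sa;
}
```

For "banana" the first round (k=1) still leaves ties such as "an" vs "an"; the second round (k=2) separates everything and the loop stops.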

For detailed illustrations and a correctness argument for this idea, see

http://blog.csdn.net/yxuanwkeith/article/details/50636898

The code is attached here (a detailed version from the web, with lots of comments)

CODE:

// wa, wb and ws are global int work arrays, each of size at least max(n, m) (declared elsewhere).
// swap is std::swap (from <utility> or <algorithm>).
// Assumption: r[n-1] is a sentinel smaller than every other character (the usual
// convention for this routine), so that y[sa[i]+k] below never reads past the end.
void da(int *r, int *sa, int n, int m)
{
    int i, k, p, *x = wa, *y = wb; // x acts as the rank array, y holds the order by the second keyword
    for (i = 0; i < m; i++) ws[i] = 0; // m is an upper bound on the character values; ws is the bucket array for the radix sort
    for (i = 0; i < n; i++) ws[x[i] = r[i]]++; // fill ws and x; x is only used for comparisons, so real ranks are not needed yet
    for (i = 1; i < m; i++) ws[i] += ws[i-1]; // prefix sums of ws
    for (i = n-1; i >= 0; i--) sa[--ws[x[i]]] = i;
    // first SA computed, e.g. 0 2 1 3

    for (k = 1; k <= n; k <<= 1) // doubling; k is the length of the current keyword
    {
        p = 0;
        // sort by the second keyword, directly using the SA array of the previous round
        for (i = n-k; i < n; i++) y[p++] = i; // an empty second keyword is the smallest, so these come first
        for (i = 0; i < n; i++) if (sa[i] >= k) y[p++] = sa[i] - k;
        // if shifting sa[i] left by k stays inside the string (sa[i] >= k), append it to y in order;
        // this gives the order by the second keyword, e.g. 3 1 0 2

        // sort by the first keyword
        for (i = 0; i < m; i++) ws[i] = 0;
        for (i = 0; i < n; i++) ws[x[y[i]]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i-1];
        for (i = n-1; i >= 0; i--) sa[--ws[x[y[i]]]] = y[i]; // e.g. 0 2 3 1

        // the next round needs rank, which ends up in x; y is no longer needed, so it keeps the old rank
        swap(x, y); // x and y are pointers, so swapping them exchanges the arrays without copying

        // compute the new rank: equal strings must get equal rank.
        // SA is already known, so walk it in order, give equal neighbours the same rank and otherwise add 1;
        // two neighbours are equal when both rank[i] and rank[i+k] agree
        p = 1, x[sa[0]] = 0;
        for (i = 1; i < n; i++)
            x[sa[i]] = (y[sa[i-1]] == y[sa[i]] && y[sa[i-1]+k] == y[sa[i]+k]) ? p-1 : p++;
        if (p >= n) break; // optimization: all ranks are different, so we are done
        m = p; // optimization: the largest rank is below p, so p buckets suffice
    }
}
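To sanity-check a routine like `da` on small inputs, a brute-force construction is a handy reference. The sketch below simply sorts the suffix start positions by comparing the suffixes themselves; `suffix_array_naive` is a name I made up, and it is far too slow (O(n^2 log n) in the worst case) for contest-sized input.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Reference implementation: sort suffix start positions by direct string comparison.
std::vector<int> suffix_array_naive(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        // Compare s[a..] with s[b..] lexicographically without copying substrings.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}
```

For "banana" this gives 5 3 1 0 4 2, i.e. the suffixes a, ana, anana, banana, na, nana. Keep in mind that a routine called with an extra sentinel character returns that sentinel's position as its first entry.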


Application: LCP

In actual applications (finding the LCP, the longest common prefix), a height array is used as well. It is a bit convoluted, but it becomes clear on a closer look.

Here is a blog post that explains it very clearly:

http://www.cnblogs.com/LLGemini/p/4771235.html

The key point is the definition of the h array and the height array: height[i] is the length of the longest common prefix of the suffixes ranked i-1 and i, and h[i]=height[rank[i]]. Remember this: because height[i] is awkward to compute directly, we compute h[i] and obtain the height array through the formula above.
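The two definitions can be spelled out directly in code. This is a slow sketch that follows the definitions literally, assuming a 0-indexed `sa` with `height[0] = 0`; the function names are hypothetical.

```cpp
#include <string>
#include <vector>

// height[i] = LCP length of the suffixes starting at sa[i-1] and sa[i]; height[0] = 0.
std::vector<int> height_by_definition(const std::string& s, const std::vector<int>& sa) {
    int n = static_cast<int>(s.size());
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; i++) {
        int a = sa[i - 1], b = sa[i], len = 0;
        while (a + len < n && b + len < n && s[a + len] == s[b + len]) len++;
        height[i] = len;
    }
    return height;
}

// h[i] = height[rank[i]], where rank is the inverse permutation of sa.
std::vector<int> h_from_height(const std::vector<int>& sa, const std::vector<int>& height) {
    int n = static_cast<int>(sa.size());
    std::vector<int> rank(n), h(n);
    for (int i = 0; i < n; i++) rank[sa[i]] = i;
    for (int i = 0; i < n; i++) h[i] = height[rank[i]];
    return h;
}
```

For "banana" with sa = 5 3 1 0 4 2, height is 0 1 3 0 0 2 and h is 0 3 2 1 0 0. Reading h in text order shows why it is the convenient one: it drops by at most 1 from one position to the next.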

To obtain the h array (and hence the height array) quickly, we use the theorem h[i]>=h[i-1]-1; the proof is omitted.

With this theorem, the h array (and the height array) can be computed in O(n) time: each new h[i] starts from the previous h[i-1]-1 and only checks how many further characters match. The match length drops by at most 1 per step and never exceeds n, so the total number of comparisons is linear. Very slick; zj is hooked on it.

Of course, besides this, I also did some network flow problems today, which I will write up together in the problem summary.

Oh, and I have not yet looked at the 13 standard suffix array techniques; I will go through them once I have learned network flow properly.

Mood: in a few days we leave for the provincial team selection, and I am a little excited to get out. I do not know why lgh has been ignoring me and ZJ lately; if the three of us end up sharing a triple room it will certainly get awkward, and we will just have to improvise. Maybe lgh and Lence_ren can take a double room while ZJ, zbh and I go together ...
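The linear-time computation is commonly attributed to Kasai et al. Here is a minimal sketch under the same assumptions as before (0-indexed `sa`, `height[0] = 0`); `height_linear` is a name chosen for this example.

```cpp
#include <string>
#include <vector>

// Computes height in O(n) by visiting suffixes in text order and reusing h[i-1]-1.
std::vector<int> height_linear(const std::string& s, const std::vector<int>& sa) {
    int n = static_cast<int>(s.size());
    std::vector<int> rank(n), height(n, 0);
    for (int i = 0; i < n; i++) rank[sa[i]] = i;
    int len = 0; // carries h[i-1]-1 into the next iteration
    for (int i = 0; i < n; i++) {
        if (rank[i] == 0) { len = 0; continue; } // no predecessor in sorted order
        int j = sa[rank[i] - 1];                 // suffix ranked just before suffix i
        while (i + len < n && j + len < n && s[i + len] == s[j + len]) len++;
        height[rank[i]] = len;                   // this is h[i]
        if (len > 0) len--;                      // theorem: h[i+1] >= h[i]-1
    }
    return height;
}
```

Once height is known, the LCP of any two suffixes is the minimum of height over the rank interval between them (from the smaller rank plus one up to the larger rank), which is how the array gets used when looking for LCPs.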


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
