I took a few days off from blogging (even though we are skipping regular classes to train), simply because I was too lazy to write. During that time I studied network flow; a summary of the problems will come later. Network flow is mostly about modeling, and I'm working through it slowly.
Suffix arrays are really hard to get into. When the teacher covered them I zoned out for a good 2 hours, and by the end I was still completely lost. I understood the idea, but had no clue what the code meant. Ugh.
Then one night I didn't hand in my phone, sat on my bed in the dorm, and chewed through the code from 23:00 to 00:30 until I finally understood it. Ugh.
----------------------------------------------------------------------------Split Line
Hey, enough boring ranting, get to the point!
Fine, fine.
The idea for computing a suffix array is roughly as follows (prefix doubling):
PS: When sorting, if two keys are equal, the one that appeared first stays in front, i.e. the sort is stable. I wasn't sure at first why this is needed, and I suspect the opposite tie-breaking rule could also work, but the two sorting passes must be consistent with each other: the pass on the first keyword has to preserve the order already established by the second keyword, otherwise ties on the first keyword lose that order.
1. Sort the suffixes by their first character alone to get the initial SA array (radix sort; the code is a bit fiddly, I'll spare you the complaining).
2. Enumerate the length k, for (int k=1;k<=n;k<<=1), which is the doubling step. First sort by the second keyword (directly reusing the SA array from the previous round), then sort by the first keyword (radix sort with buckets). This gives the SA array for the doubled length.
That's it. Simple, isn't it?? (I must be an idiot.)
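To see the doubling idea without the radix sort getting in the way, here is a minimal sketch that uses std::sort on the pair (first keyword, second keyword). The function name build_sa_simple is made up for illustration and is not from the referenced code; it runs in O(n log^2 n) instead of O(n log n), so treat it as a reference for cross-checking rather than the real thing.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper (not the da() routine from this post):
// suffix array by prefix doubling, using std::sort instead of radix sort.
std::vector<int> build_sa_simple(const std::string& s) {
    int n = (int)s.size();
    std::vector<int> sa(n), rnk(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);                 // sa = 0, 1, ..., n-1
    for (int i = 0; i < n; i++) rnk[i] = (unsigned char)s[i];
    for (int k = 1;; k <<= 1) {
        // Second keyword: rank of the k characters starting at i + k,
        // or -1 (smaller than any rank) if that part is empty.
        auto key2 = [&](int i) { return i + k < n ? rnk[i + k] : -1; };
        auto cmp = [&](int a, int b) {
            if (rnk[a] != rnk[b]) return rnk[a] < rnk[b];
            return key2(a) < key2(b);
        };
        std::sort(sa.begin(), sa.end(), cmp);
        // Recompute ranks: equal (first, second) keyword pairs share a rank.
        if (n > 0) tmp[sa[0]] = 0;
        for (int i = 1; i < n; i++)
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp(sa[i - 1], sa[i]) ? 1 : 0);
        rnk = tmp;
        // Stop once all ranks are distinct (the same early exit as in da()).
        if (n == 0 || rnk[sa[n - 1]] == n - 1 || k >= n) break;
    }
    return sa;
}
```

Because the comparison looks at both keywords at once, stability does not matter here; it only becomes an issue when the sort is split into two bucket passes.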
It's worth noting that there are two optimizations in the code:
1. Once all positions have different ranks, i.e. there are n distinct ranks, break out of the loop. Looking at more characters is pointless, because the first keyword alone already decides the order.
2. Every time the x array (the rank array, I think that's what it should be called) is recomputed, p counts the number of distinct ranks from the previous round (in plain words: equal elements get lumped into one group). So we can shrink m, the number of buckets, down to p.
For detailed illustrations and a correctness argument, see:
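Both the stability rule from the PS and the role of m show up in a single bucket pass. The following is a small sketch of one stable counting sort; the name counting_sort_by_key and the vector-based interface are my own choices for illustration, not part of the referenced code.

```cpp
#include <vector>

// Hypothetical helper: returns the indices 0..n-1 ordered by key[i],
// where every key is assumed to lie in [0, m). Ties keep their input order.
std::vector<int> counting_sort_by_key(const std::vector<int>& key, int m) {
    int n = (int)key.size();
    std::vector<int> cnt(m, 0), order(n);
    for (int i = 0; i < n; i++) cnt[key[i]]++;            // bucket sizes
    for (int v = 1; v < m; v++) cnt[v] += cnt[v - 1];     // end position of each bucket
    // Walking backwards fills each bucket from its end, so an element
    // that came earlier in the input also ends up earlier in its bucket.
    for (int i = n - 1; i >= 0; i--) order[--cnt[key[i]]] = i;
    return order;
}
```

Each pass costs O(n + m), which is why replacing m with the number of distinct ranks p after every round pays off when the initial alphabet bound is large.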
http://blog.csdn.net/yxuanwkeith/article/details/50636898
The code is attached here (a detailed version from the web, with tons of comments).
CODE:
#include <algorithm>
using namespace std;

// Assumption: MAXN is a placeholder bound; the original snippet does not show these declarations.
const int MAXN = 100005;
int wa[MAXN], wb[MAXN], ws[MAXN];

// r: input as integers, sa: output suffix array, n: length, m: all r[i] must be < m.
// Assumption: r ends with a unique smallest sentinel (e.g. 0), so y[sa[i]+k] below stays in bounds.
void da(int *r, int *sa, int n, int m)
{
    int i, k, p, *x = wa, *y = wb; // x acts as the rank array, y holds the order by second keyword
    for (i = 0; i < m; i++) ws[i] = 0; // ws is the bucket array for the radix sort
    for (i = 0; i < n; i++) ws[x[i] = r[i]]++; // x is only used for comparisons, so real ranks are not needed yet
    for (i = 1; i < m; i++) ws[i] += ws[i - 1]; // prefix sums of ws
    for (i = n - 1; i >= 0; i--) sa[--ws[x[i]]] = i;
    // first SA computed, e.g. 0 2 1 3

    for (k = 1; k <= n; k <<= 1) // doubling, k is the current length
    {
        p = 0;
        // sort by the second keyword, directly reusing the SA from the previous round
        for (i = n - k; i < n; i++) y[p++] = i; // an empty second keyword is the smallest, so these go first
        for (i = 0; i < n; i++) if (sa[i] >= k) y[p++] = sa[i] - k;
        // a suffix that still exists after shifting left by k (sa[i] >= k) is appended in SA order,
        // which gives the second-keyword order, e.g. 3 1 0 2

        // sort by the first keyword
        for (i = 0; i < m; i++) ws[i] = 0;
        for (i = 0; i < n; i++) ws[x[y[i]]]++;
        for (i = 1; i < m; i++) ws[i] += ws[i - 1];
        for (i = n - 1; i >= 0; i--) sa[--ws[x[y[i]]]] = y[i]; // e.g. 0 2 3 1

        // the next round needs rank again; it will end up in x, and y (now free) keeps the old rank
        swap(x, y); // x and y are pointers, so swapping them swaps the arrays they refer to

        // compute the new rank from the new SA: equal strings must get equal ranks,
        // so check whether both halves (rank[i] and rank[i+k]) match, otherwise increase by 1
        p = 1, x[sa[0]] = 0;
        for (i = 1; i < n; i++)
            x[sa[i]] = y[sa[i - 1]] == y[sa[i]] && y[sa[i - 1] + k] == y[sa[i] + k] ? p - 1 : p++;
        if (p >= n) break; // optimization: all ranks are distinct, so we are done
        m = p; // optimization: the largest key is now below p, so m = p
    }
}
Application: LCP
In actual applications (finding the LCP), however, you need the height array. It is a bit convoluted, but a closer look makes it clear.
Here's a blog post that explains it very clearly:
http://www.cnblogs.com/LLGemini/p/4771235.html
One very important thing is the relation between the h array and the height array: h[i]=height[rank[i]]. Remember this, because height[i] is awkward to compute directly; we compute h[i] and fill in the height array through the formula above.
To obtain the h array (and with it the height array) quickly, we use the theorem h[i]>=h[i-1]-1 (proof omitted).
With this theorem we can compute the h array (the height array) in O(n) time, because each new h[i] starts from h[i-1]-1 and only then checks whether more characters match. Very cool; zj got hooked on it.
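The O(n) computation can be sketched as follows. I am assuming the usual convention that height[i] is the LCP of the suffixes sa[i-1] and sa[i], with height[0] = 0; the function name build_height and the std::string interface are made up for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: given s and its suffix array sa, returns height[],
// where height[i] = LCP(suffix sa[i-1], suffix sa[i]) and height[0] = 0.
std::vector<int> build_height(const std::string& s, const std::vector<int>& sa) {
    int n = (int)s.size();
    std::vector<int> rnk(n), height(n, 0);
    for (int i = 0; i < n; i++) rnk[sa[i]] = i;   // rank is the inverse of sa
    int h = 0;                                    // h plays the role of h[i] = height[rank[i]]
    for (int i = 0; i < n; i++) {
        if (rnk[i] == 0) { h = 0; continue; }     // smallest suffix has no predecessor
        if (h > 0) h--;                           // theorem: h[i] >= h[i-1] - 1
        int j = sa[rnk[i] - 1];                   // suffix just before suffix i in sorted order
        while (i + h < n && j + h < n && s[i + h] == s[j + h]) h++;
        height[rnk[i]] = h;
    }
    return height;
}
```

For "banana" with sa = 5 3 1 0 4 2 this gives height = 0 1 3 0 0 2. Since h drops by at most 1 per step and never exceeds n, the inner while loop runs O(n) times in total.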
Of course, besides studying this today, I also did some network flow problems, but I'll write those up together in the later problem summary.
Oh, and I haven't looked at the 13 specific suffix array techniques yet; I think I'll get to them once I'm done learning network flow.
Mood: in a few days we leave for the provincial selection contest. I'm a little excited, and hey, we get to travel. I don't know why lgh has been ignoring me and ZJ lately; if the three of us end up sharing a triple room, things will surely get awkward, and we'll just have to improvise. Maybe lgh and Lence_ren can take a double room, and I, ZJ and zbh can room together ...
"Fifth day of training • suffix array" wow haha ~