19. String Sorting Algorithms

Source: Internet
Author: User
Tags: index, sort

Alphabet class

Some applications restrict the alphabet that their strings are drawn from. In such applications it is often useful to have an API that represents the alphabet. The Alphabet class below is given only for reference; the algorithms discussed later do not use it.

    public class Alphabet {

        /** The binary alphabet {0, 1}. */
        public static final Alphabet BINARY = new Alphabet("01");

        /** The octal alphabet {0, 1, 2, 3, 4, 5, 6, 7}. */
        public static final Alphabet OCTAL = new Alphabet("01234567");

        /** The decimal alphabet {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. */
        public static final Alphabet DECIMAL = new Alphabet("0123456789");

        /** The hexadecimal alphabet {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. */
        public static final Alphabet HEXADECIMAL = new Alphabet("0123456789ABCDEF");

        /** The DNA alphabet {A, C, T, G}. */
        public static final Alphabet DNA = new Alphabet("ACTG");

        /** The lowercase alphabet {a, b, c, ..., z}. */
        public static final Alphabet LOWERCASE = new Alphabet("abcdefghijklmnopqrstuvwxyz");

        /** The uppercase alphabet {A, B, C, ..., Z}. */
        public static final Alphabet UPPERCASE = new Alphabet("ABCDEFGHIJKLMNOPQRSTUVWXYZ");

        /** The protein alphabet {A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y}. */
        public static final Alphabet PROTEIN = new Alphabet("ACDEFGHIKLMNPQRSTVWY");

        /** The base-64 alphabet (64 characters). */
        public static final Alphabet BASE64 =
            new Alphabet("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");

        /** The ASCII alphabet (0-127). */
        public static final Alphabet ASCII = new Alphabet(128);

        /** The extended ASCII alphabet (0-255). */
        public static final Alphabet EXTENDED_ASCII = new Alphabet(256);

        /** The Unicode 16 alphabet (0-65,535). */
        public static final Alphabet UNICODE16 = new Alphabet(65536);

        private char[] alphabet;  // the characters in the alphabet
        private int[] inverse;    // inverse[c] is the index of character c, or -1
        private int R;            // the radix of the alphabet

        // Builds an alphabet from the characters of the given string.
        public Alphabet(String alpha) {
            // check that the alphabet contains no duplicate chars
            boolean[] unicode = new boolean[Character.MAX_VALUE];
            for (int i = 0; i < alpha.length(); i++) {
                char c = alpha.charAt(i);
                if (unicode[c])
                    throw new IllegalArgumentException("Illegal alphabet: repeated character = '" + c + "'");
                unicode[c] = true;
            }

            alphabet = alpha.toCharArray();
            R = alpha.length();
            inverse = new int[Character.MAX_VALUE];
            for (int i = 0; i < inverse.length; i++)
                inverse[i] = -1;

            // can't use char as the loop variable since R can be as big as 65,536
            for (int c = 0; c < R; c++)
                inverse[alphabet[c]] = c;
        }

        // Builds an alphabet from the characters 0 through R-1.
        private Alphabet(int R) {
            alphabet = new char[R];
            inverse = new int[R];
            this.R = R;

            // can't use char as the loop variable since R can be as big as 65,536
            for (int i = 0; i < R; i++)
                alphabet[i] = (char) i;
            for (int i = 0; i < R; i++)
                inverse[i] = i;
        }

        // Defaults to the extended ASCII alphabet.
        public Alphabet() {
            this(256);
        }

        // Is the character in the alphabet? (bounds check added to avoid an out-of-range index)
        public boolean contains(char c) {
            return c < inverse.length && inverse[c] != -1;
        }

        // Number of characters in the alphabet (the radix).
        public int R() {
            return R;
        }

        // Number of bits needed to represent an index.
        public int lgR() {
            int lgR = 0;
            for (int t = R - 1; t >= 1; t /= 2)
                lgR++;
            return lgR;
        }

        // Index of character c, between 0 and R-1.
        public int toIndex(char c) {
            if (c >= inverse.length || inverse[c] == -1) {
                throw new IllegalArgumentException("Character " + c + " not in alphabet");
            }
            return inverse[c];
        }

        // Indices of the characters of s.
        public int[] toIndices(String s) {
            char[] source = s.toCharArray();
            int[] target = new int[s.length()];
            for (int i = 0; i < source.length; i++)
                target[i] = toIndex(source[i]);
            return target;
        }

        // Character at the given index.
        public char toChar(int index) {
            if (index < 0 || index >= R) {
                throw new IndexOutOfBoundsException("Alphabet index out of bounds");
            }
            return alphabet[index];
        }

        // String made of the characters at the given indices.
        public String toChars(int[] indices) {
            StringBuilder s = new StringBuilder(indices.length);
            for (int i = 0; i < indices.length; i++)
                s.append(toChar(indices[i]));
            return s.toString();
        }
    }

String sorting by key-indexed counting

The input is a list of strings, each with an associated group number; the group number serves as the string's sort key. The goal is to order the strings by group, from smallest to largest, while strings within the same group keep their alphabetical input order.

1. Count the frequency of each group. The counts determine the range a string occupies after sorting: it must come after every string in a smaller group and before every string in a larger group.
The count array records the frequencies. The count for key k is stored at index k + 1; the offset of 1 makes it easy to turn the counts into starting positions in the next step.

2. Convert the counts to indices, which gives the starting position of each group.

3. Distribute the data.
Go through the strings in input order (in this example the input is already in alphabetical order) and move each one to the current starting position of its group. After placing a string, increase that group's starting position by 1 so that the next string of the same group lands in the following slot.


Key-indexed counting is stable: strings with equal keys keep their relative input order.
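
The three steps above can be sketched as follows. This is a minimal illustration, not the author's code: the class name KeyIndexedCounting, the sample names and their group numbers are made up for the example, and keys are assumed to be integers in the range 0 to R-1.

```java
import java.util.Arrays;

class KeyIndexedCounting {
    // Returns the names ordered by key (0 <= key < R).
    // Names with equal keys keep their input order, so the sort is stable.
    static String[] sort(String[] names, int[] keys, int R) {
        int n = names.length;
        int[] count = new int[R + 1];

        // Step 1: frequency of each key, stored at index key + 1.
        for (int i = 0; i < n; i++)
            count[keys[i] + 1]++;

        // Step 2: running totals; count[k] becomes the first slot of group k.
        for (int r = 0; r < R; r++)
            count[r + 1] += count[r];

        // Step 3: place each name at its group's next free slot.
        String[] aux = new String[n];
        for (int i = 0; i < n; i++)
            aux[count[keys[i]]++] = names[i];
        return aux;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: names in alphabetical order with group numbers 1..4.
        String[] names = {"Anderson", "Brown", "Davis", "Garcia", "Harris", "Jackson"};
        int[] keys = {2, 3, 3, 4, 1, 3};
        System.out.println(Arrays.toString(sort(names, keys, 5)));
        // prints [Harris, Anderson, Brown, Davis, Jackson, Garcia]
    }
}
```

Brown, Davis and Jackson share group 3 and come out in the same order they went in, which is the stability property stated above.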

LSD (least-significant-digit-first) string sort

LSD string sort builds on key-indexed counting and applies to strings that all have the same length. Moving from the last character of the strings (the rightmost position) to the first, it uses the character at each position in turn as the sort key and sorts the whole array on it. After each pass the order may differ from the previous one, but once the leftmost position has been processed the strings are fully sorted. Stability is what makes this work: if two strings have the same character at the current position, the pass keeps them in the order established by the earlier passes on the positions to the right.

    // StdIn and StdOut are the I/O helper classes from the Princeton algs4 library.
    import edu.princeton.cs.algs4.StdIn;
    import edu.princeton.cs.algs4.StdOut;

    public class LSD {
        private static final int BITS_PER_BYTE = 8;

        // do not instantiate
        private LSD() { }

        // Sorts an array of strings that all have length w.
        public static void sort(String[] a, int w) {
            int n = a.length;
            int R = 256;   // extended ASCII alphabet size
            String[] aux = new String[n];

            for (int d = w - 1; d >= 0; d--) {
                // sort by key-indexed counting on the dth character

                // compute frequency counts
                int[] count = new int[R + 1];
                for (int i = 0; i < n; i++)
                    count[a[i].charAt(d) + 1]++;

                // compute cumulates
                for (int r = 0; r < R; r++)
                    count[r + 1] += count[r];

                // move data
                for (int i = 0; i < n; i++)
                    aux[count[a[i].charAt(d)]++] = a[i];

                // copy back
                for (int i = 0; i < n; i++)
                    a[i] = aux[i];
            }
        }

        // Sorts an array of 32-bit integers, one byte at a time.
        public static void sort(int[] a) {
            final int BITS = 32;                  // each int is 32 bits
            final int w = BITS / BITS_PER_BYTE;   // each int is 4 bytes
            final int R = 1 << BITS_PER_BYTE;     // each byte is between 0 and 255
            final int MASK = R - 1;               // 0xFF

            int n = a.length;
            int[] aux = new int[n];

            for (int d = 0; d < w; d++) {

                // compute frequency counts
                int[] count = new int[R + 1];
                for (int i = 0; i < n; i++) {
                    int c = (a[i] >> BITS_PER_BYTE * d) & MASK;
                    count[c + 1]++;
                }

                // compute cumulates
                for (int r = 0; r < R; r++)
                    count[r + 1] += count[r];

                // for the most significant byte, 0x80-0xFF comes before 0x00-0x7F
                // (negative numbers sort before non-negative ones)
                if (d == w - 1) {
                    int shift1 = count[R] - count[R / 2];
                    int shift2 = count[R / 2];
                    for (int r = 0; r < R / 2; r++)
                        count[r] += shift1;
                    for (int r = R / 2; r < R; r++)
                        count[r] -= shift2;
                }

                // move data
                for (int i = 0; i < n; i++) {
                    int c = (a[i] >> BITS_PER_BYTE * d) & MASK;
                    aux[count[c]++] = a[i];
                }

                // copy back
                for (int i = 0; i < n; i++)
                    a[i] = aux[i];
            }
        }

        public static void main(String[] args) {
            String[] a = StdIn.readAllStrings();
            int n = a.length;

            // check that the strings have fixed length
            int w = a[0].length();
            for (int i = 0; i < n; i++)
                assert a[i].length() == w : "Strings must have fixed length";

            // sort the strings
            sort(a, w);

            // print results
            for (int i = 0; i < n; i++)
                StdOut.println(a[i]);
        }
    }
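
To watch the order change from pass to pass, here is a small self-contained sketch that prints the array after each position is processed. It is an illustration only: the class name LsdTrace and the sample strings are made up, and it assumes equal-length strings whose characters are all below 256.

```java
import java.util.Arrays;

class LsdTrace {
    // One stable key-indexed counting pass on character position d.
    static void pass(String[] a, int d) {
        int R = 256;  // assumed alphabet size (extended ASCII)
        int n = a.length;
        int[] count = new int[R + 1];
        String[] aux = new String[n];
        for (String s : a) count[s.charAt(d) + 1]++;           // frequencies
        for (int r = 0; r < R; r++) count[r + 1] += count[r];  // start positions
        for (String s : a) aux[count[s.charAt(d)]++] = s;      // distribute
        System.arraycopy(aux, 0, a, 0, n);                     // copy back
    }

    // Sorts strings of length w, printing the array after each pass.
    static void sort(String[] a, int w) {
        for (int d = w - 1; d >= 0; d--) {
            pass(a, d);
            System.out.println("after position " + d + ": " + Arrays.toString(a));
        }
    }

    public static void main(String[] args) {
        String[] a = {"bca", "abc", "bab", "acb", "cba"};
        sort(a, 3);
        // after position 2: [bca, cba, bab, acb, abc]
        // after position 1: [bab, cba, abc, bca, acb]
        // after position 0: [abc, acb, bab, bca, cba]
    }
}
```

After the pass on position 1 the array is not yet sorted, but abc already precedes acb and bab precedes bca; the final pass on position 0 keeps those relative orders because it is stable.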

MSD (most-significant-digit-first) string sort

MSD string sort works from left to right and handles strings whose lengths are not necessarily the same. Unlike the right-to-left method above, it first sorts on the leading character, which partitions the strings into groups that share that character. It then ignores that character and recursively sorts each group on the remaining characters to its right.
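
The source gives no code for this left-to-right method, which is commonly known as MSD string sort. A minimal sketch under stated assumptions: the class name MsdSketch and the sample words are made up, characters are assumed to be below 256, and the end of a string is treated as a character smaller than all others so that shorter strings come first. Small-subarray optimizations are left out.

```java
import java.util.Arrays;

class MsdSketch {
    private static final int R = 256;  // assumed alphabet size (extended ASCII)

    // Character of s at position d, or -1 when s has no such position.
    private static int charAt(String s, int d) {
        return d < s.length() ? s.charAt(d) : -1;
    }

    static void sort(String[] a) {
        sort(a, new String[a.length], 0, a.length - 1, 0);
    }

    // Sorts a[lo..hi], whose strings already agree on their first d characters.
    private static void sort(String[] a, String[] aux, int lo, int hi, int d) {
        if (hi <= lo) return;

        // Key-indexed counting on position d; keys are shifted by 1 because of the -1.
        int[] count = new int[R + 2];
        for (int i = lo; i <= hi; i++) count[charAt(a[i], d) + 2]++;
        for (int r = 0; r < R + 1; r++) count[r + 1] += count[r];
        for (int i = lo; i <= hi; i++) aux[count[charAt(a[i], d) + 1]++] = a[i];
        for (int i = lo; i <= hi; i++) a[i] = aux[i - lo];

        // Recursively sort each group that shares character r at position d.
        // Strings that ended at position d form the first group and need no more work.
        for (int r = 0; r < R; r++)
            sort(a, aux, lo + count[r], lo + count[r + 1] - 1, d + 1);
    }

    public static void main(String[] args) {
        String[] a = {"she", "sells", "seashells", "by", "the", "sea", "shore", "shells", "are"};
        sort(a);
        System.out.println(Arrays.toString(a));
        // prints [are, by, sea, seashells, sells, she, shells, shore, the]
    }
}
```

Note that each recursive call allocates its own count array of R + 2 entries, so this plain version can be slow when there are many tiny groups.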
