Sorting strings with the radix sort algorithm in C#: a sample tutorial

Source: Internet
Author: User

Before you start

Let L be the length of the longest string in the input, and assume that every string is "padded" to this length. The padding is purely logical: we assume there is a "null character" that is smaller than any other character, and every string that is too short is filled with it. For example, if the longest string has length 9 and a string a has length 6, then when the 7th character is compared, a is treated as having the "null character" at that position.

Because it is not easy to cover every possible character, we first define a character set and assume that every character in the strings to be sorted belongs to it.

// Character set
private string _mycharset = "0123456789QWERTYUIOPASDFGHJKLZXCVBNM";

Next, a method for generating random strings (C# implementation):

// Requires: using System; using System.Text;
private Random _random = new Random();

string[] GetRandStrings(int size, int minLength, int maxLength)
{
  string[] strs = new string[size];
  int len = 0;
  StringBuilder sb = new StringBuilder(maxLength);

  for (int i = 0; i < strs.Length; i++)
  {
    // First randomly determine a length (the upper bound of Next is exclusive)
    len = _random.Next(minLength, maxLength);
    for (int j = 0; j < len; j++)
    {
      // Randomly select a character from the character set
      sb.Append(_mycharset[_random.Next(_mycharset.Length)]);
    }
    strs[i] = sb.ToString();
    sb.Clear();
  }
  return strs;
}

Here the range of buckets is determined from the integer values of the characters, and one extra bucket is prepared for the "null character". The "null character" is represented by default(char), that is '\0', because the ElementAtOrDefault(int) extension method, when called on a string with an index that is out of range, returns '\0'.
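The padding behaviour can be checked in isolation. The following is a minimal sketch, assuming a hypothetical helper named CharAt (not part of the original code) that simply wraps ElementAtOrDefault:

```csharp
using System;
using System.Linq;

static class PaddingDemo
{
    // Hypothetical helper: returns the character at index, or '\0' when the
    // index is past the end of the string (the logical "null character").
    public static char CharAt(string s, int index)
    {
        return s.ElementAtOrDefault(index);
    }

    static void Main()
    {
        string a = "QWERTY"; // length 6
        Console.WriteLine(CharAt(a, 2) == 'E');           // True
        Console.WriteLine(CharAt(a, 6) == default(char)); // True: index 6 is out of range
        Console.WriteLine(default(char) < '0');           // True: '\0' is smaller than '0'
    }
}
```

Because '\0' has the integer value 0, it compares lower than every character in the character set, which is what the logical padding requires.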

Basic version (C#)

// Requires: using System.Collections.Generic; using System.Linq;
void StringRadixSort(string[] strArray)
{
  if (strArray == null || strArray.Length == 0 || strArray.Contains(null))
  {
    return;
  }

  // Get the maximum length of the strings
  int maxLength = 0;
  foreach (string s in strArray)
  {
    if (s.Length > maxLength)
    {
      maxLength = s.Length;
    }
  }

  // Determine the integer range of the characters
  int rangeStart = _mycharset[0];
  int rangeEnd = _mycharset[0];
  foreach (char ch in _mycharset)
  {
    if (ch < rangeStart)
      rangeStart = ch;
    if (ch >= rangeEnd)
      rangeEnd = ch + 1;
  }

  // Also allocate a bucket for the "null character", at index 0
  int bucketCount = rangeEnd - rangeStart + 1;

  LinkedList<string>[] buckets = new LinkedList<string>[bucketCount];
  // Initialize all buckets
  for (int i = 0; i < buckets.Length; i++)
  {
    buckets[i] = new LinkedList<string>();
  }

  // Start sorting from the last character
  int currentIndex = maxLength - 1;
  while (currentIndex >= 0)
  {
    foreach (string theString in strArray)
    {
      // If the index is out of range, this returns '\0' (default(char))
      char ch = theString.ElementAtOrDefault(currentIndex);
      if (ch == default(char))
      {
        // Handle the "null character"
        buckets[0].AddLast(theString);
      }
      else
      {
        // Map the character to its bucket
        int index = ch - rangeStart + 1;
        buckets[index].AddLast(theString);
      }
    }

    // Take the strings out of the buckets in order to complete one pass
    int i = 0;
    foreach (LinkedList<string> bucket in buckets)
    {
      while (bucket.Count > 0)
      {
        strArray[i++] = bucket.First();
        bucket.RemoveFirst();
      }
    }
    currentIndex--;
  }
}

A small "improvement"

The code that determines the integer range of the characters is a little awkward. Depending on the character set, not every integer in that range corresponds to a character that actually occurs, so buckets are allocated for characters that never appear, which is wasteful. Instead, we can use a dictionary (hash table) to record the mapping between each character and its bucket, which gives the following code.

// Requires: using System; using System.Collections.Generic; using System.Linq;
// A field initializer cannot reference another instance field such as
// _mycharset, so the capacity argument is omitted here.
private Dictionary<char, int> _charOrderDict = new Dictionary<char, int>();

void BuildCharOrderDict()
{
  char[] sortedCharSet = _mycharset.ToArray();
  // Sort using the default comparer
  Array.Sort(sortedCharSet);
  // Create a separate mapping for the "null character"
  _charOrderDict.Add(default(char), 0);
  for (int i = 0; i < sortedCharSet.Length; i++)
  {
    // Save the character and the index of its bucket
    _charOrderDict.Add(sortedCharSet[i], i + 1);
  }
}

You can also define your own ordering among the characters instead of using the default character order for the mapping. Here is the adjusted code:

void StringRadixSort(string[] strArray)
{
  if (strArray == null || strArray.Length == 0 || strArray.Contains(null))
  {
    return;
  }

  // Get the maximum length of the strings
  int maxLength = 0;
  foreach (string s in strArray)
  {
    if (s.Length > maxLength)
    {
      maxLength = s.Length;
    }
  }

  // Allocate a bucket for each character, including the null character '\0';
  // the "null character" bucket has index 0
  int bucketCount = _mycharset.Length + 1;

  LinkedList<string>[] buckets = new LinkedList<string>[bucketCount];
  // Initialize all buckets
  for (int i = 0; i < buckets.Length; i++)
  {
    buckets[i] = new LinkedList<string>();
  }

  // Start sorting from the last character
  int currentIndex = maxLength - 1;
  while (currentIndex >= 0)
  {
    foreach (string theString in strArray)
    {
      // If the index is out of range, this returns '\0' (default(char))
      char ch = theString.ElementAtOrDefault(currentIndex);
      // Look up the bucket according to the character order definition
      int index = _charOrderDict[ch];
      buckets[index].AddLast(theString);
    }

    // Take the strings out of the buckets in order to complete one pass
    int i = 0;
    foreach (LinkedList<string> bucket in buckets)
    {
      while (bucket.Count > 0)
      {
        strArray[i++] = bucket.First();
        bucket.RemoveFirst();
      }
    }
    currentIndex--;
  }
}
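Since the adjusted sort only looks characters up in a dictionary, a custom ordering can be supplied by building that dictionary from any ordered string. The sketch below assumes a hypothetical helper, BuildCustomOrderDict, in which the position of a character in the string defines its rank. It is one possible way to do this, not the original author's code:

```csharp
using System;
using System.Collections.Generic;

static class CustomOrderDemo
{
    // Hypothetical helper: the position of each character in `order`
    // defines its rank; bucket 0 stays reserved for the "null character".
    // Duplicate characters in `order` would make Add throw.
    public static Dictionary<char, int> BuildCustomOrderDict(string order)
    {
        var dict = new Dictionary<char, int>(order.Length + 1);
        dict.Add(default(char), 0);
        for (int i = 0; i < order.Length; i++)
        {
            dict.Add(order[i], i + 1);
        }
        return dict;
    }

    static void Main()
    {
        // Letters before digits: the reverse of the default ordering.
        Dictionary<char, int> dict = BuildCustomOrderDict("ABC012");
        Console.WriteLine(dict['A']); // 1
        Console.WriteLine(dict['0']); // 4
    }
}
```

With such a dictionary, 'A' lands in a lower bucket than '0', so strings starting with letters would sort before strings starting with digits.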

Now it works! If quicksort is used instead, the time complexity is O(n log n). On the surface radix sort looks better, but strictly speaking the time complexity of radix sort is O(k * n), where k is positively correlated with the length of the strings. The two algorithms can therefore be roughly compared by comparing k with log n. If the strings are very long, that is, k is large, and the input size n is small, then k > log n and quicksort has the advantage. Otherwise, radix sort may be the better choice.
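The comparison of k with log n can be turned into a quick back-of-the-envelope check. The helper name RadixLooksCheaper is hypothetical, and the rule ignores constant factors and the cost of each string comparison, so treat it as a rough sketch only:

```csharp
using System;

static class CostEstimate
{
    // Rough rule of thumb: radix sort does about k * n bucket operations,
    // a comparison sort about n * log2(n) comparisons, so compare k with log2(n).
    public static bool RadixLooksCheaper(int n, int k)
    {
        return k < Math.Log(n, 2);
    }

    static void Main()
    {
        Console.WriteLine(RadixLooksCheaper(1000000, 8)); // True: log2(1,000,000) is about 19.9
        Console.WriteLine(RadixLooksCheaper(1000, 50));   // False: log2(1,000) is about 10
    }
}
```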

Finally...

The sad part is that when I enlarged the character set to include every character on the keyboard, the result of the radix sort no longer matched the result of the Array.Sort(string[]) method. A close look at how the file explorer orders file names shows that the rules for ordering strings are more complex than simply comparing characters. Further reading shows that string ordering even has to take regional culture into account: even for the Latin alphabet, the collation may differ from region to region. A string sorting algorithm implemented with radix sort therefore does not seem to have much practical value <T-T>.
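The mismatch can be reproduced with a few mixed-case words. The sketch below contrasts the default, culture-sensitive Array.Sort with an ordinal sort. The class and method names are made up for illustration, and the culture-sensitive output is left without an expected value because it depends on the runtime's culture settings:

```csharp
using System;

static class ComparerDemo
{
    // Sorts a copy of the input by raw UTF-16 code unit values.
    public static string[] SortOrdinal(string[] input)
    {
        string[] copy = (string[])input.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }

    static void Main()
    {
        string[] words = { "apple", "Banana", "cherry" };

        // Ordinal: 'B' (66) precedes 'a' (97).
        Console.WriteLine(string.Join(",", SortOrdinal(words))); // Banana,apple,cherry

        // Default: culture-sensitive, result depends on the current culture.
        string[] byCulture = (string[])words.Clone();
        Array.Sort(byCulture);
        Console.WriteLine(string.Join(",", byCulture));
    }
}
```

A radix sort that buckets by character value behaves like the ordinal comparer, which is why its output can differ from the default Array.Sort(string[]).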
