Generating a Random Chinese Character Verification Code in C#

Source: Internet
Author: User

1. Chinese character encoding principles
How can we generate Chinese characters at random? Where do the characters come from? Is there a back-end data table that stores all the required Chinese characters, from which a program picks a few at random? Storing every Chinese character in a back-end database and drawing from it is one approach, but how would you enter so many characters in the first place? In fact, no database is needed. To generate Chinese characters, you first need to understand how they are encoded.
In 1980, to give every Chinese character a unified national code, China issued its first national standard for Chinese character encoding: GB2312-80, "Chinese Character Coded Character Set for Information Interchange, Basic Set", referred to as gb2312. This character set is the foundation of Chinese information processing technology in China and the unified standard for all Chinese character systems in China. Later, the national standard GB18030-2000, "Chinese Character Coded Character Set for Information Interchange, Extension of the Basic Set", referred to as gb18030, was published. Programmers who work on encoding and localization should be very familiar with gb18030. It is the most important Chinese character encoding standard in China after GB2312-1980 and GB13000-1993, and one of the basic standards that Chinese computer systems must follow in the future.
On Simplified Chinese versions of Windows, the default code page in .NET programming is the Simplified Chinese code page 936 (GBK, a superset of gb2312), not gb18030 itself. For generating a Chinese character verification code, however, the gb2312 character set is more than enough. Even gb2312 contains many characters that most people do not recognize and rarely see. If the generated verification code contains many unfamiliar characters, that is bad news for users of Pinyin input methods, who cannot type a character they cannot pronounce; Wubi users can at least type a character from its shape. Therefore, we do not use all of the Chinese characters in the gb2312 character set.
A Chinese character can be identified by its location code (zone and position). See:

Chinese character location code table http://cs.scu.edu.cn/~Wangbo/Others/quweima.htm

Chinese character location code table http://navicy2005.home4u.china.com/resource/gb2312tbm.htm

In fact, the two tables show the same thing: one gives the location as hexadecimal bytes, the other as decimal zone and position numbers. For example, the hexadecimal code of "好" ("good") is BA C3. The first two digits give the zone and the last two give the position within the zone. BA is zone 26, and "好" is the 35th character in that zone, position C3, so its decimal location code is 2635. This is how gb2312 locates Chinese characters. From the location code table we can see that zone 15 (AF) contains no Chinese characters, only a few symbols. All Chinese characters start from zone 16 (B0), which is why the Chinese characters in the gb2312 character set begin at zone 16.
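
The relationship between the hexadecimal bytes and the decimal location code can be sketched in a few lines of C#. This is only an illustration: it assumes the usual gb2312 byte layout, in which each byte equals the zone or position number plus 0xA0, and the class and method names (QuweiDemo, ToLocationCode) are made up for this example.

```csharp
using System;

static class QuweiDemo
{
    // Convert the two gb2312 bytes of a character into its four-digit
    // decimal location code. Assumed layout: byte = number + 0xA0.
    public static string ToLocationCode(byte zoneByte, byte positionByte)
    {
        int zone = zoneByte - 0xA0;         // 0xBA -> 26
        int position = positionByte - 0xA0; // 0xC3 -> 35
        return zone.ToString("D2") + position.ToString("D2");
    }

    static void Main()
    {
        Console.WriteLine(ToLocationCode(0xBA, 0xC3)); // prints 2635
        Console.WriteLine(ToLocationCode(0xB0, 0xA1)); // prints 1601
    }
}
```

The second call shows the first position of zone 16 (B0), where the Chinese characters begin.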

2. How .NET programs process Chinese character encoding
In .NET, the System.Text namespace handles the encoding of all languages. It contains many encoding classes for manipulating and converting text, and the Encoding class is the one to focus on for Chinese character encoding. Looking up the Encoding class in the .NET documentation, we find that all of its text-encoding methods work with byte arrays. Two methods are especially useful:

The Encoding.GetBytes() method encodes all or part of a specified string or character array into a byte array.
The Encoding.GetString() method decodes a specified byte array into a string.

With these two methods we can encode a Chinese character into a gb2312 byte array and, conversely, decode a gb2312 byte array back into Chinese characters. For example, encode the character "好" ("good") into a byte array:

Encoding gb = System.Text.Encoding.GetEncoding("gb2312");
byte[] bytes = gb.GetBytes("好");

The result is a byte array of length 2. Convert each element to hexadecimal:

string lowCode = System.Convert.ToString(bytes[0], 16); // hexadecimal text of element 1 (two hexadecimal digits)
string hightCode = System.Convert.ToString(bytes[1], 16); // hexadecimal text of element 2 (two hexadecimal digits)

We find that the two bytes, written in hexadecimal, are exactly the hexadecimal location code of the character (see the location code table).
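
As a self-contained check of this round trip, here is a minimal sketch. It assumes a runtime where the gb2312 code page has to be registered first (.NET Core and .NET 5 or later, through CodePagesEncodingProvider); on the classic .NET Framework the registration line should be unnecessary. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class RoundTripDemo
{
    // Returns the gb2312 encoding. The RegisterProvider call is an assumption
    // for .NET Core / .NET 5+, where code-page encodings are not enabled by default.
    public static Encoding GetGb2312()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("gb2312");
    }

    static void Main()
    {
        Encoding gb = GetGb2312();

        // Encode: one Chinese character becomes two bytes.
        byte[] bytes = gb.GetBytes("好");
        Console.WriteLine(BitConverter.ToString(bytes)); // prints BA-C3

        // Decode: the same two bytes give the character back.
        string decoded = gb.GetString(new byte[] { 0xBA, 0xC3 });
        Console.WriteLine(decoded == "好"); // prints True
    }
}
```

The comparison is printed instead of the character itself because a console that does not use a Chinese-capable output encoding may not display "好" correctly.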
Therefore, we can randomly generate a two-element byte array and decode it with the GetString() method to obtain a Chinese character. For verification codes, however, some ranges must be excluded. Zone 15 (AF) contains no Chinese characters, only a few symbols, and all Chinese characters start from zone 16 (B0). In addition, the characters from zone D7 onward are complex, rarely seen characters, so they must be left out as well. The 1st hexadecimal digit of the code is therefore B, C, or D, and if the 1st digit is D, the 2nd digit cannot be 7 or higher. The location code table also shows that the first and last positions in each zone are empty and hold no Chinese characters. So if the 3rd digit is A, the 4th digit cannot be 0, and if the 3rd digit is F, the 4th digit cannot be F.
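
Taken together, these digit rules amount to two byte ranges: B0 to D6 for the first byte and A1 to FE for the second. The sketch below expresses the rules that way. It is a simplified alternative, not the author's program: it draws whole bytes from a single Random instance instead of drawing four hexadecimal digits, and the names RegionCodeSketch, IsCommonHanziCode, and NextCode are invented for illustration.

```csharp
using System;

static class RegionCodeSketch
{
    // Byte ranges implied by the digit rules:
    //   first byte  (zone):     0xB0 .. 0xD6
    //   second byte (position): 0xA1 .. 0xFE
    public static bool IsCommonHanziCode(byte first, byte second)
    {
        return first >= 0xB0 && first <= 0xD6
            && second >= 0xA1 && second <= 0xFE;
    }

    // Pick one random two-byte code inside those ranges.
    public static byte[] NextCode(Random rnd)
    {
        byte first = (byte)rnd.Next(0xB0, 0xD7);  // upper bound is exclusive
        byte second = (byte)rnd.Next(0xA1, 0xFF); // upper bound is exclusive
        return new byte[] { first, second };
    }

    static void Main()
    {
        Random rnd = new Random();
        for (int i = 0; i < 4; i++)
        {
            byte[] code = NextCode(rnd);
            Console.WriteLine(BitConverter.ToString(code)
                + " in range: " + IsCommonHanziCode(code[0], code[1]));
        }

        // Number of codes in the ranges: 39 zones * 94 positions.
        Console.WriteLine((0xD6 - 0xB0 + 1) * (0xFE - 0xA1 + 1)); // prints 3666
    }
}
```

One difference from the digit-by-digit approach: picking the first digit evenly among B, C, and D makes each of the seven D zones more likely than each of the sixteen B or C zones, whereas drawing the whole byte treats every zone equally.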
Now that the principle is clear, the program for randomly generating Chinese characters follows naturally. Below is C# console code that generates four random Chinese characters:

3. Program code

using System;
using System.Text;

namespace ConsoleApplication
{
    class ChineseCode
    {
        public static void Main()
        {
            // Get the gb2312 encoding (code page).
            // Note: on .NET Core / .NET 5+, first call
            // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
            Encoding gb = Encoding.GetEncoding("gb2312");

            // Call the function to generate the codes of four random Chinese characters
            object[] bytes = CreateRegionCode(4);

            // Decode each two-byte array into a Chinese character
            string str1 = gb.GetString((byte[])Convert.ChangeType(bytes[0], typeof(byte[])));
            string str2 = gb.GetString((byte[])Convert.ChangeType(bytes[1], typeof(byte[])));
            string str3 = gb.GetString((byte[])Convert.ChangeType(bytes[2], typeof(byte[])));
            string str4 = gb.GetString((byte[])Convert.ChangeType(bytes[3], typeof(byte[])));

            // Output to the console
            Console.WriteLine(str1 + str2 + str3 + str4);
        }

        /*
        This function randomly creates two-element byte arrays within the Chinese
        character encoding range. Each byte array represents one Chinese character,
        and the byte arrays are stored in an object array.
        Parameter: strlength, the number of Chinese characters to generate
        */
        public static object[] CreateRegionCode(int strlength)
        {
            // Define a string array holding the hexadecimal digits that make up a character code
            string[] rBase = new string[16] { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f" };

            Random rnd = new Random();

            // Define an object array
            object[] bytes = new object[strlength];

            /* Each loop iteration generates one two-element byte array and puts it into the object array.
               Each Chinese character consists of four location code digits.
               The first element of the byte array holds digits 1 and 2.
               The second element of the byte array holds digits 3 and 4.
            */
            for (int i = 0; i < strlength; i++)
            {
                // Location code digit 1: b, c, or d
                int r1 = rnd.Next(11, 14);
                string str_r1 = rBase[r1].Trim();

                // Location code digit 2
                rnd = new Random(r1 * unchecked((int)DateTime.Now.Ticks) + i); // re-seed to avoid repeated values
                int r2;
                if (r1 == 13)
                {
                    r2 = rnd.Next(0, 7);
                }
                else
                {
                    r2 = rnd.Next(0, 16);
                }
                string str_r2 = rBase[r2].Trim();

                // Location code digit 3: a to f
                rnd = new Random(r2 * unchecked((int)DateTime.Now.Ticks) + i);
                int r3 = rnd.Next(10, 16);
                string str_r3 = rBase[r3].Trim();

                // Location code digit 4
                rnd = new Random(r3 * unchecked((int)DateTime.Now.Ticks) + i);
                int r4;
                if (r3 == 10)
                {
                    r4 = rnd.Next(1, 16);
                }
                else if (r3 == 15)
                {
                    r4 = rnd.Next(0, 15);
                }
                else
                {
                    r4 = rnd.Next(0, 16);
                }
                string str_r4 = rBase[r4].Trim();

                // Define two byte variables to store the generated random location code
                byte byte1 = Convert.ToByte(str_r1 + str_r2, 16);
                byte byte2 = Convert.ToByte(str_r3 + str_r4, 16);
                // Store the two byte variables in a byte array
                byte[] str_r = new byte[] { byte1, byte2 };

                // Put the byte array of the generated Chinese character into the object array
                bytes.SetValue(str_r, i);
            }

            return bytes;
        }
    }
}

Once random Chinese characters can be generated, you can use .NET GDI+ to draw whatever verification code image you need. Many articles already cover how to generate verification code images and how to vary the size and other effects of the characters. Note, however, that the code above only runs on Chinese versions of Windows, which include the GB character set. On an operating system in another language, you need to install the GB character set first.
