The basic principle of generating random Chinese character verification codes with C#

Source: Internet
Author: User

A few days ago, while applying for a free QQ number, I noticed that the verification code on the application form had switched to Chinese characters. I was both surprised and amused, and users on Mop were loudly scolding Tencent for using Chinese verification codes. ^_^
I have to admire Tencent for adopting Chinese verification codes to fight the automatic QQ registration bots that are currently rampant online. Thinking it over, generating a random Chinese verification code in a program is not difficult, so here is how to generate random Chinese characters in C#.


1. Encoding principle
How exactly are random Chinese characters generated, and where do the characters come from? One option is a back-end data table that stores all the required characters, from which the program picks a few at random. That works, but there are so many Chinese characters that building such a table is a chore. In fact, the program can do everything without any back-end database. To see how, we first need to understand how Chinese characters are encoded.
In 1980, to give every Chinese character a unified national code, China promulgated its first national encoding standard for Chinese characters: GB2312-80, "Code of Chinese Graphic Character Set for Information Interchange, Primary Set", known as GB2312. This character set is the foundation of Chinese-language processing technology in China and the common standard for Chinese character systems there. Later came the national standard GB18030-2000, "Information Technology: Chinese Ideograms Coded Character Set for Information Interchange, Extension for the Basic Set", known as GB18030. Anyone whose programming involves encoding and localization should be familiar with GB18030. It is the most important encoding standard after GB2312-1980 and GB13000-1993, and one of the basic standards that computer systems in China must follow.
On Simplified Chinese Windows, the default code page seen by .NET programs is code page 936 (GBK), a superset of GB2312 that GB18030 in turn extends. For generating a Chinese verification code, however, the GB2312 character set is more than enough. Besides the characters everyone knows, the set also contains many characters that we do not recognize and rarely see. If a verification code contained many unfamiliar characters, that would be bad news for users of Pinyin input methods (Wubi users could at least type them from their shape). So we do not use every character in the GB2312 character set.
Chinese characters can be represented by location codes; see:
Chinese characters can be expressed by using location codes, see

Chinese Character Location Code table http://navicy2005.home4u.china.com/resource/gb2312tbl.htm
Chinese character Location Code Code table http://navicy2005.home4u.china.com/resource/gb2312tbm.htm

The two tables describe the same thing: one gives the location code in hexadecimal, the other gives the region and position as decimal numbers. For example, the hexadecimal location code of 好 ("good") is BA C3: the first two digits are the region and the last two are the position. BA is region 26, and C3 is position 35 within that region, so the numeric location code of 好 is 2635. This is how GB2312 locates a Chinese character. The location code table shows that the regions up to region 15 (AF) contain no Chinese characters, only a small number of symbols. Chinese characters start at region 16 (B0), which is why the Chinese characters of GB2312 begin at region 16.
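The mapping from hexadecimal bytes to the decimal location code can be sketched as follows. The article does not spell out the arithmetic; this sketch relies on the fact that each GB2312 byte is the region or position number plus 0xA0 (so 0xBA - 0xA0 = 26 and 0xC3 - 0xA0 = 35). The class and method names are hypothetical, chosen only for illustration.

```csharp
using System;

// Hypothetical helper: converts the two GB2312 bytes of a character
// into its four-digit decimal location code.
public static class LocationCode
{
    // Each GB2312 byte equals the region/position number plus 0xA0.
    private const int Offset = 0xA0;

    public static string FromBytes(byte first, byte second)
    {
        int region = first - Offset;    // region number, e.g. 0xBA -> 26
        int position = second - Offset; // position number, e.g. 0xC3 -> 35
        return region.ToString("D2") + position.ToString("D2");
    }

    public static void Main()
    {
        // BA C3 is the hexadecimal location code of the example character.
        Console.WriteLine(FromBytes(0xBA, 0xC3)); // prints 2635
    }
}
```

The same subtraction shows why Chinese characters start at B0: 0xB0 - 0xA0 = 16.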

2. How .NET handles encodings
In .NET, the System.Text namespace handles the encodings of all languages. It contains a number of classes for working with and converting between encodings, the central one being the Encoding class. The .NET documentation for Encoding shows that encoded text is always a byte array, and two methods are especially useful:

The Encoding.GetBytes() method encodes all or part of the specified string or character array into a byte array.
The Encoding.GetString() method decodes the specified byte array into a string.


With these two methods we can encode Chinese characters into a byte array and, given the GB2312 bytes of a character, decode them back into the character. Encoding the character 好 as a byte array:

Encoding gb = System.Text.Encoding.GetEncoding("gb2312");
byte[] bytes = gb.GetBytes("好");


The result is a byte array of length 2. Converting each element to hexadecimal:

string lowcode = System.Convert.ToString(bytes[0], 16);   // element 1 as two hexadecimal digits
string hightcode = System.Convert.ToString(bytes[1], 16); // element 2 as two hexadecimal digits


Converted to hexadecimal, the contents of the byte array turn out to be {ba, c3}, which is exactly the hexadecimal location code of 好 (see the location code table).
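The round trip described above can be sketched as a small console program. Two things here are assumptions rather than part of the original article: the program targets modern .NET (.NET Core 3.0 or later), where code-page encodings such as GB2312 have to be registered through CodePagesEncodingProvider before GetEncoding can find them, and the class and method names are made up for illustration. The character 好 is written as the escape \u597D so that the source file encoding does not matter.

```csharp
using System;
using System.Text;

// Hypothetical demo of the GetBytes / GetString round trip with GB2312.
public static class RoundTripDemo
{
    public static Encoding GetGb2312()
    {
        // Assumption: modern .NET, where code-page encodings are opt-in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("gb2312");
    }

    public static void Main()
    {
        Encoding gb = GetGb2312();

        // Encode the example character (U+597D) into its GB2312 bytes.
        byte[] bytes = gb.GetBytes("\u597D");
        Console.WriteLine(BitConverter.ToString(bytes)); // prints BA-C3

        // Decode the same two bytes back into a string.
        string decoded = gb.GetString(new byte[] { 0xBA, 0xC3 });
        Console.WriteLine(decoded == "\u597D"); // prints True
    }
}
```

The comparison is printed instead of the character itself because a console that is not set to a Chinese-capable code page may not display it correctly.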
So we can randomly generate a two-byte array of hexadecimal values and decode it with GetString() to obtain a Chinese character. For a verification code, however, some ranges have to be excluded: the regions before region 16 (below B0) contain no Chinese characters, only a small number of symbols, and the characters from around D7 onward are rarely seen. The first hexadecimal digit of the location code is therefore limited to B, C, or D, and if the first digit is D, the second digit must not be greater than 7. The location code table also shows that the first and last positions of each region are empty, so if the third digit is A, the fourth digit cannot be 0, and if the third digit is F, the fourth digit cannot be F.
Now that the principle is clear, the program follows directly. Below is C# console code that generates 4 random Chinese characters:
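The digit rules in this paragraph can be written down as a small validity check. This is only a sketch of the rules as stated in the text, not code from the original article; the class and method names are hypothetical. The extra requirement that the third digit lies in A to F is an assumption taken from the fact that position bytes run from A1 to FE.

```csharp
using System;

// Hypothetical checker for the digit rules described in the text.
public static class CodeRange
{
    public static bool IsAllowed(byte first, byte second)
    {
        int d1 = first >> 4;     // 1st hexadecimal digit
        int d2 = first & 0x0F;   // 2nd hexadecimal digit
        int d3 = second >> 4;    // 3rd hexadecimal digit
        int d4 = second & 0x0F;  // 4th hexadecimal digit

        if (d1 < 0xB || d1 > 0xD) return false;       // 1st digit must be B, C, or D
        if (d1 == 0xD && d2 > 7) return false;        // after D, 2nd digit at most 7
        if (d3 < 0xA) return false;                   // assumption: 3rd digit is A..F
        if (d3 == 0xA && d4 == 0x0) return false;     // A0 is an empty position
        if (d3 == 0xF && d4 == 0xF) return false;     // FF is an empty position
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(IsAllowed(0xBA, 0xC3)); // prints True
        Console.WriteLine(IsAllowed(0xB0, 0xA0)); // prints False
        Console.WriteLine(IsAllowed(0xD8, 0xA1)); // prints False
    }
}
```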


3. Program code:

using System;
using System.Text;

namespace ConsoleApplication
{
    class ChineseCode
    {
        public static void Main()
        {
            // Get the GB2312 encoding (code page)
            Encoding gb = Encoding.GetEncoding("gb2312");

            // Call the function to generate 4 random Chinese character codes
            object[] bytes = CreateRegionCode(4);

            // Decode the Chinese characters from the encoded byte arrays
            string str1 = gb.GetString((byte[])Convert.ChangeType(bytes[0], typeof(byte[])));
            string str2 = gb.GetString((byte[])Convert.ChangeType(bytes[1], typeof(byte[])));
            string str3 = gb.GetString((byte[])Convert.ChangeType(bytes[2], typeof(byte[])));
            string str4 = gb.GetString((byte[])Convert.ChangeType(bytes[3], typeof(byte[])));

            // Output to the console
            Console.WriteLine(str1 + str2 + str3 + str4);
        }


        /*
        This function randomly creates two-element hexadecimal byte arrays within the
        allowed location code range. Each byte array represents one Chinese character,
        and the byte arrays are stored in an object array.
        Parameter: strlength, the number of Chinese characters to generate
        */
        public static object[] CreateRegionCode(int strlength)
        {
            // Define a string array holding the hexadecimal digits that make up a code
            string[] rBase = new string[16] { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F" };

            Random rnd = new Random();

            // Define an object array to hold the generated byte arrays
            object[] bytes = new object[strlength];

            /* Each iteration produces a two-element hexadecimal byte array and puts it
            into the object array.
            Each Chinese character consists of four location code digits:
            digits 1 and 2 form the first element of the byte array,
            digits 3 and 4 form the second element of the byte array.
            */
            for (int i = 0; i < strlength; i++)
            {
                // Location code digit 1: B, C, or D
                int r1 = rnd.Next(11, 14);
                string str_r1 = rBase[r1].Trim();

                // Location code digit 2
                // Re-seed the random number generator to avoid duplicate values
                rnd = new Random(r1 * unchecked((int)DateTime.Now.Ticks) + i);
                int r2;
                if (r1 == 13)
                {
                    // The upper bound is exclusive, so after D this yields 0..6
                    r2 = rnd.Next(0, 7);
                }
                else
                {
                    r2 = rnd.Next(0, 16);
                }
                string str_r2 = rBase[r2].Trim();

                // Location code digit 3: A..F
                rnd = new Random(r2 * unchecked((int)DateTime.Now.Ticks) + i);
                int r3 = rnd.Next(10, 16);
                string str_r3 = rBase[r3].Trim();

                // Location code digit 4
                rnd = new Random(r3 * unchecked((int)DateTime.Now.Ticks) + i);
                int r4;
                if (r3 == 10)
                {
                    // After A, exclude 0 (A0 is empty)
                    r4 = rnd.Next(1, 16);
                }
                else if (r3 == 15)
                {
                    // After F, exclude F (FF is empty)
                    r4 = rnd.Next(0, 15);
                }
                else
                {
                    r4 = rnd.Next(0, 16);
                }
                string str_r4 = rBase[r4].Trim();

                // Store the random location code in two byte variables
                byte byte1 = Convert.ToByte(str_r1 + str_r2, 16);
                byte byte2 = Convert.ToByte(str_r3 + str_r4, 16);
                // Store the two byte variables in a byte array
                byte[] str_r = new byte[] { byte1, byte2 };

                // Put the byte array of the generated character into the object array
                bytes.SetValue(str_r, i);
            }

            return bytes;
        }
    }
}
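As a variation on the program above (not the article's own method), the same ranges can be drawn directly as byte values from a single Random instance instead of assembling hexadecimal digit strings and re-seeding the generator. The sketch below picks the first byte from B0 to D6 and the second from A1 to FE, which matches the ranges the program above can produce, although the probability of each character differs slightly. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical alternative generator: draws the two GB2312 bytes directly.
public static class SimpleRegionCode
{
    // One shared generator; no re-seeding between draws.
    private static readonly Random Rng = new Random();

    public static byte[][] Create(int count)
    {
        var result = new byte[count][];
        for (int i = 0; i < count; i++)
        {
            // Upper bounds are exclusive: first byte B0..D6, second byte A1..FE.
            byte first = (byte)Rng.Next(0xB0, 0xD7);
            byte second = (byte)Rng.Next(0xA1, 0xFF);
            result[i] = new[] { first, second };
        }
        return result;
    }

    public static void Main()
    {
        // Print the four generated codes in hexadecimal, one per line.
        foreach (byte[] pair in Create(4))
        {
            Console.WriteLine(BitConverter.ToString(pair));
        }
    }
}
```

Each two-byte array can then be passed to the GetString() method of the GB2312 encoding, as in the program above, to obtain the character.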


Once random Chinese characters can be generated, .NET GDI+ can be used to draw the verification code image you need. How to generate the image, and how to vary the width and height of the characters, is covered by many articles online and is not repeated here for reasons of length. One thing to note: the code above runs on Chinese-language Windows because the GB character set is available there. On an operating system in another language, you need to install the GB character set.



