Character encoding in C#

Source: Internet
Author: User

The file salary.txt is encoded as GB18030, and each line is 23 bytes wide. Columns 1-8 hold the employee name and columns 10-23 hold the salary. We need to write a C# program that calculates the average salary of the unit's employees, as shown below:

 1 using System;
 2 using System.IO;
 3 using System.Text;
 4
 5 namespace Skyiv.Ben.Test
 6 {
 7   sealed class Avg
 8   {
 9     static void Main()
10     {
11       try
12       {
13         Encoding encode = Encoding.GetEncoding("GB18030");
14         using (StreamReader sr = new StreamReader("salary.txt", encode))
15         {
16           decimal avg = 0;
17           long rows = 0;
18           for (; ; rows++)
19           {
20             string s = sr.ReadLine();
21             if (s == null) break;
22             decimal salary = Convert.ToDecimal(s.Substring(9, 14));
23             avg += salary;
24           }
25           avg /= rows;
26           Console.WriteLine(avg.ToString("N2"));
27         }
28       }
29       catch (Exception ex)
30       {
31         Console.WriteLine("Error: " + ex.Message);
32       }
33     }
34   }
35 }

The running result is as follows:
Error: Index and length must refer to a location within the string.
Parameter name: length
A brief analysis (or a session in the debugger) shows that the error is on line 22 of the program:
decimal salary = Convert.ToDecimal(s.Substring(9, 14));
It turns out that the characters in a C# string are Unicode, so each full-width Chinese character counts as only one character. The first line of salary.txt contains 20 characters, the second line 21, and the third line 19; none of them reaches 23 characters, so Substring(9, 14) throws an exception. In fact, you only need to change this line to the following statement:
decimal salary = Convert.ToDecimal(encode.GetString(encode.GetBytes(s), 9, 14));
After recompiling and rerunning, the program prints the correct result: 329,218,792.83.
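The mismatch between character count and byte count can be checked directly. The following is a minimal sketch, not part of the original program: the row is hypothetical sample data (a three-character Chinese name written with \u escapes), the class and method names are made up for illustration, and the RegisterProvider call assumes a modern .NET runtime, where GB18030 is not available until the code-pages provider is registered.

```csharp
using System;
using System.Text;

static class WidthDemo
{
    // Hypothetical sample row: a 3-character Chinese name (6 bytes in GB18030),
    // 2 padding spaces, 1 separator space, and a 14-column salary field.
    public static string SampleRow()
    {
        return "\u5F20\u5C0F\u660E" + "  " + " " + "0.01".PadLeft(14);
    }

    // Slices the salary field by byte offset, as the corrected line 22 does.
    public static string SalaryField(string row, Encoding enc)
    {
        return enc.GetString(enc.GetBytes(row), 9, 14);
    }

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where this call is needed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gb = Encoding.GetEncoding("GB18030");

        string row = SampleRow();
        Console.WriteLine(row.Length);                  // 20 characters
        Console.WriteLine(gb.GetByteCount(row));        // 23 bytes
        Console.WriteLine(SalaryField(row, gb).Trim()); // 0.01
    }
}
```

Because the row is only 20 characters long, s.Substring(9, 14) would need characters up to index 22 and fails, while the byte-based slice stays inside the 23-byte record.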
In fact, a better way is to replace lines 13-27 of the program with the following statements:

const int bytesPerRow = 23 + 2; // 23 data bytes plus, presumably, the CR LF line ending
Encoding encode = Encoding.GetEncoding("GB18030");
using (BinaryReader br = new BinaryReader(new FileStream("salary.txt", FileMode.Open)))
{
  if (br.BaseStream.Length % bytesPerRow != 0) throw new Exception("incorrect file length");
  decimal avg = 0;
  long rows = br.BaseStream.Length / bytesPerRow;
  for (long i = 0; i < rows; i++)
  {
    byte[] bs = br.ReadBytes(bytesPerRow);
    decimal salary = Convert.ToDecimal(encode.GetString(bs, 9, 14));
    avg += salary;
  }
  avg /= rows;
  Console.WriteLine(avg.ToString("N2"));
}

 

Now suppose that our task is to generate salary.txt. Does the following program work?

 1 using System;
 2 using System.IO;
 3 using System.Text;
 4
 5 namespace Skyiv.Ben.Test
 6 {
 7   sealed class Salary
 8   {
 9     static void Main()
10     {
11       try
12       {
13         Encoding encode = Encoding.GetEncoding("GB18030");
14         string[] names = { "Li Fugui", "Rong Yun", "Ouyang Qixue" }; // romanized here; Chinese names of 3, 2, and 4 full-width characters in the original
15         decimal[] salarys = { 0.01m, 2057.38m, 987654321.09m };
16         using (StreamWriter sw = new StreamWriter("salary.txt", false, encode))
17         {
18           for (int i = 0; i < names.Length; i++)
19             sw.WriteLine("{0,-8} {1,14:N2}", names[i], salarys[i]);
20         }
21       }
22       catch (Exception ex)
23       {
24         Console.WriteLine("Error: " + ex.Message);
25       }
26     }
27   }
28 }

The output shows that the lines of the generated file have different widths, because {0,-8} pads each name to 8 characters rather than 8 bytes. What should we do? You only need to change line 19 of the program to:
sw.WriteLine("{0} {1,14:N2}", encode.GetString(encode.GetBytes(names[i].PadRight(8)), 0, 8), salarys[i]);
That's all.
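The one-line fix slices the encoded bytes at a fixed offset, which works for the names in this example because each fits within 8 bytes. For a longer name, a cut at byte 8 could fall in the middle of a double-byte character. As a hedged alternative, the sketch below trims whole characters before padding; PadToByteWidth and the sample names are hypothetical and not part of the original program, and the RegisterProvider call again assumes a modern .NET runtime.

```csharp
using System;
using System.Text;

static class FixedWidth
{
    // Returns text trimmed and space-padded so that its encoded form
    // occupies exactly `width` bytes, never splitting a character.
    public static string PadToByteWidth(string text, int width, Encoding enc)
    {
        while (enc.GetByteCount(text) > width)
        {
            // Drop one character from the end (two code units for a surrogate pair).
            bool pair = text.Length >= 2 && char.IsSurrogatePair(text, text.Length - 2);
            text = text.Substring(0, text.Length - (pair ? 2 : 1));
        }
        // An ASCII space is one byte in GB18030, so pad by the missing byte count.
        return text + new string(' ', width - enc.GetByteCount(text));
    }

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where this call is needed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gb = Encoding.GetEncoding("GB18030");

        string three = "\u5F20\u5C0F\u660E";             // 3 characters, 6 bytes
        string five = "\u5F20\u5C0F\u660E\u5F20\u5C0F";  // 5 characters, 10 bytes

        Console.WriteLine(gb.GetByteCount(PadToByteWidth(three, 8, gb))); // 8
        Console.WriteLine(PadToByteWidth(five, 8, gb).Length);            // 4
    }
}
```

In the generator, the name argument on line 19 could then be written as PadToByteWidth(names[i], 8, encode), with the format string "{0} {1,14:N2}" left unchanged.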

Now suppose that salary.txt is encoded as UTF-16 instead. Is it enough to change the line
Encoding encode = Encoding.GetEncoding("GB18030");
in the programs to:
Encoding encode = Encoding.Unicode;
and nothing else? This question is left for the reader to think about.
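As a starting point for that exercise, the sketch below measures the same rows under both encodings. It is illustrative only: the rows are hypothetical sample data built with \u escapes, the class name is made up, and the RegisterProvider call assumes a modern .NET runtime.

```csharp
using System;
using System.Text;

static class Utf16Demo
{
    // Two hypothetical rows, each laid out to be 23 bytes wide in GB18030.
    public static string[] SampleRows()
    {
        return new[]
        {
            "\u5F20\u5C0F\u660E" + "  "   + " " + "0.01".PadLeft(14),
            "\u674E\u660E"       + "    " + " " + "2,057.38".PadLeft(14),
        };
    }

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where this call is needed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gb = Encoding.GetEncoding("GB18030");

        foreach (string row in SampleRows())
        {
            Console.WriteLine("{0} chars, {1} bytes in GB18030, {2} bytes in UTF-16",
                row.Length, gb.GetByteCount(row), Encoding.Unicode.GetByteCount(row));
        }
    }
}
```

Under these assumptions the two rows have the same byte width in GB18030 (23) but different byte widths in UTF-16 (40 and 42), and byte offset 9 falls in the middle of a two-byte UTF-16 code unit. That suggests the byte-offset logic would need to be revisited, not just the encoding line.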

Imagine that, in the near future, every operating system uses UTF-16 for character encoding, and that full-width and half-width characters have the same width on screen and in print (in a proportional font, even the half-width characters A and I have different widths). In that case the distinction between full-width and half-width would no longer be needed: every character would be treated the same, just as programmers in English-speaking countries never face this problem, since for them the concept of a full-width character does not exist.

 
