A console application is a good choice for writing simple programs.
However, on Windows the console does not display UTF-8 text by default. Below are the methods I tried, some found online and some my own:
First
On a Traditional Chinese operating system, chcp reports code page 950.
A C# console program cannot display Simplified Chinese characters with Console.WriteLine (although Simplified Chinese characters can be entered through Console.ReadLine).
I searched online and found a solution to the problem:
A. Change the current code page to 65001 (UTF-8) with chcp 65001.
B. Set the console font.
C. The display becomes garbled at this point, but the program still runs normally.
D. Minimize the cmd window and restore it; the display then returns to normal.
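The reason Simplified characters fail under code page 950 while 65001 works can be illustrated outside the console. The following is a minimal sketch in Python (not the C# program discussed here); the helper name encodable and the sample characters are my own choices for illustration.

```python
# Check which code pages can represent a given character.
# "encodable" is a hypothetical helper written for this illustration.
def encodable(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

simplified = "语"   # Simplified form, U+8BED
traditional = "語"  # Traditional form, U+8A9E

# cp950 = Traditional Chinese, cp936 = Simplified Chinese, utf-8 = code page 65001
for codec in ("cp950", "cp936", "utf-8"):
    print(codec, encodable(simplified, codec), encodable(traditional, codec))
```

Code page 950 has no slot for the Simplified character, so a program limited to that code page cannot emit it, whereas UTF-8 covers both forms.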
Second
On an English operating system, chcp reports 437, and chcp 950 fails with an error (reason unknown).
After chcp 65001 the font can no longer be changed, and some characters are displayed as squares.
1. So I had to change the code page to 950 in the registry first.
2. Open cmd again; chcp now reports 950.
3. Then change the font. At this point Chinese characters can be displayed.
4. Then run chcp 65001 (remember to change the console font first, and only then run chcp).
In this way, a console program can output both Simplified and Traditional Chinese characters.
That is all for the setup.
Next, let's look at a cmd window left at code page 950, with nothing modified.
The dir command can display Simplified and Traditional Chinese file names at the same time.
The type command can also display text files containing Simplified Chinese characters. However, the file must be saved in the format Windows calls Unicode (that is, UTF-16); a file saved as UTF-8 is displayed as garbled characters.
This means the console can output Unicode characters, both Simplified and Traditional Chinese, without any settings being changed.
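The garbling of a UTF-8 file can be reproduced away from the console. This sketch only simulates the effect of reading UTF-8 bytes with code page 950; it does not claim to show what the type command does internally, and the sample text is my own.

```python
text = "简体中文"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")  # Python prepends a byte order mark here

# Reading UTF-8 bytes as if they were code page 950 produces other characters
misread = utf8_bytes.decode("cp950", errors="replace")

# Reading UTF-16 bytes as UTF-16 recovers the text exactly
recovered = utf16_bytes.decode("utf-16")

print(misread == text)    # False
print(recovered == text)  # True
```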
How do the type and dir commands write to standard output?
That is still an open question.
Something to study later: whether C#'s Console.WriteLine can do the same.
The default character set of Windows works as follows:
1. Windows has a default character set, also called the internal code page. Windows calls it the ANSI character set.
The ANSI character set varies with the language of the operating system.
For example
936 (gb2312): Simplified Chinese operating system
950 (big5): Traditional Chinese operating system
437: English operating system (this is the OEM code page that chcp reports; the ANSI code page there is 1252)
These character sets are compatible with ASCII.
In the Chinese code pages, the extended part uses two bytes per character.
Therefore they are not compatible with Unicode or UTF-8.
The former (as UTF-16) usually takes two bytes per character, and the latter may take three.
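The byte counts mentioned above can be checked directly. This is a small sketch using Python's built-in codecs; the function name byte_lengths is made up for this illustration, and utf-16-le stands in for what Windows calls Unicode.

```python
# Count how many bytes one character occupies in each encoding.
def byte_lengths(ch):
    codecs_to_try = ("cp936", "cp950", "utf-16-le", "utf-8")
    return {codec: len(ch.encode(codec)) for codec in codecs_to_try}

print(byte_lengths("A"))   # ASCII letter: one byte, except two in UTF-16
print(byte_lengths("中"))  # Chinese character: two bytes, except three in UTF-8
```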
In Windows, the default character set is set in
Control Panel --> Regional and Language Options --> Advanced --> Language for non-Unicode programs (restart the computer after changing this option)
Why is this setting still needed when Unicode covers every language?
It may be because:
Text files in gb2312 and big5 formats cannot be told apart from their bytes alone.
Therefore, when Windows encounters such a file, it needs a setting that says which character set to read it with. This is the default internal code page.
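The ambiguity between gb2312 and big5 files can be seen by decoding the same bytes both ways. A minimal sketch, with sample text of my own choosing:

```python
# The same four bytes are a plausible file in either encoding.
raw = "中文".encode("gb2312")

as_gb = raw.decode("gb2312")                    # the intended reading
as_big5 = raw.decode("big5", errors="replace")  # the reading under a Big5 default

print(raw.hex())
print(as_gb == "中文")    # True
print(as_big5 == "中文")  # False: different characters come out
```

Nothing in the bytes themselves says which reading is intended, which is why a default code page has to be configured.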
Of course, an HTML parser can use the charset declared in the meta tag instead of the default character set.
The first line of an XML file also declares its charset.
In addition, Unicode (including big endian) and UTF-8 files can carry a marker: a few special bytes at the beginning of the text file (the byte order mark) that identify the file's encoding.
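Checking for those leading bytes might look like the sketch below. The function name sniff_bom is hypothetical; the BOM constants come from Python's standard codecs module. UTF-32 is left out on purpose, since its little-endian mark begins with the same two bytes as the UTF-16 one.

```python
import codecs

# Byte order marks for the three formats mentioned in the text
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def sniff_bom(data):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(b"\xef\xbb\xbfhello"))     # utf-8
print(sniff_bom("中文".encode("gb2312")))  # None: code page files carry no marker
```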
2. Internally, Windows uses Unicode.
For example, a folder name can be in Simplified or Traditional Chinese.
The same goes for programs you write: a string literal in the program is stored in the EXE as Unicode.
UCS = Universal Character Set = Unicode
UCS-2: 16-bit Unicode, 2 bytes per character, 65536 characters
UCS-4: 32-bit Unicode, 4 bytes per character, the full Unicode range
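The 65536-character limit of a 2-byte form can be illustrated as follows. Python has no strict UCS-2 codec, so this sketch uses utf-16-le and utf-32-le as stand-ins (an assumption on my part); UTF-16 spends four bytes on any character that does not fit in two. The function name units is made up for this example.

```python
bmp_char = "中"              # U+4E2D, within the first 65536 code points
astral_char = "\U0001F600"   # U+1F600, beyond them

def units(ch):
    """Return (bytes in UTF-16, bytes in UTF-32) for one character."""
    return len(ch.encode("utf-16-le")), len(ch.encode("utf-32-le"))

print(units(bmp_char))     # (2, 4)
print(units(astral_char))  # (4, 4)
print(ord(astral_char) > 0xFFFF)  # True: it cannot be a single 2-byte value
```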
Addendum, 2010.10.20
If a console program written in C# is to run as a Windows scheduled task, the following key values need to be set