Concept of CMD code page and how to set the value

Source: Internet
Author: User

chcp
Displays the number of the active console code page, or changes the active console code page. Used without parameters, chcp displays the number of the active console code page.
Syntax
chcp [nnn]
Parameters
nnn: Specifies the code page. The following table lists each supported code page and its country/region or language:
Code page   Country/region or language
437         United States
850         Multilingual (Latin I)
852         Slavic (Latin II)
855         Cyrillic (Russian)
857         Turkish
860         Portuguese
861         Icelandic
863         Canadian-French
865         Nordic
866         Russian
869         Modern Greek
What a code page is and how to change the code page in Windows cmd

If cmd cannot display Chinese or other characters correctly, use chcp to change the active code page. The nnn parameter is the code page number, which is not always three digits. The code page for Simplified Chinese is 936; for Spanish it is 1252.
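
As a minimal sketch of the usage above: typing chcp alone prints the active code page, and chcp 936 switches the console to Simplified Chinese. The Python helpers below are hypothetical names introduced only for illustration. Because the wording of chcp's message depends on the system language, the parser reads only the trailing digits; the second helper assumes it is run on Windows, where cmd is available.

```python
import re
import subprocess
import sys


def parse_chcp_output(text):
    """Extract the code page number from chcp's output.

    The message wording is localized, so only the last run of digits is used.
    """
    match = re.search(r"(\d+)\D*$", text.strip())
    if match is None:
        raise ValueError(f"no code page number found in {text!r}")
    return int(match.group(1))


def active_code_page():
    """Run chcp through cmd and return the active code page (Windows only)."""
    if sys.platform != "win32":
        raise OSError("chcp is only available in Windows cmd")
    result = subprocess.run(
        ["cmd", "/c", "chcp"], capture_output=True, text=True, check=True
    )
    return parse_chcp_output(result.stdout)


print(parse_chcp_output("Active code page: 437"))  # 437
```

On a Windows machine, active_code_page() would return, for example, 936 after chcp 936 has been run in the same console.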

Code page history:

1. Code page definition and history
Character code refers to the internal code used to represent characters. The internal code is what is actually used when documents are entered and stored. Internal codes are divided into:

Single-byte character sets (SBCS), which can encode up to 256 characters.
Double-byte character sets (DBCS), which can encode up to about 65,000 characters. They are mainly used for East Asian scripts with large character sets.
A code page is a selected list of character internal codes in a specific order. For the earlier single-byte languages, the order of the internal codes in the code page lets the system map a keyboard input value to an internal code according to this list. For double-byte internal codes, the code page provides a mapping table between the multibyte encoding and Unicode, so that characters stored as Unicode can be converted into the corresponding internal code and vice versa. In Linux, the corresponding functions are utf8_mbtowc and utf8_wctomb.
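
The single-byte/double-byte distinction and the conversion through Unicode can be sketched with Python's built-in codecs, which ship mapping tables for many code pages. This is only an illustration: the helper names are made up here, and "cp437" and "cp936" are Python's codec names for code pages 437 and 936.

```python
def encoded_length(char, code_page):
    """Number of bytes one character occupies in the given code page."""
    return len(char.encode(code_page))


def round_trip(raw, code_page):
    """Convert code page bytes to Unicode text and back again."""
    text = raw.decode(code_page)          # internal code -> Unicode
    return text, text.encode(code_page)   # Unicode -> internal code


print(encoded_length("é", "cp437"))   # 1 (single-byte code page)
print(encoded_length("中", "cp936"))  # 2 (double-byte character)

text, raw_again = round_trip(b"\xd6\xd0", "cp936")
print(text, raw_again == b"\xd6\xd0")  # 中 True
```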
Before 1980, there were no international standards such as ISO-8859 or Unicode defining how to extend US-ASCII for non-English-speaking users. Many IT vendors invented their own encodings and identified them with hard-to-remember numbers:

For example, 936 represents Simplified Chinese and 950 represents Traditional Chinese.
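
Why the number matters can be shown with a small sketch: the same bytes produce different text depending on which code page is used to interpret them. The function name is hypothetical, and errors="replace" is used so that byte sequences that are invalid in one code page show up as replacement characters instead of raising an exception.

```python
def decode_with(raw, code_pages):
    """Decode the same bytes under several code pages for comparison."""
    return {cp: raw.decode(cp, errors="replace") for cp in code_pages}


raw = "中文".encode("cp936")  # bytes as written under code page 936
results = decode_with(raw, ["cp936", "cp950"])

print(results["cp936"])                      # 中文
print(results["cp950"] == results["cp936"])  # False
```

Text written under 936 and read under 950 is therefore garbled, which is the usual symptom that the console code page needs to be changed.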

1.1 CJK code pages
Unlike Extended UNIX Code (EUC) encodings, the code pages below use the C1 control codes {0x80..0x9F} as the first byte and the ASCII values {0x40..0x7E} as the second byte, so that they can hold up to tens of thousands of double-byte characters. This means that a byte value in the ASCII range at or above 0x40 does not necessarily represent an ASCII character in these encodings.
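
A short sketch of the consequence: in these code pages the second byte of a double-byte character can be a value that would be an ordinary ASCII character on its own. The helper below is a hypothetical name; it uses Python's "cp932" codec and reports characters whose second byte falls in 0x40..0x7E.

```python
def ascii_range_trail_bytes(text, code_page):
    """Return (char, second_byte) pairs whose second byte is in 0x40..0x7E."""
    hits = []
    for char in text:
        raw = char.encode(code_page)
        if len(raw) == 2 and 0x40 <= raw[1] <= 0x7E:
            hits.append((char, raw[1]))
    return hits


for char, trail in ascii_range_trail_bytes("ソ表A", "cp932"):
    # 0x5C is the ASCII backslash, yet here it is half of a character
    print(char, hex(trail), repr(chr(trail)))
```

Software that scans such text byte by byte for a backslash or another ASCII character can therefore split a double-byte character in the middle.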

Cp932

Shift-JIS contains the Japanese character sets JIS X 0201 (one byte per character) and JIS X 0208 (two bytes per character). The JIS X 0201 katakana are single-byte half-width characters; the remaining 60 byte values are used as the first byte of 7076 kanji and 648 other full-width characters. Unlike EUC-JP, Shift-JIS does not contain the 5802 kanji defined in JIS X 0212.
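
The mix of one-byte and two-byte characters can be inspected directly. The following is a sketch using Python's "cp932" codec, with a helper name chosen only for this example: a half-width kana from JIS X 0201 takes one byte, while its full-width form and a kanji from JIS X 0208 take two.

```python
def byte_layout(char, code_page="cp932"):
    """Return the encoded bytes of one character as a list of hex strings."""
    return [f"0x{b:02X}" for b in char.encode(code_page)]


print(byte_layout("ｱ"))   # ['0xB1']          half-width, one byte
print(byte_layout("ア"))  # ['0x83', '0x41']  full-width, two bytes
print(byte_layout("漢"))  # ['0x8A', '0xBF']  kanji, two bytes
```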

Cp936

GBK extends the EUC-CN encoding (GB 2312-80, containing 6763 Chinese characters) to the 20902 Chinese characters defined in Unicode (GB 13000.1-93). It is used for Simplified Chinese (zh_CN) in mainland China.
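
That GBK is an extension of GB 2312 can be checked with a small sketch, assuming Python's "gb2312" and "cp936" codecs as stand-ins for the two character sets (the helper name is hypothetical). A character from GB 2312 keeps the same bytes in code page 936, while a character added by GBK cannot be encoded in GB 2312 at all.

```python
def in_code_set(char, codec):
    """True if the character can be encoded with the given codec."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# A GB 2312 character is encoded identically in both
print("中".encode("gb2312") == "中".encode("cp936"))  # True

# 丂 (U+4E02) was added by GBK and is absent from GB 2312
print(in_code_set("丂", "gb2312"))  # False
print(in_code_set("丂", "cp936"))   # True
print("丂".encode("cp936").hex())   # 8140
```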

Cp949

Unified Hangul Code (UHC) is a superset of the Korean EUC-KR encoding (KS C 5601-1992, containing 2350 Korean syllables and 4888 Chinese characters). It adds 8822 Korean syllables (in C1).

Cp950

The Big5 encoding (13072 Traditional Chinese zh_TW characters), which replaces EUC-TW (CNS 11643-1992) for Traditional Chinese. These definitions can be found in Ken Lunde's CJK.INF or in the Unicode mapping tables.

Note: Microsoft uses the four code pages above, so they must be used when accessing Microsoft file systems.

1.2 IBM Far East language code pages
IBM code pages are divided into SBCS and DBCS:

IBM SBCS code pages

37 (English)*
290 (Japanese)*
833 (Korean)*
836 (Simplified Chinese)*
891 (Korean)
897 (Japanese)
903 (Simplified Chinese)
904 (Traditional Chinese)
IBM DBCS code pages

300 (Japanese)*
301 (Japanese)
834 (Korean)*
835 (Traditional Chinese)*
837 (Simplified Chinese)*
926 (Korean)
927 (Traditional Chinese)
928 (Simplified Chinese)
Combining an SBCS code page with a DBCS code page gives an IBM MBCS code page:

930 (Japanese) (code pages 300 + 290)*
932 (Japanese) (code pages 301 + 897)
933 (Korean) (code pages 834 + 833)*
934 (Korean) (code pages 926 + 891)
938 (Traditional Chinese) (code pages 927 + 904)
936 (Simplified Chinese) (code pages 928 + 903)
5031 (Simplified Chinese) (code pages 837 + 836)*
5033 (Traditional Chinese) (code pages 835 + 37)*
* Indicates that the EBCDIC encoding format is used.
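
What the EBCDIC marker means in practice can be sketched by encoding the same text with an EBCDIC code page and an ASCII-based one. The example assumes Python's "cp037" codec for IBM code page 37 and "cp437" for code page 437; the function name is chosen only for illustration.

```python
def compare_encodings(text, ebcdic="cp037", ascii_based="cp437"):
    """Encode the same text with an EBCDIC and an ASCII-based code page."""
    return text.encode(ebcdic), text.encode(ascii_based)


ebcdic_bytes, ascii_bytes = compare_encodings("Hello")
print(ebcdic_bytes.hex())  # c885939396
print(ascii_bytes.hex())   # 48656c6c6f
```

Even plain English letters get different byte values, so EBCDIC data must be converted before an ASCII-based code page can display it.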

It can be seen that Microsoft's CJK code pages derive from IBM's code pages.
