The concept of the CMD code page and how to set it (DOS/BAT)

Source: Internet
Author: User

chcp
Displays the number of the active console code page, or changes the console's active code page. Used without parameters, chcp displays the number of the active console code page.
Syntax
chcp [nnn]
Parameters
nnn: Specifies the code page. The following table lists each supported code page and its country/region or language:
Code page Country/region or language
437 United States
850 multilingual (Latin I)
852 Slavic (Latin II)
855 Cyrillic (Russian)
857 Turkish
860 Portuguese
861 Icelandic
863 Canadian-French
865 Nordic
866 Russian
869 Modern Greek
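
Many programming environments ship codecs for the code page numbers in the table above. As a minimal sketch (using Python is an assumption of this example, and `codec_for_code_page` is a hypothetical helper name, not part of chcp), the following checks which of these numbers Python's standard library can handle:

```python
import codecs


def codec_for_code_page(number):
    """Return the name of Python's codec for a code page number, or None."""
    try:
        # Python registers most DOS/Windows code pages under the alias "cpNNN".
        return codecs.lookup(f"cp{number}").name
    except LookupError:
        return None


# Code page numbers taken from the chcp table above.
CHCP_TABLE = [437, 850, 852, 855, 857, 860, 861, 863, 865, 866, 869]

for page in CHCP_TABLE:
    print(page, codec_for_code_page(page))
```

A result of `None` would mean that Python has no codec registered under that alias; it says nothing about whether the console itself supports the page.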
What a code page is, and how to change the code page in Windows cmd


If cmd does not display Chinese or other characters properly, use chcp to change the code page; the parameter nnn is the code page number, for example chcp 936. The code page for Simplified Chinese is 936; Western European (Latin 1) is 1252.
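
The effect of a mismatched code page can be reproduced outside the console. The sketch below is an illustration only: it uses Python's `cp936` and `cp437` codecs as stand-ins for what the console does, and the sample string is chosen for this example.

```python
text = "中文"

# Bytes a program emits when it writes Simplified Chinese using code page 936.
raw = text.encode("cp936")
print(raw.hex())  # d6d0cec4

# A console set to code page 437 reads the same bytes as unrelated symbols.
garbled = raw.decode("cp437")
print(garbled)

# A console set to code page 936 (as after `chcp 936`) recovers the text.
restored = raw.decode("cp936")
print(restored)
```

The bytes never change; only the table used to interpret them does, which is why switching the active code page is enough to fix the display.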


History of the code page:

1. Definition and history of the code page
The character internal code (character code) is the code used internally to represent a character. The internal code is used when a document is entered and stored. Internal codes are divided into:

Single-byte internal code: single-byte character sets (SBCS), which can support 256 character encodings.
Double-byte internal code: double-byte character sets (DBCS), which can support about 65,000 character encodings. They are mainly used to encode the large character sets of East Asian languages.
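
The difference between the two kinds of internal code shows up in how many bytes a character occupies. A small sketch, assuming Python's `cp437` codec as the single-byte example and `cp936` as the double-byte example:

```python
# Single-byte code page: every character occupies exactly one byte,
# so at most 2**8 = 256 distinct codes exist.
sbcs_bytes = "A".encode("cp437")

# Double-byte code page: ASCII stays one byte, but a Chinese character
# takes two bytes, so two bytes allow up to 2**16 = 65,536 codes in principle.
dbcs_ascii = "A".encode("cp936")
dbcs_hanzi = "中".encode("cp936")

print(len(sbcs_bytes), len(dbcs_ascii), len(dbcs_hanzi))  # 1 1 2
```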
A code page is a selected list of characters in a particular order. In the early single-byte languages, the order of the code page let the system assign the corresponding internal code to each keyboard input value by following this list. For double-byte internal codes, a mapping table between the multibyte encoding and Unicode is provided so that characters stored as Unicode can be converted to the corresponding character codes, and vice versa. The corresponding functions in the Linux kernel are utf8_mbtowc and utf8_wctomb.
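
Because a code page is only a lookup table, the same byte value maps to different characters under different code pages. The following sketch illustrates this with Python's codecs; the choice of byte 0x80 and of the three code pages is an assumption made for the example.

```python
PAGES = ("cp437", "cp866", "cp1252")
byte = b"\x80"

# Decode the same byte with each code page's table.
decoded = {page: byte.decode(page) for page in PAGES}

for page, ch in decoded.items():
    print(page, ch, f"U+{ord(ch):04X}")

# The table also works in the other direction: Unicode back to the code page.
round_trip = decoded["cp437"].encode("cp437")
```

This two-way lookup is the same kind of conversion the paragraph above describes between multibyte character codes and Unicode.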
Before 1980, there were still no international standards such as ISO-8859 or Unicode to define how to extend US-ASCII for users in non-English-speaking countries. Many IT vendors invented their own encodings and identified them with hard-to-remember numbers:



For example, 936 represents Simplified Chinese and 950 represents Traditional Chinese.



1.1 CJK code pages
Unlike the Extended Unix Code (EUC) encodings, all of the following Far East code pages use the C1 control codes {=80..=9F} as the lead byte and the ASCII values {=40..=7E} as the second byte, so that they can contain up to tens of thousands of double-byte characters. This means that a byte in the ASCII range in these encodings does not necessarily represent an ASCII character.

CP932

Shift-JIS contains the Japanese character sets JIS X 0201 (one byte per character) and JIS X 0208 (two bytes per character). The JIS X 0201 katakana are therefore included as one-byte half-width characters, and the remaining 60 bytes are used as the first byte of 7,076 kanji and 648 other full-width characters. The difference from the EUC-JP encoding is that Shift-JIS does not contain the 5,802 characters defined in JIS X 0212.
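
The lead-byte and second-byte layout described for these code pages can be observed in CP932. The sketch below uses Python's `cp932` codec and the katakana character ソ, both chosen for this illustration, to show a second byte that falls in the ASCII range:

```python
encoded = "ソ".encode("cp932")
lead, trail = encoded
print(f"{lead:#04x} {trail:#04x}")  # 0x83 0x5c

# The second byte 0x5C is the ASCII code of the backslash, so a naive
# byte-level search reports a backslash that the text does not contain.
naive_hit = b"\\" in encoded

# Decoding first and searching the characters gives the right answer.
real_hit = "\\" in encoded.decode("cp932")

print(naive_hit, real_hit)  # True False
```

This is why text in these code pages should be decoded, or scanned with awareness of lead bytes, before searching for ASCII delimiters such as path separators.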

CP936

GBK extends the EUC-CN encoding (GB 2312-80, containing 6,763 Chinese characters) to the 20,902 characters defined in Unicode (GB13000.1-93). It is used for Simplified Chinese (zh_CN) in mainland China.
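
The superset relationship can be checked with Python's `gb2312` and `gbk` codecs. This is a sketch under assumptions: `encodable` is a hypothetical helper, the sample characters are chosen for the example, and the 20,902 characters are taken to be the Unicode range U+4E00..U+9FA5.

```python
def encodable(ch, codec):
    """Return True if the character can be encoded with the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# A common simplified character is in both sets, with identical bytes.
same_bytes = "中".encode("gb2312") == "中".encode("gbk")

# A traditional-form character is missing from GB 2312 but present in GBK.
in_gb2312 = encodable("龍", "gb2312")
in_gbk = encodable("龍", "gbk")

# Size of the Unicode range U+4E00..U+9FA5.
URO_COUNT = 0x9FA5 - 0x4E00 + 1

print(same_bytes, in_gb2312, in_gbk, URO_COUNT)
```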

CP949

Unified Hangul Code (UHC) is a superset of the Korean EUC-KR encoding (KS C 5601-1992, containing 2,350 Korean syllables and 4,888 Hanja), adding 8,822 Korean syllables (in C1).

CP950

Big5 encoding (13,072 Traditional Chinese zh_TW characters), a replacement for EUC-TW (CNS 11643-1992). Its mappings are defined in Ken Lunde's CJK.INF or in the Unicode encoding tables.

Note: Microsoft uses the four code pages above, so they are needed to access Microsoft file systems.


1.2 IBM's Far East language code pages
IBM's code pages are divided into two kinds, SBCS and DBCS:

IBM SBCS Codepage


37 (English) *
290 (Japanese) *
833 (Korean) *
836 (Simplified Chinese) *
891 (Korean)
897 (Japanese)
903 (Simplified Chinese)
904 (Traditional Chinese)
IBM DBCS Codepage

300 (Japanese) *
301 (Japanese)
834 (Korean) *
835 (Traditional Chinese) *
837 (Simplified Chinese) *
926 (Korean)
927 (Traditional Chinese)
928 (Simplified Chinese)
Combining an SBCS code page with a DBCS code page yields an IBM MBCS code page:

930 (Japanese) (Codepage 300 plus 290) *
932 (Japanese) (Codepage 301 plus 897)
933 (Korean) (Codepage 834 plus 833) *
934 (Korean) (Codepage 926 plus 891)
938 (Traditional Chinese) (Codepage 927 plus 904)
936 (Simplified Chinese) (Codepage 928 plus 903)
5031 (Simplified Chinese) (Codepage 837 plus 836) *
5033 (Traditional Chinese) (Codepage 835 plus 37) *
* Indicates the use of the EBCDIC encoding format
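
To see what the EBCDIC marker means in practice, the same text can be encoded with an ASCII-based code page and with code page 37 from the list above. This is a minimal sketch that assumes Python's `cp437` and `cp037` codecs; the sample string is arbitrary.

```python
sample = "A1"

# ASCII-based code page: letters and digits keep their ASCII values.
ascii_based = sample.encode("cp437")

# EBCDIC code page 37: the same characters get entirely different bytes.
ebcdic = sample.encode("cp037")

print(ascii_based.hex(), ebcdic.hex())  # 4131 c1f1

# Decoding with the matching code page restores the text in both cases.
restored = ebcdic.decode("cp037")
```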

Thus, Microsoft's CJK code pages originated from IBM's code pages.
