4.2.3 Standard Encodings

Source: Internet
Author: User

Python ships with a number of built-in codecs for coded character sets, some implemented as C functions and some as dictionary-based mapping tables. The following table lists the codecs by name, together with some common aliases and the languages each encoding is typically used for. A codec can often be found under more than one name; for example, utf-8 can also be looked up as utf_8. The CPython implementation differs from other implementations in that it is optimized for a small set of spellings: utf-8, utf8, latin-1, latin1, iso-8859-1, mbcs (Windows only), ascii, utf-16, and utf-32. Other spellings of these names, and other encodings, may be slower. Many of the character sets cover the same languages; they differ in which individual characters they support and in how characters are assigned to code positions.
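Before the table, the alias behaviour described above can be sketched with the standard `codecs.lookup` function. The helper name `canonical_name` and the sample spellings are illustrative assumptions, not part of the original article; the exact string returned for each codec is whatever the interpreter reports.

```python
import codecs


def canonical_name(spelling):
    """Return the name of the codec that `spelling` resolves to.

    `canonical_name` is a hypothetical helper for this example;
    codecs.lookup() is the standard-library call doing the work.
    """
    return codecs.lookup(spelling).name


# Spellings that differ only in case, or in hyphen versus underscore,
# as well as listed aliases such as "U8", resolve to the same codec.
for spelling in ("utf-8", "UTF8", "utf_8", "U8"):
    print(spelling, "->", canonical_name(spelling))

# A name that is neither a codec nor a registered alias raises LookupError.
try:
    codecs.lookup("no-such-codec")
except LookupError as exc:
    print("lookup failed:", exc)
```

All four spellings should print the same codec name, which suggests that the choice of spelling matters for lookup speed in CPython rather than for the result.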

| Codec | Aliases | Languages |
|---|---|---|
| ascii | 646, us-ascii | English |
| big5 | big5-tw, csbig5 | Traditional Chinese |
| big5hkscs | big5-hkscs, hkscs | Traditional Chinese |
| cp037 | IBM037, IBM039 | English |
| cp273 | 273, IBM273, csIBM273 | German (new in version 3.4) |
| cp424 | EBCDIC-CP-HE, IBM424 | Hebrew |
| cp437 | 437, IBM437 | English |
| cp500 | EBCDIC-CP-BE, EBCDIC-CP-CH, IBM500 | Western Europe |
| cp720 | | Arabic |
| cp737 | | Greek |
| cp775 | IBM775 | Baltic languages |
| cp850 | 850, IBM850 | Western Europe |
| cp852 | 852, IBM852 | Central and Eastern Europe |
| cp855 | 855, IBM855 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| cp856 | | Hebrew |
| cp857 | 857, IBM857 | Turkish |
| cp858 | 858, IBM858 | Western Europe |
| cp860 | 860, IBM860 | Portuguese |
| cp861 | 861, CP-IS, IBM861 | Icelandic |
| cp862 | 862, IBM862 | Hebrew |
| cp863 | 863, IBM863 | Canadian |
| cp864 | IBM864 | Arabic |
| cp865 | 865, IBM865 | Danish, Norwegian |
| cp866 | 866, IBM866 | Russian |
| cp869 | 869, CP-GR, IBM869 | Greek |
| cp874 | | Thai |
| cp875 | | Greek |
| cp932 | 932, ms932, mskanji, ms-kanji | Japanese |
| cp949 | 949, ms949, uhc | Korean |
| cp950 | 950, ms950 | Traditional Chinese |
| cp1006 | | Urdu |
| cp1026 | ibm1026 | Turkish |
| cp1125 | 1125, ibm1125, cp866u, ruscii | Ukrainian (new in version 3.4) |
| cp1140 | ibm1140 | Western Europe |
| cp1250 | windows-1250 | Central and Eastern Europe |
| cp1251 | windows-1251 | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| cp1252 | windows-1252 | Western Europe |
| cp1253 | windows-1253 | Greek |
| cp1254 | windows-1254 | Turkish |
| cp1255 | windows-1255 | Hebrew |
| cp1256 | windows-1256 | Arabic |
| cp1257 | windows-1257 | Baltic languages |
| cp1258 | windows-1258 | Vietnamese |
| cp65001 | | Windows only: Windows UTF-8 (CP_UTF8); new in version 3.3 |
| euc_jp | eucjp, ujis, u-jis | Japanese |
| euc_jis_2004 | jisx0213, eucjis2004 | Japanese |
| euc_jisx0213 | eucjisx0213 | Japanese |
| euc_kr | euckr, korean, ksc5601, ks_c-5601, ks_c-5601-1987, ksx1001, ks_x-1001 | Korean |
| gb2312 | chinese, csiso58gb231280, euc-cn, euccn, eucgb2312-cn, gb2312-1980, gb2312-80, iso-ir-58 | Simplified Chinese |
| gbk | 936, cp936, ms936 | Unified Chinese |
| gb18030 | gb18030-2000 | Unified Chinese |
| hz | hzgb, hz-gb, hz-gb-2312 | Simplified Chinese |
| iso2022_jp | csiso2022jp, iso2022jp, iso-2022-jp | Japanese |
| iso2022_jp_1 | iso2022jp-1, iso-2022-jp-1 | Japanese |
| iso2022_jp_2 | iso2022jp-2, iso-2022-jp-2 | Japanese, Korean, Simplified Chinese, Western Europe, Greek |
| iso2022_jp_2004 | iso2022jp-2004, iso-2022-jp-2004 | Japanese |
| iso2022_jp_3 | iso2022jp-3, iso-2022-jp-3 | Japanese |
| iso2022_jp_ext | iso2022jp-ext, iso-2022-jp-ext | Japanese |
| iso2022_kr | csiso2022kr, iso2022kr, iso-2022-kr | Korean |
| latin_1 | iso-8859-1, iso8859-1, 8859, cp819, latin, latin1, L1 | Western Europe |
| iso8859_2 | iso-8859-2, latin2, L2 | Central and Eastern Europe |
| iso8859_3 | iso-8859-3, latin3, L3 | Esperanto, Maltese |
| iso8859_4 | iso-8859-4, latin4, L4 | Baltic languages |
| iso8859_5 | iso-8859-5, cyrillic | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| iso8859_6 | iso-8859-6, arabic | Arabic |
| iso8859_7 | iso-8859-7, greek, greek8 | Greek |
| iso8859_8 | iso-8859-8, hebrew | Hebrew |
| iso8859_9 | iso-8859-9, latin5, L5 | Turkish |
| iso8859_10 | iso-8859-10, latin6, L6 | Nordic languages |
| iso8859_13 | iso-8859-13, latin7, L7 | Baltic languages |
| iso8859_14 | iso-8859-14, latin8, L8 | Celtic languages |
| iso8859_15 | iso-8859-15, latin9, L9 | Western Europe |
| iso8859_16 | iso-8859-16, latin10, L10 | South-Eastern Europe |
| johab | cp1361, ms1361 | Korean |
| koi8_r | | Russian |
| koi8_u | | Ukrainian |
| mac_cyrillic | maccyrillic | Bulgarian, Byelorussian, Macedonian, Russian, Serbian |
| mac_greek | macgreek | Greek |
| mac_iceland | maciceland | Icelandic |
| mac_latin2 | maclatin2, maccentraleurope | Central and Eastern Europe |
| mac_roman | macroman, macintosh | Western Europe |
| mac_turkish | macturkish | Turkish |
| ptcp154 | csptcp154, pt154, cp154, cyrillic-asian | Kazakh |
| shift_jis | csshiftjis, shiftjis, sjis, s_jis | Japanese |
| shift_jis_2004 | shiftjis2004, sjis_2004, sjis2004 | Japanese |
| shift_jisx0213 | shiftjisx0213, sjisx0213, s_jisx0213 | Japanese |
| utf_32 | U32, utf32 | All languages |
| utf_32_be | UTF-32BE | All languages |
| utf_32_le | UTF-32LE | All languages |
| utf_16 | U16, utf16 | All languages |
| utf_16_be | UTF-16BE | All languages |
| utf_16_le | UTF-16LE | All languages |
| utf_7 | U7, unicode-1-1-utf-7 | All languages |
| utf_8 | U8, UTF, utf8 | All languages |
| utf_8_sig | | All languages |
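Several codecs in the table are listed for the same languages yet do not cover the same characters. As a rough illustration, the sketch below encodes the euro sign with four of the Western European entries. The helper `try_encode` and the choice of test character are assumptions made for this example, not something the original article specifies.

```python
EURO = "\u20ac"  # EURO SIGN, chosen here only as a convenient test character


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent `text`.

    `try_encode` is a hypothetical helper for this example; it wraps the
    built-in str.encode() and catches UnicodeEncodeError.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


for name in ("latin_1", "iso8859_15", "cp1252", "utf_8"):
    print(name, try_encode(EURO, name))
```

Here latin_1 cannot encode the character at all, while iso8859_15 and cp1252 map it to different single bytes and utf_8 uses three bytes. This is one reason to record exactly which codec produced a given file rather than only the language it is written in.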


Cai Junsheng qq:9073204 Shenzhen

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
