================================================================================================
Supplement 1:
================================================================================================
MySQL introduced character set support in version 4.1. It supports many languages, and some of its features go beyond those of other database systems.
Run the following command in the MySQL command-line client to list the character sets that MySQL supports:
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+
36 rows in set (0.02 sec)
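The Maxlen column in the output above is the maximum number of bytes a single character can occupy in that character set. As a rough illustration, the sketch below compares byte length (LENGTH) with character length (CHAR_LENGTH) for one Chinese character. The character U+4E2D is only an example chosen here, and it is written as a hex literal so that the result does not depend on the client encoding.

```sql
-- X'E4B8AD' is the UTF-8 byte sequence for U+4E2D; the _utf8 introducer
-- tells MySQL to interpret those bytes as a utf8 string.
SELECT LENGTH(_utf8 X'E4B8AD')                     AS utf8_bytes,  -- 3
       CHAR_LENGTH(_utf8 X'E4B8AD')                AS characters,  -- 1
       LENGTH(CONVERT(_utf8 X'E4B8AD' USING gbk))  AS gbk_bytes,   -- 2
       LENGTH(CONVERT(_utf8 X'E4B8AD' USING ucs2)) AS ucs2_bytes;  -- 2
```

The same character takes three bytes in utf8 but two in gbk and ucs2, which matches the Maxlen values listed for those character sets.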
For more information about MySQL character sets, see:
http://www.phpfans.net/bbs/viewt...&extra=page%3D1
Character set support has two aspects: the character set itself and the collation (sort order). Character sets can be specified at four levels: server, database, table, and connection.
You can run the following two commands to view the current character set and collation settings:
mysql> SHOW VARIABLES LIKE 'character_set_%';
+--------------------------+-------------------------------------------+
| Variable_name            | Value                                     |
+--------------------------+-------------------------------------------+
| character_set_client     | latin1                                    |
| character_set_connection | latin1                                    |
| character_set_database   | latin1                                    |
| character_set_filesystem | binary                                    |
| character_set_results    | latin1                                    |
| character_set_server     | latin1                                    |
| character_set_system     | utf8                                      |
| character_sets_dir       | D:\mysql\MySQL Server 5.0\share\charsets\ |
+--------------------------+-------------------------------------------+
8 rows in set (0.06 sec)
mysql> SHOW VARIABLES LIKE 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.02 sec)
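The four levels mentioned above (server, database, table, connection) can each be given an explicit character set. The following is a minimal sketch, not a recommended configuration: the names demo_db and demo_table and the choice of utf8 are assumptions made for illustration only.

```sql
-- Server level: usually set in the option file rather than in SQL, e.g.
--   [mysqld]
--   character-set-server = utf8
--   collation-server     = utf8_general_ci

-- Database level (demo_db is a hypothetical name).
CREATE DATABASE demo_db
  DEFAULT CHARACTER SET utf8
  DEFAULT COLLATE utf8_general_ci;

-- Table level (demo_table and its columns are hypothetical).
CREATE TABLE demo_db.demo_table (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50)
) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Connection level: sets character_set_client, character_set_connection
-- and character_set_results for the current session.
SET NAMES utf8;

-- Check the effect on the session variables.
SHOW VARIABLES LIKE 'character_set_%';
```

After SET NAMES, the client, connection, and results variables should report utf8, while character_set_server keeps whatever value the server was started with.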
================================================================================================
Supplement 2: MySQL character set encoding
--------------------------------------------------------------
Original source: http://www.phpfans.net/bbs/viewthread.php?tid=296&extra=page%3D1
================================================================================================
MySQL character set encoding
Character sets and collations
  Collation             Description
armscii8 (ARMSCII-8 Armenian)
  armscii8_bin          Armenian, binary
  armscii8_general_ci   Armenian, case-insensitive
ascii (US ASCII)
  ascii_bin             West European (multilingual), binary
  ascii_general_ci      West European (multilingual), case-insensitive
big5 (Big5 Traditional Chinese)
  big5_bin              Traditional Chinese, binary
  big5_chinese_ci       Traditional Chinese, case-insensitive
binary (Binary pseudo charset)
  binary                Binary
cp1250 (Windows Central European)
  cp1250_bin            Central European (multilingual), binary
  cp1250_croatian_ci    Croatian, case-insensitive
  cp1250_czech_cs       Czech, case-sensitive
  cp1250_general_ci     Central European (multilingual), case-insensitive
cp1251 (Windows Cyrillic)
  cp1251_bin            Cyrillic (multilingual), binary
  cp1251_bulgarian_ci   Bulgarian, case-insensitive
  cp1251_general_ci     Cyrillic (multilingual), case-insensitive
  cp1251_general_cs     Cyrillic (multilingual), case-sensitive
  cp1251_ukrainian_ci   Ukrainian, case-insensitive
cp1256 (Windows Arabic)
  cp1256_bin            Arabic, binary
  cp1256_general_ci     Arabic, case-insensitive
cp1257 (Windows Baltic)
  cp1257_bin            Baltic (multilingual), binary
  cp1257_general_ci     Baltic (multilingual), case-insensitive
  cp1257_lithuanian_ci  Lithuanian, case-insensitive
cp850 (DOS West European)
  cp850_bin             West European (multilingual), binary
  cp850_general_ci      West European (multilingual), case-insensitive
cp852 (DOS Central European)
  cp852_bin             Central European (multilingual), binary
  cp852_general_ci      Central European (multilingual), case-insensitive
cp866 (DOS Russian)
  cp866_bin             Russian, binary
  cp866_general_ci      Russian, case-insensitive
cp932 (SJIS for Windows Japanese)
  cp932_bin             Japanese, binary
  cp932_japanese_ci     Japanese, case-insensitive
dec8 (DEC West European)
  dec8_bin              West European (multilingual), binary
  dec8_swedish_ci       Swedish, case-insensitive
eucjpms (UJIS for Windows Japanese)
  eucjpms_bin           Japanese, binary
  eucjpms_japanese_ci   Japanese, case-insensitive
euckr (EUC-KR Korean)
  euckr_bin             Korean, binary
  euckr_korean_ci       Korean, case-insensitive
gb2312 (GB2312 Simplified Chinese)
  gb2312_bin            Simplified Chinese, binary
  gb2312_chinese_ci     Simplified Chinese, case-insensitive
gbk (GBK Simplified Chinese)
  gbk_bin               Simplified Chinese, binary
  gbk_chinese_ci        Simplified Chinese, case-insensitive
geostd8 (GEOSTD8 Georgian)
  geostd8_bin           Georgian, binary
  geostd8_general_ci    Georgian, case-insensitive
greek (ISO 8859-7 Greek)
  greek_bin             Greek, binary
  greek_general_ci      Greek, case-insensitive
hebrew (ISO 8859-8 Hebrew)
  hebrew_bin            Hebrew, binary
  hebrew_general_ci     Hebrew, case-insensitive
hp8 (HP West European)
  hp8_bin               West European (multilingual), binary
  hp8_english_ci        English, case-insensitive
keybcs2 (DOS Kamenicky Czech-Slovak)
  keybcs2_bin           Czech-Slovak, binary
  keybcs2_general_ci    Czech-Slovak, case-insensitive
koi8r (KOI8-R Relcom Russian)
  koi8r_bin             Russian, binary
  koi8r_general_ci      Russian, case-insensitive
koi8u (KOI8-U Ukrainian)
  koi8u_bin             Ukrainian, binary
  koi8u_general_ci      Ukrainian, case-insensitive
latin1 (cp1252 West European)
  latin1_bin            West European (multilingual), binary
  latin1_danish_ci      Danish, case-insensitive
  latin1_general_ci     West European (multilingual), case-insensitive
  latin1_general_cs     West European (multilingual), case-sensitive
  latin1_german1_ci     German (dictionary), case-insensitive
  latin1_german2_ci     German (phone book), case-insensitive
  latin1_spanish_ci     Spanish, case-insensitive
  latin1_swedish_ci     Swedish, case-insensitive
latin2 (ISO 8859-2 Central European)
  latin2_bin            Central European (multilingual), binary
  latin2_croatian_ci    Croatian, case-insensitive
  latin2_czech_cs       Czech, case-sensitive
  latin2_general_ci     Central European (multilingual), case-insensitive
  latin2_hungarian_ci   Hungarian, case-insensitive
latin5 (ISO 8859-9 Turkish)
  latin5_bin            Turkish, binary
  latin5_turkish_ci     Turkish, case-insensitive
latin7 (ISO 8859-13 Baltic)
  latin7_bin            Baltic (multilingual), binary
  latin7_estonian_cs    Estonian, case-sensitive
  latin7_general_ci     Baltic (multilingual), case-insensitive
  latin7_general_cs     Baltic (multilingual), case-sensitive
macce (Mac Central European)
  macce_bin             Central European (multilingual), binary
  macce_general_ci      Central European (multilingual), case-insensitive
macroman (Mac West European)
  macroman_bin          West European (multilingual), binary
  macroman_general_ci   West European (multilingual), case-insensitive
sjis (Shift-JIS Japanese)
  sjis_bin              Japanese, binary
  sjis_japanese_ci      Japanese, case-insensitive
swe7 (7bit Swedish)
  swe7_bin              Swedish, binary
  swe7_swedish_ci       Swedish, case-insensitive
tis620 (TIS620 Thai)
  tis620_bin            Thai, binary
  tis620_thai_ci        Thai, case-insensitive
ucs2 (UCS-2 Unicode)
  ucs2_bin              Unicode (multilingual), binary
  ucs2_czech_ci         Czech, case-insensitive
  ucs2_danish_ci        Danish, case-insensitive
  ucs2_esperanto_ci     Esperanto, case-insensitive
  ucs2_estonian_ci      Estonian, case-insensitive
  ucs2_general_ci       Unicode (multilingual), case-insensitive
  ucs2_hungarian_ci     Hungarian, case-insensitive
  ucs2_icelandic_ci     Icelandic, case-insensitive
  ucs2_latvian_ci       Latvian, case-insensitive
  ucs2_lithuanian_ci    Lithuanian, case-insensitive
  ucs2_persian_ci       Persian, case-insensitive
  ucs2_polish_ci        Polish, case-insensitive
  ucs2_roman_ci         West European, case-insensitive
  ucs2_romanian_ci      Romanian, case-insensitive
  ucs2_slovak_ci        Slovak, case-insensitive
  ucs2_slovenian_ci     Slovenian, case-insensitive
  ucs2_spanish2_ci      Traditional Spanish, case-insensitive
  ucs2_spanish_ci       Spanish, case-insensitive
  ucs2_swedish_ci       Swedish, case-insensitive
  ucs2_turkish_ci       Turkish, case-insensitive
  ucs2_unicode_ci       Unicode (multilingual), case-insensitive
ujis (EUC-JP Japanese)
  ujis_bin              Japanese, binary
  ujis_japanese_ci      Japanese, case-insensitive
utf8 (UTF-8 Unicode)
  utf8_bin              Unicode (multilingual), binary
  utf8_czech_ci         Czech, case-insensitive
  utf8_danish_ci        Danish, case-insensitive
  utf8_esperanto_ci     Esperanto, case-insensitive
  utf8_estonian_ci      Estonian, case-insensitive
  utf8_general_ci       Unicode (multilingual), case-insensitive
  utf8_hungarian_ci     Hungarian, case-insensitive
  utf8_icelandic_ci     Icelandic, case-insensitive
  utf8_latvian_ci       Latvian, case-insensitive
  utf8_lithuanian_ci    Lithuanian, case-insensitive
  utf8_persian_ci       Persian, case-insensitive
  utf8_polish_ci        Polish, case-insensitive
  utf8_roman_ci         West European, case-insensitive
  utf8_romanian_ci      Romanian, case-insensitive
  utf8_slovak_ci        Slovak, case-insensitive
  utf8_slovenian_ci     Slovenian, case-insensitive
  utf8_spanish2_ci      Traditional Spanish, case-insensitive
  utf8_spanish_ci       Spanish, case-insensitive
  utf8_swedish_ci       Swedish, case-insensitive
  utf8_turkish_ci       Turkish, case-insensitive
  utf8_unicode_ci       Unicode (multilingual), case-insensitive
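
The suffixes in the collation names listed above (_ci, _cs, _bin) describe how strings are compared. As a small sketch of the difference, the statements below compare the same two strings under a case-insensitive and a binary utf8 collation. The sample strings are arbitrary, and the _utf8 introducer is used so that the result does not depend on the connection character set.

```sql
-- _ci collation: letter case is ignored, so this comparison returns 1.
SELECT _utf8'mysql' COLLATE utf8_general_ci = _utf8'MySQL' AS ci_equal;

-- _bin collation: characters are compared by their code values,
-- so 'm' and 'M' differ and this comparison returns 0.
SELECT _utf8'mysql' COLLATE utf8_bin = _utf8'MySQL' AS bin_equal;

-- List the collations this server actually provides for utf8.
SHOW COLLATION LIKE 'utf8%';
```

The set of collations returned by SHOW COLLATION depends on the server version, so it may differ from the list above.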