MySQL character set topic (character sets, collations, garbled characters)

1. MySQL character set overview

The MySQL server supports multiple character sets. Within a single server, different character sets can be specified at the server, database, table, and even column level. By contrast, some other database management systems, such as Oracle, allow only one character set per database, so MySQL is clearly more flexible.
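As a rough sketch of what this flexibility looks like in practice, the statements below set a different character set at the database, table, and column level. The names demo_db and articles and the column names are hypothetical, chosen only for illustration, and the example assumes a server that provides the utf8, gbk, latin1, and utf8mb4 character sets.

```sql
-- Server-level default currently in effect.
SHOW VARIABLES LIKE 'character_set_server';

-- Database-level default (demo_db is a hypothetical name).
CREATE DATABASE demo_db
  DEFAULT CHARACTER SET utf8
  DEFAULT COLLATE utf8_general_ci;

USE demo_db;

-- Table-level default (gbk), overridden for individual columns.
CREATE TABLE articles (
  id       INT NOT NULL PRIMARY KEY,
  -- No character set given: inherits the table default.
  title    VARCHAR(100),
  -- Column-level override with an explicit collation.
  body_en  VARCHAR(200) CHARACTER SET latin1 COLLATE latin1_swedish_ci,
  -- Column-level override that uses the character set's default collation.
  body_any VARCHAR(200) CHARACTER SET utf8mb4
) DEFAULT CHARACTER SET gbk;

-- The Collation column shows what each column actually ended up with.
SHOW FULL COLUMNS FROM articles;
```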
MySQL's character set support involves two concepts: the character set (CHARACTER SET) and the collation (COLLATION). A character set defines how MySQL stores strings. A collation defines how strings are compared, which determines sorting and grouping. Character sets and collations have a one-to-many relationship: each character set has at least one collation. MySQL supports 39 character sets and nearly 200 collations (the exact counts vary by version).
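The difference between the two concepts can be seen by comparing the same strings under two collations of one character set. This is a minimal sketch that assumes the latin1 character set and its latin1_swedish_ci and latin1_bin collations are available, which is the case on a typical installation.

```sql
-- List the collations that belong to one character set.
SHOW COLLATION WHERE Charset = 'latin1';

-- The same two strings compared under two collations of latin1.
SELECT
  -- Case-insensitive collation: expected result 1 (equal).
  _latin1'abc' COLLATE latin1_swedish_ci = _latin1'ABC' AS ci_equal,
  -- Binary collation compares byte values: expected result 0 (not equal).
  _latin1'abc' COLLATE latin1_bin = _latin1'ABC' AS bin_equal;
```

The stored bytes are identical in both comparisons; only the comparison rule changes.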

In MySQL, "character set" and "encoding scheme" are treated as synonyms: a character set is the combination of a conversion table (which characters map to which codes) and an encoding scheme (how those codes are stored as bytes).

Unicode (Universal Code) is a character encoding standard used on computers. It was created to address the limitations of traditional character encoding schemes: it assigns a uniform, unique code to every character in every language, to meet the requirements of cross-language and cross-platform text conversion and processing. Unicode has several encoding schemes, including UTF-8, UTF-16, and UTF-32. UTF stands for Unicode Transformation Format.
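To make the difference between these encoding schemes concrete, the following sketch converts one character, U+4E2D, into each of them and prints the resulting bytes in hexadecimal. It assumes a server that provides utf8mb4, utf16, and utf32; the hexadecimal literal is used so that the result does not depend on the client's connection character set.

```sql
-- 0xE4B8AD is the UTF-8 byte sequence for U+4E2D; the _utf8mb4 introducer
-- tells MySQL how to interpret those bytes.
SELECT
  HEX(CONVERT(_utf8mb4 0xE4B8AD USING utf8mb4)) AS utf8_bytes,   -- expected E4B8AD   (3 bytes)
  HEX(CONVERT(_utf8mb4 0xE4B8AD USING utf16))   AS utf16_bytes,  -- expected 4E2D     (2 bytes)
  HEX(CONVERT(_utf8mb4 0xE4B8AD USING utf32))   AS utf32_bytes;  -- expected 00004E2D (4 bytes)
```

The character is the same in all three columns; only its byte representation differs.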

2. View character sets and collations

2.1 View character sets

mysql> show character set;
mysql> select * from information_schema.character_sets;
mysql> select character_set_name, default_collate_name, description, maxlen from information_schema.character_sets;
The output of show character set is as follows:

+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+

Where:


When reprinting, please cite the source: http://blog.csdn.net/jesseyoung/article/details/36427677
