MySQL garbled characters problem

Source: Internet
Author: User
MySQL introduced character set support in version 4.1. It supports multiple languages, and some of its features go beyond those of other database systems.

Run the following command in the MySQL Command Line Client to list the character sets MySQL supports:

mysql> show character set;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+
36 rows in set (0.02 sec)



Character set support in MySQL 4.1 has two aspects: the character set and the collation (sort order). Character sets can be set at four levels: server, database, table, and connection.
You can run the following two commands to view the system's character set and collation settings:

mysql> show variables like 'character_set_%';
+--------------------------+-------------------------------------+
| Variable_name            | Value                               |
+--------------------------+-------------------------------------+
| character_set_client     | latin1                              |
| character_set_connection | latin1                              |
| character_set_database   | latin1                              |
| character_set_filesystem | binary                              |
| character_set_results    | latin1                              |
| character_set_server     | latin1                              |
| character_set_system     | utf8                                |
| character_sets_dir       | D:/MySQL Server 5.0/share/charsets/ |
+--------------------------+-------------------------------------+
8 rows in set (0.06 sec)

mysql> show variables like 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.02 sec)

The values listed above are the system defaults. The default collation for latin1 is latin1_swedish_ci, that is, latin1 text is sorted by Swedish rules. Why is latin1_swedish_ci the default? The answer lies in MySQL's history.

In 1979, the Swedish company TcX set out to build a fast, multi-threaded, multi-user database system. At first TcX planned to use mSQL to access its tables through the company's own fast low-level routines (Indexed Sequential Access Method, ISAM). After some testing, however, it concluded that mSQL was neither fast enough nor flexible enough for its needs. The result was a new SQL interface to its database with almost the same API as mSQL, designed so that third-party code written for mSQL could be ported to MySQL easily. These Swedish origins are why the Swedish collation became the default.

You can also change MySQL's default character set.
In the MySQL configuration file my.ini, find the following two settings:

[mysql]

default-character-set=latin1

and

# created and no character set is defined
default-character-set=latin1

You can change these values.

We do not recommend changing the default value. The defaults are inherited level by level:
When MySQL starts, character_set_server is set to the default character set; if none is given at startup, the value is taken from the configuration file.
When a new database is created, its character set defaults to character_set_server unless one is specified explicitly.
When a database is selected, character_set_database is set to that database's default character set.
When a table is created in that database, its default character set is character_set_database, that is, the database's default character set.
When a column is defined, its character set defaults to the table's default character set unless one is specified explicitly.
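
The inheritance chain described above can be observed directly on a test server. The following is only a sketch: the database name charset_demo and the table names t_default and t_utf8 are made up for illustration, and it assumes character_set_server is latin1 as in the settings shown earlier.

```sql
-- Hypothetical scratch database. No character set is given, so it
-- inherits character_set_server (latin1 under the assumption above).
CREATE DATABASE charset_demo;
SHOW CREATE DATABASE charset_demo;

-- Selecting the database updates character_set_database.
USE charset_demo;
SHOW VARIABLES LIKE 'character_set_database';

-- No explicit character set: the table inherits the database default,
-- and the column in turn inherits the table default.
CREATE TABLE t_default (content VARCHAR(20));
SHOW CREATE TABLE t_default;

-- An explicit setting at any level overrides the inherited value.
CREATE TABLE t_utf8 (content VARCHAR(20)) CHARACTER SET utf8;
SHOW CREATE TABLE t_utf8;

-- Remove the scratch database again.
DROP DATABASE charset_demo;
```

Comparing the two SHOW CREATE TABLE results should make it visible which level each table took its character set from.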

This is where the problem arises. Suppose a database is gbk-encoded. If the connection does not specify gbk as its character set when it accesses the database,
it inherits the system default, latin1, and Chinese text comes out garbled.


Solving the garbled characters problem

To solve the problem, first find out which encoding the database uses. If none was specified, it is the default, latin1.
The character sets we use most often are gb2312, gbk, and utf8.
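
One way to find out the encoding in use is to ask the server for the definitions it has stored. This is a minimal sketch using standard SHOW statements; the names test and mysqlcode are the database and table used in the example later in this article, so substitute your own.

```sql
-- Default character set of a database.
SHOW CREATE DATABASE test;

-- Default character set of a table, plus any per-column overrides.
SHOW CREATE TABLE mysqlcode;

-- Collation of every column. The character set is the prefix of the
-- collation name (for example, gbk_chinese_ci belongs to gbk).
SHOW FULL COLUMNS FROM mysqlcode;
```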

How do you specify the character set? The following uses gbk as an example.

[Create a table in the MySQL Command Line Client]

mysql> create table `mysqlcode` (
    -> `id` TINYINT(255) unsigned not null AUTO_INCREMENT primary key,
    -> `content` VARCHAR(255) NOT NULL
    -> ) TYPE = myisam character set gbk COLLATE gbk_chinese_ci;
Query OK, 0 rows affected, 1 warning (0.03 sec)

mysql> desc mysqlcode;
+---------+-----------------------+------+-----+---------+----------------+
| Field   | Type                  | Null | Key | Default | Extra          |
+---------+-----------------------+------+-----+---------+----------------+
| id      | tinyint(255) unsigned | NO   | PRI |         | auto_increment |
| content | varchar(255)          | NO   |     |         |                |
+---------+-----------------------+------+-----+---------+----------------+
2 rows in set (0.02 sec)

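The clause TYPE = myisam character set gbk COLLATE gbk_chinese_ci
specifies the table's character set (CHARACTER SET) and collation (COLLATE). Because each database and table can declare its own character set, MySQL can hold data in several encodings at the same time. TYPE is the older spelling of ENGINE; it is deprecated, which is the likely cause of the warning reported above, and it was removed in MySQL 5.5.

You can also use the following command to change the default character set of a database:
ALTER DATABASE db_name DEFAULT CHARACTER SET 'charset';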
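
As a concrete illustration of the command above, here is a short sketch with gbk filled in. The names demo_db and demo_table are placeholders, not names from this article.

```sql
-- Change the default for tables created in demo_db from now on.
ALTER DATABASE demo_db DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;

-- Existing tables keep the character set they were created with.
-- Converting one of them (and its text columns) is a separate step:
ALTER TABLE demo_table CONVERT TO CHARACTER SET gbk COLLATE gbk_chinese_ci;
```

Note that CONVERT TO re-encodes the stored text on the assumption that it is correctly labelled with its current character set, so it is worth trying on a copy of the table first.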

If the client sends data in gbk, the following settings can be used:

SET character_set_client = 'gbk';
SET character_set_connection = 'gbk';
SET character_set_results = 'gbk';

These three statements are equivalent to SET NAMES 'gbk'.
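
The equivalence can be checked from within a session. A minimal sketch:

```sql
-- Set the three connection-related variables in one statement.
SET NAMES 'gbk';

-- character_set_client, character_set_connection and
-- character_set_results should now report gbk, while
-- character_set_server and character_set_database are unchanged.
SHOW VARIABLES LIKE 'character_set_%';
```

The setting applies only to the current connection, so each new connection has to issue it again.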


Working with the table just created

mysql> use test;
Database changed

mysql> insert into mysqlcode values (null, 'php hobby');
ERROR 1406 (22001): Data too long for column 'content' at row 1

Because the connection character set has not been set to gbk, inserting a value that contains Chinese characters (rendered here in English as 'php hobby') fails.

mysql> set names 'gbk';
Query OK, 0 rows affected (0.02 sec)

The connection character set is now gbk.

mysql> insert into mysqlcode values (null, 'php hobby');
Query OK, 1 row affected (0.00 sec)

The insert succeeds.

mysql> select * from mysqlcode;
+----+-----------+
| id | content   |
+----+-----------+
|  1 | php hobby |
+----+-----------+
1 row in set (0.00 sec)

If the connection character set is not set to gbk when reading, the result is garbled as well:

mysql> select * from mysqlcode;
+----+---------+
| id | content |
+----+---------+
|  1 | php ??? |
+----+---------+
1 row in set (0.00 sec)
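
When a query shows question marks like this, it helps to know whether the stored data is damaged or only the conversion on output is lossy. The following sketch inspects the stored bytes of the mysqlcode table from the example above; the column aliases are chosen here for illustration.

```sql
-- HEX() returns the stored bytes as plain ASCII hex digits, so the
-- result is readable whatever the connection character set is.
SELECT content,
       HEX(content)         AS stored_bytes,
       LENGTH(content)      AS byte_length,   -- length in bytes
       CHAR_LENGTH(content) AS char_length    -- length in characters
FROM mysqlcode;
```

In gbk a Chinese character occupies two bytes, so for correctly stored Chinese text byte_length is larger than char_length. If stored_bytes contains 3F (the ASCII code for a question mark) where Chinese characters should be, the text was already lost at insert time and setting the connection character set afterwards will not bring it back.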
