A Deep Dive into MySQL Character Set Settings

Source: Internet
Author: User
Tags: mysql client

A character set converter sits between the MySQL client and the MySQL server. Three session variables control it:

character_set_client => gbk: tells the converter that the client sends its data in gbk.

character_set_connection => gbk: the converter converts the data received from the client to gbk.

character_set_results => gbk: the converter converts query results to gbk before returning them to the client.

Note: SET NAMES gbk sets all three variables to gbk at once.

Example:

CREATE TABLE test (

    name VARCHAR(64) NOT NULL

) CHARSET utf8; # utf8 is the character set the server uses to store this table.

First, insert a row into the test table.

INSERT INTO test VALUES ('test');

The data "test" is then saved in the database in utf8.

Process:

First, the MySQL client sends the data to the MySQL server. Because character_set_connection is gbk, the converter converts the data sent from the client to gbk. When the converter then hands the data to the server, it finds that the server stores the data in utf8, so the data is automatically converted from gbk to utf8 inside the server.
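The two conversions in this process can be imitated outside MySQL with Python's built-in codecs. This is only a sketch of the byte-level effect, not of how the server is implemented. The convert helper and the sample string are assumptions made for illustration.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character set to another (hypothetical helper)."""
    return data.decode(source).encode(target)


# Assumed sample: a client that really does send gbk bytes.
sent = "中文".encode("gbk")

# character_set_client = gbk and character_set_connection = gbk:
# source and target are the same, so the bytes are unchanged.
on_connection = convert(sent, "gbk", "gbk")

# The table is utf8, so the data is converted from gbk to utf8 before storage.
stored = convert(on_connection, "gbk", "utf-8")

print(sent.hex(), "->", stored.hex())
```

The stored bytes decode back to the original text as utf8, which is the healthy case: every declared character set matches what the bytes really are.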

When will garbled characters appear?

1. The client's actual data format does not match the declared character_set_client

Suppose the page is served with header('content-type: text/html; charset=utf8'); so that the client submits its data in utf8. When the data passes through the character set converter, character_set_client = gbk and character_set_connection = gbk, so the data transmitted from the client (which is actually utf8) is not converted.

However, when the character set converter sends the data to the server, it finds that the server requires utf8. It therefore treats the current data as gbk and converts it to utf8. This step is wrong, because the data was never gbk in the first place, and the stored result is garbled.
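The mismatch can be reproduced with Python's codecs as a rough sketch. The sample string is an assumption, and errors="replace" only stands in for whatever the server does with byte sequences that are not valid gbk.

```python
text = "中文"

# What the client really sends: utf8 bytes.
actual_bytes = text.encode("utf-8")

# The converter trusts character_set_client = gbk and reads the bytes as gbk.
# errors="replace" is an assumption standing in for the server's handling
# of sequences that are invalid in gbk.
misread = actual_bytes.decode("gbk", errors="replace")

# The misread text is then converted to utf8 for storage.
stored = misread.encode("utf-8")

print(repr(stored.decode("utf-8")), "!=", repr(text))
```

The stored value no longer decodes to the original text, even though every individual conversion step did what it was told.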

2. The result format does not match the client page

If character_set_results is set to utf8 but the client page expects gbk, the returned data is displayed as garbled characters.
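A small Python sketch of this second case, using an assumed sample string. It suggests that here only the display is wrong: the returned bytes are still valid utf8 and read correctly once the page uses the matching character set.

```python
text = "中文"

# character_set_results = utf8: the server returns utf8 bytes.
result_bytes = text.encode("utf-8")

# The client page assumes gbk and decodes the bytes that way.
# errors="replace" keeps the sketch from raising on invalid gbk sequences.
shown = result_bytes.decode("gbk", errors="replace")

# Decoding the same bytes with the matching character set works.
recovered = result_bytes.decode("utf-8")

print(repr(shown), "vs", repr(recovered))
```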

The SHOW CHARACTER SET statement displays all available character sets.

Latin character set: latin1, Maxlen 1

Utf8 character set: utf8, Maxlen 3

Gbk character set: gbk, Maxlen 2

Note: The Maxlen column displays the maximum number of bytes used to store a single character.
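The difference in bytes per character can be checked with a short Python sketch. The max_bytes_per_char helper and the sample characters are assumptions for illustration. It measures only the samples passed in, so it is not a substitute for the Maxlen column, and Python's latin-1 codec is used here as a stand-in for MySQL's latin1.

```python
def max_bytes_per_char(samples: str, charset: str) -> int:
    """Largest encoded size, in bytes, among the sample characters."""
    return max(len(ch.encode(charset)) for ch in samples)


latin1_width = max_bytes_per_char("aé", "latin-1")
gbk_width = max_bytes_per_char("a中", "gbk")
utf8_width = max_bytes_per_char("a中", "utf-8")

print(latin1_width, gbk_width, utf8_width)
```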

When will data be lost?

Comparing the three character sets above shows that the maximum number of bytes used to store a character differs between them: utf8 has the largest and latin1 the smallest. Therefore, if the character set converter is configured improperly, a conversion can lose data that cannot be recovered.

For example:

Suppose character_set_connection is set to latin1.

The gbk data sent from the client is then converted to latin1. Because gbk characters occupy more bytes than latin1 can hold, latin1 cannot represent them, and the data is lost.
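The loss can be sketched with Python's codecs. The sample string is an assumption, and errors="replace" imitates a converter that substitutes a question mark for every character the target character set cannot represent.

```python
text = "中文"

# The client sends gbk, and character_set_client = gbk reads it correctly.
received = text.encode("gbk").decode("gbk")

# character_set_connection = latin1: latin1 has no slot for these characters,
# so each one is replaced by "?".
as_latin1 = received.encode("latin-1", errors="replace")

# Any later conversion only sees the question marks.
after_loss = as_latin1.decode("latin-1")

print(as_latin1, repr(after_loss))
```

Once the characters have been replaced, no later conversion can bring them back, which is why this kind of loss is permanent.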

Summary:

In general, character_set_client and character_set_results should be the same, because one describes the format of the data the client sends and the other the format of the data the client accepts. To avoid data loss, the character set of character_set_connection must be at least as large as that of character_set_client, that is, it must be able to hold every character the client sends.
