MySQL database encoding Overview

Source: Internet
Author: User

When you use databases, there will always be various encoding problems. After reading the MySQL official documentation, we will record some of the MySQL coding system knowledge, for example, where MySQL uses the encoding layer, the encoding involved in the interaction between the MySQL client and the server, and how to specify the encoding.

Basic concepts:

Mysql database encoding layers: system layer, server layer, database layer, table layer, column layer, and client, connection, and result scenarios related to client communication; A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set; To connect to MySQL from Java you have to use the JDBC driver from MySQL. the MySQL JDBC driver is called MySQL Connector/J. mySQL 4.1 and below for unicode support is not good jdbc3.0.16 and above to support the use of the database itself encoding, otherwise use the ISO8859-1 In the mysql console input show variables like 'character _ set _ % '; check the system variables related to the current encoding. Several of them will be parsed later.
     mysql> SHOW VARIABLES LIKE 'character%';+--------------------------+---------------------------------+| Variable_name            | Value                           |+--------------------------+---------------------------------+| character_set_client     | latin1                          || character_set_connection | latin1                          || character_set_database   | latin1                          || character_set_filesystem | binary                          || character_set_results    | latin1                          || character_set_server     | latin1                          || character_set_system     | utf8                            || character_sets_dir       | D:"mysql-5.0.37"share"charsets" |


Mysql supports multi-layer character encoding. 1. Role of the server layer
The default encoding of the entire database server.
Configuration point
Use the character_set_server parameter to specify the server layer encoding.
You can add this parameter when executing mysqld.
You can also set this parameter when editing mysql.
2. Functions of the database layer
Database-level default encoding.
Configuration point
Specify the database layer encoding by using the character_set_database parameter of the system variable.
Specify the encoding when creating a table
3. Similarly, at the table layer, the encoding settings at the table layer only affect the encoding of all unspecified columns in the current table. However, this designation is unique to mysql and can only be specified when creating or modifying a table through SQL. In standard SQL, there is no SQL syntax for specifying table encoding.
4. Role of the column Layer
Sets the column encoding.
Configuration point
Set when creating a table or modifying a column. This is the standard SQL syntax.

How to convert the encoding during interaction between mysql server and client 1. client sending statement character set and collation system variables are involved in handling traffic for the connection between a client and the server. every client has connection-related character set and collation system variables.
The server takes the character_set_client system variable to be the character set in which statements are sent by the client.
When the client communicates with the server, several other encoding settings related to system variables are involved. Each client has its own encoding link encoding.
The server uses the system variable character_set_client to process the statements sent from the client.

2. server processing statement The server uses the character_set_connection and collation_connection system variables. it converts statements sent by the client from character_set_client to character_set_connection (Response t for string literals that have an introducer such as _ latin1 or _ utf8)
The server converts the statements sent from the client (in character_set_client encoding) to character_set_connection encoding.

A character string literal may have an optional character set introducer and COLLATE clause [_ charset_name] 'string' [COLLATE collation_name]
For example, SELECT _ latin1 'string' COLLATE latin1_danish_ci;
If no encoding is specified, character_set_connection is used by default.
The character set used for literals that do not have a character set introducer and for number-to-string conversion.
Character_set_connection encoding is applied to convert text and numbers without leading encoding modifier (introducer) to characters.

For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
When the column fields in the table are compared with the statements sent from the client, the client statements are converted into the corresponding column encoding before comparison, because the column fields have a higher priority.

3. The server returns the character_set_results system variable indicates the character set in which the server returns query results to The client
The character_set_results system variable is used to return data to the client in this encoding mode.

Next we will use a picture to roughly describe the above content (I understand it)


How does the server automatically determine and set the encoding? The character encoding between client and server is automatically detected upon connection. you specify the encoding on the server using the character_set_server for server versions 4.1.0 and newer, and character_set system variable for server versions older than 4.1.0. the driver automatically uses the encoding specified by the server.
If the client does not provide encoding during connection (the connection string does not have characterEncoding), the server uses the character_set_server variable as the client encoding (after 4.1.0 ).

For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server = utf8mb4, and leave characterEncoding out of the Connector/J connection string. connector/J will then autodetect the UTF-8 setting.

How does the client develop code?
To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.

When a client connects to the server, it sends the name of the character set that it wants to use. the server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. in effect, the server performs a set names operation using the character set name.
When a connection is established between the client and the server, the consumer set to be used by the client is sent. The server will use this sequence set to initialize three system variables character_set_client, character_set_results, and character_set_connection. For example, if the set names xxx statement is executed:
Set names xx can specify the connection encoding as xx: character_set_connection, character_set_results, and character_set_client system variables which can be modified;

Set names 'charset _ name' is equivalent to executing the following three statements:
SET character_set_client = charset_name;SET character_set_results = charset_name;SET character_set_connection = charset_name;

A set character set charset_name is equivalent to executing the following three statements:
SET character_set_client = charset_name;SET character_set_results = charset_name;SET collation_connection = @@collation_database;
Reference

Http://dev.mysql.com/doc/refman/5.5/en/charset.html

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.