MySQL Character Sets and Garbled Text

Source: Internet
Author: User

What is a character set?

A character set is a set of symbols together with the rules for encoding them. It can be thought of as a conversion table between bit patterns and symbols.
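As a small illustration of the "conversion table" idea, the Python sketch below encodes one Chinese character under two character sets. The sample character and the codec names (`utf-8`, `gbk`, `latin-1`) are Python's, chosen here for illustration; they are not MySQL identifiers.

```python
# The same symbol maps to different byte sequences under different character sets.
text = "中"

utf8_bytes = text.encode("utf-8")   # three bytes in UTF-8
gbk_bytes = text.encode("gbk")      # two bytes in GBK

print(utf8_bytes.hex(), gbk_bytes.hex())

# Decoding with the matching table recovers the symbol.
assert utf8_bytes.decode("utf-8") == text
assert gbk_bytes.decode("gbk") == text

# Decoding with the wrong table yields unrelated symbols: this is garbled text.
wrong = utf8_bytes.decode("latin-1")
print(repr(wrong))
```

The bytes never change in the last step; only the table used to read them does, which is the essence of every garbling case discussed in this article.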


MySQL-supported character sets

MySQL supports multiple character sets.

Character set support in MySQL involves two concepts: the character set and the collation. The character set defines how MySQL stores strings, and the collation defines how strings are compared. Character sets and collations have a one-to-many relationship: each character set has at least one collation.

# View supported character sets
mysql> SHOW CHARACTER SET;

# View supported collations
mysql> SHOW COLLATION;
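To make the difference between storing and comparing concrete, here is a rough Python sketch of two comparison styles for the same stored strings. The helpers `equal_ci` and `equal_bin` are hypothetical stand-ins in the spirit of a case-insensitive collation (such as `utf8_general_ci`) and a binary collation (such as `utf8_bin`); real collations also define sort order and other rules that are not modelled here.

```python
def equal_ci(a: str, b: str) -> bool:
    """Case-insensitive comparison, in the spirit of a *_ci collation."""
    return a.casefold() == b.casefold()


def equal_bin(a: str, b: str, charset: str = "utf-8") -> bool:
    """Byte-wise comparison, in the spirit of a *_bin collation."""
    return a.encode(charset) == b.encode(charset)


ci_result = equal_ci("MySQL", "mysql")
bin_result = equal_bin("MySQL", "mysql")
print(ci_result, bin_result)
```

The stored bytes are identical in both calls; only the comparison rule differs, which is why one character set can have several collations.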


MySQL character set configuration is very flexible: a character set can be specified separately at the server, database, table, and column levels.

The character set of a database object follows this inheritance chain:

Server > Database > Table > Column

That is, if a character set is not explicitly specified at a given level, that level inherits the character set of the level above it.
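The fallback rule can be sketched in a few lines of Python. The function `effective_charset` and its parameters are hypothetical and only model the rule described above; in MySQL itself the default is applied when the object is created, so this is a simplified picture.

```python
from typing import Optional


def effective_charset(server: str,
                      database: Optional[str] = None,
                      table: Optional[str] = None,
                      column: Optional[str] = None) -> str:
    """Return the most specific character set that was explicitly given."""
    for level in (column, table, database):
        if level is not None:
            return level
    return server  # nothing specified below the server level


only_server = effective_charset("latin1")
table_set = effective_charset("latin1", table="utf8")
column_set = effective_charset("latin1", database="gbk", table="utf8", column="latin1")
print(only_server, table_set, column_set)
```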

# View the server's character set related system variables
mysql> SHOW VARIABLES LIKE '%char%';



Why does text become garbled?

When MySQL handles a connection, the SQL request sent by the client is converted in the following order:
character_set_client      // the character set used by the client connection
|
character_set_connection  // the MySQL connection character set
|
character_set_database    // the character set used by the database (table, column)
|
character_set_results     // the character set used for the returned result

    1. Which encoding does the client tell the server its data is in? character_set_client

    2. Which encoding does the server convert the received statement into if it differs from the client's? character_set_connection

    3. Which encoding is used for the query results? character_set_results

If all three are set to the same character set N, this can be abbreviated as: set names N;
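The conversion chain can be modelled roughly in Python as a sequence of decode and encode steps. The function `send_and_fetch` and its parameters are hypothetical; the codec names are Python's (`utf-8`, `gbk`) rather than MySQL's (`utf8`, `gbk`), and the real server performs these conversions internally.

```python
def send_and_fetch(raw: bytes, client: str, connection: str,
                   column: str, results: str) -> bytes:
    """Model the path client -> connection -> column -> results."""
    text = raw.decode(client)                     # server trusts character_set_client
    in_conn = text.encode(connection, "replace")  # convert to character_set_connection
    stored = in_conn.decode(connection).encode(column, "replace")  # convert for storage
    return stored.decode(column).encode(results, "replace")        # convert for the reply


raw = "中文".encode("utf-8")

# Equivalent of "set names utf8": all three connection variables agree.
same = send_and_fetch(raw, "utf-8", "utf-8", "utf-8", "utf-8")

# Only character_set_results differs: the reply comes back as GBK bytes.
as_gbk = send_and_fetch(raw, "utf-8", "utf-8", "utf-8", "gbk")
print(same.hex(), as_gbk.hex())
```

When every step uses the same character set, each conversion is a no-op and the bytes come back unchanged.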

The following example shows how garbled text arises.

# Create a table
mysql> create table person (
    -> id int NOT NULL AUTO_INCREMENT PRIMARY KEY,
    -> name VARCHAR (
    -> ) DEFAULT CHARSET utf8;

# Then run the following three commands, which can be shortened to: set names utf8;
mysql> set character_set_client=utf8;
mysql> set character_set_connection=utf8;
mysql> set character_set_results=utf8;

# View the current character set settings
mysql> show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

# Then insert a row
mysql> insert into person (name) values ('中文');

# Query it
mysql> select * from person;
+----+------+
| id | name |
+----+------+
|  1 | 中文 |
+----+------+
# The result is displayed correctly.

# Next, change character_set_results to gbk
mysql> set character_set_results=gbk;
Query OK, 0 rows affected (0.04 sec)

mysql> select * from person;
+----+------+
| id | name |
+----+------+
|  1 |      |
+----+------+
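What happens in the last query can be imitated in Python. This sketch assumes the inserted value is a Chinese string such as 中文 and that the terminal interprets everything it receives as UTF-8; both are assumptions made for illustration.

```python
stored = "中文"                 # what the utf8 column holds
reply = stored.encode("gbk")    # character_set_results=gbk: the server replies in GBK

# Assumption: the terminal decodes the reply as UTF-8.
shown = reply.decode("utf-8", errors="replace")
print(repr(shown))

# A client that expects GBK reads the very same bytes correctly.
as_gbk_client = reply.decode("gbk")
print(as_gbk_client)
```

The stored data is intact; the garbling comes only from the mismatch between the encoding of the reply and the encoding the terminal expects.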

The root cause of garbled text is that the character set system variables are inconsistent with one another, which forces character set conversions.


For example, if the client character set is not set correctly, the original SQL statement is converted to the connection character set, and information may be lost along the way. If the client sends UTF8 and the statement is converted to gb2312, some information is lost; the conversion in the other direction loses nothing. To guarantee a lossless conversion, make sure the connection character set is at least as large as the client character set. The same applies between the connection and results character sets. This is similar to type conversion in a programming language: casting a double to an int can lose precision.
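A short Python sketch of the two directions follows. The helper `convert` is hypothetical, and the Korean sample character was picked only because gb2312 has no code for it.

```python
def convert(text: str, source: str, target: str) -> str:
    """Re-encode text from one character set to another; unmappable symbols become '?'."""
    raw = text.encode(source)
    return raw.decode(source).encode(target, errors="replace").decode(target)


# Smaller to larger character set: nothing is lost.
narrow_to_wide = convert("中文", "gb2312", "utf-8")

# Larger to smaller: the symbol missing from gb2312 is replaced by '?'.
wide_to_narrow = convert("中文 한", "utf-8", "gb2312")

# The analogy from the text: casting a double to an int drops the fraction.
truncated = int(3.7)

print(narrow_to_wide, wide_to_narrow, truncated)
```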

Let's take a closer look at the conversion process:

    • UTF8-encoded data is inserted into a table whose default character set is UTF8 without setting the connection character set first, and the connection character set is set to UTF8 when querying.
      – At insert time, character_set_client, character_set_connection, and character_set_results are all latin1, following the MySQL server's default settings;
      – The inserted data goes through the latin1=>latin1=>utf8 conversion, during which each Chinese character grows from its original 3 bytes to 6 bytes when saved;
      – The query result goes through the utf8=>utf8 conversion, which returns the saved 6 bytes unchanged, so the text appears garbled.

    • The connection character set is set to UTF8 before UTF8-encoded data is inserted into a table whose default character set is latin1.
      – At insert time, character_set_client, character_set_connection, and character_set_results are all UTF8, following the connection character set;
      – The inserted data goes through the utf8=>utf8=>latin1 conversion. Any Unicode character outside the \u0000~\u00FF range cannot be represented in latin1 and is converted to "?" (0x3F), which cannot be restored later, regardless of the connection character set settings.
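Both cases can be reproduced at the byte level in Python. One assumption to note: Python's `latin-1` codec is ISO-8859-1, whereas MySQL's latin1 is based on cp1252; the two agree for the bytes of the sample character used here, so the sketch sticks to that one character.

```python
char = "中"
original = char.encode("utf-8")        # the 3 bytes sent by the client

# Case 1: connection variables left at latin1, column is utf8.
misread = original.decode("latin-1")   # each byte is taken for one latin1 character
stored = misread.encode("utf-8")       # latin1 => utf8 conversion on insert
print(len(original), len(stored))

# The bytes are still all there, so this damage can be undone.
recovered = stored.decode("utf-8").encode("latin-1").decode("utf-8")

# Case 2: connection variables are utf8, column is latin1.
lost = char.encode("latin-1", errors="replace")  # latin1 has no code for this character
print(recovered, lost)
```

The first case doubles the storage but keeps the information; the second keeps the storage small but replaces the character with a question mark for good.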

In conclusion, the solution is as follows:
1. First, determine which encoding your client uses; this is the most important step (IE6 generally uses UTF8, the command line is generally GBK, and ordinary programs generally use gb2312).
2. Make sure your database uses UTF8. This is simple, and UTF8 covers every encoding involved.
3. Make sure the connection character set is greater than or equal to the client character set, otherwise information will be lost. The ordering is latin1 < gb2312 < GBK < UTF8. For example, if character_set_client = gb2312, the connection character set must be at least gb2312.
4. If the three steps above are done correctly, all Chinese text is correctly converted to UTF8 and stored in the database. To accommodate different browsers and clients, you can change character_set_results so that Chinese text is returned in a different encoding. Since UTF8 is the general direction, I still prefer to use UTF8 to display Chinese in web applications.
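As a rough aid for applying these steps, the Python sketch below scans a set of character set variables for values that differ from the intended one. The function `find_mismatches` is hypothetical, and the `variables` dictionary is typed in by hand from a `show variables like '%char%'` listing rather than read from a server.

```python
def find_mismatches(variables: dict, expected: str = "utf8") -> dict:
    """Return the character_set_* variables whose value differs from `expected`."""
    # These two are not part of the conversion chain and are skipped.
    ignored = {"character_set_filesystem", "character_set_system"}
    return {name: value for name, value in variables.items()
            if name.startswith("character_set_")
            and name not in ignored
            and value != expected}


variables = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
}

mismatches = find_mismatches(variables)
print(mismatches)
```

Any variable reported here marks a point in the chain where a conversion will take place.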


Summary
Based on the analysis and suggestions above, it should now be clear which approach to take. Specify the character set when creating the database instead of relying on changes to the default configuration. You can also specify the character set per table, but that is easy to forget and more error-prone, especially when many people are involved in the design.



Http://www.cnblogs.com/zzwlovegfj/archive/2012/06/25/2560649.html

Http://www.cnblogs.com/discuss/articles/1862248.html


This article is from the "Share Your Knowledge" blog; please retain this source attribution: http://skypegnu1.blog.51cto.com/8991766/1591215
