View, set, and modify the MySQL Character Set

Source: Internet
Author: User

I was plagued by MySQL character set problems some time ago, and I would like to summarize what I learned here.

MySQL character set support has two aspects: the character set and the collation. Character sets can be specified at four levels: server, database, table, and connection.

1. Default MySQL Character Set

MySQL lets you specify the character set per database, per table, and per column. In practice, most programs do not use such fine-grained configuration when creating databases and tables; they rely on the defaults. So where do the defaults come from?

  1. A default character set is fixed when MySQL is compiled; it is latin1;
  2. When installing MySQL, you can specify a default character set in the configuration file (my.ini). If none is specified, the value is inherited from the compile-time default;
  3. When starting mysqld, you can specify a default character set in the command-line parameters. If none is specified, the value is inherited from the configuration file. At this point, character_set_server is set to this default character set;
  4. When a new database is created, unless explicitly specified, its character set defaults to character_set_server;
  5. When a database is selected, character_set_database is set to the default character set of that database;
  6. When a table is created in this database, the default character set of the table is set to character_set_database, that is, the default character set of the database;
  7. When a column is defined in the table, unless explicitly specified, the column uses the default character set of the table;
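
The fallback chain in the list above can be modeled in a few lines of Python. This is only a sketch of the inheritance rule, not a MySQL API: the function names and parameters are hypothetical, and each argument stands for a character set given explicitly at that level (None means "not specified").

```python
from typing import Optional

COMPILED_DEFAULT = "latin1"  # compile-time default named in the text


def first_set(*candidates: Optional[str]) -> str:
    """Return the first candidate that is not None (most specific first)."""
    for value in candidates:
        if value is not None:
            return value
    raise ValueError("no character set available")


def effective_column_charset(
    column: Optional[str] = None,
    table: Optional[str] = None,
    database: Optional[str] = None,
    command_line: Optional[str] = None,
    config_file: Optional[str] = None,
) -> str:
    """Hypothetical model of the fallback chain; not part of MySQL."""
    # Steps 1-3: character_set_server comes from the command line,
    # else the configuration file, else the compiled-in default.
    server = first_set(command_line, config_file, COMPILED_DEFAULT)
    # Steps 4-7: database, table, and column each inherit from the
    # level above unless a character set is given explicitly.
    return first_set(column, table, database, server)


print(effective_column_charset())                                 # latin1
print(effective_column_charset(config_file="utf8"))               # utf8
print(effective_column_charset(config_file="utf8", table="gbk"))  # gbk
```

With nothing specified, every column ends up in latin1; a single setting in the configuration file changes the result for every level below it.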

Simply put, if you do not change anything, all columns of all tables in all databases are stored in latin1. However, when installing MySQL you will generally choose multi-language support; the installer then sets default_character_set in the configuration file to UTF-8, which ensures that all columns of all tables in all databases are stored in UTF-8 by default.

2. View default character sets

By default, the MySQL character set is latin1 (ISO_8859_1). You can run the following two commands to view the character sets and collations currently in effect:

mysql> SHOW VARIABLES LIKE 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set

mysql> SHOW VARIABLES LIKE 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | utf8_general_ci   |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set

3. Modify the default Character Set

The simplest way is to change the character set settings in MySQL's my.ini file, for example:

default-character-set = utf8
character_set_server = utf8

After modification, restart the mysql service.

Run SHOW VARIABLES LIKE 'character%'; again to check that the database encoding has been changed to utf8:

mysql> SHOW VARIABLES LIKE 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set

Another way to modify the character set is to run SET statements in the mysql client (these change the values for the current session only):

mysql> SET character_set_client = utf8;
mysql> SET character_set_connection = utf8;
mysql> SET character_set_database = utf8;
mysql> SET character_set_results = utf8;
mysql> SET character_set_server = utf8;
mysql> SET collation_connection = utf8_general_ci;
mysql> SET collation_database = utf8_general_ci;
mysql> SET collation_server = utf8_general_ci;

character_set_client describes the encoding of the statements the client sends; it is not the character set used for the results the client receives and displays. Only character_set_results is used for that, which is why there are two separate variables.

A character set is a set of symbols together with their encodings; a collation is a set of rules that define how characters are compared (their ordering). Each character set has at least one collation, and each collation belongs to exactly one character set. The two generally appear in pairs and together carry out related operations in the database, such as sorting and string concatenation.
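
The distinction can be illustrated outside MySQL with plain Python. This is a minimal sketch: Python's latin-1 and utf-8 codecs stand in for MySQL's latin1 and utf8, and the two comparison functions are hypothetical stand-ins for a binary collation and a case-insensitive (_ci) collation, not MySQL's actual comparison rules.

```python
text = "\u00e9"  # the character e with an acute accent

# Character set: the same character maps to different byte sequences.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")
print(latin1_bytes)  # b'\xe9'
print(utf8_bytes)    # b'\xc3\xa9'


def equal_binary(a: str, b: str) -> bool:
    """Stand-in for a binary collation: exact code point comparison."""
    return a == b


def equal_case_insensitive(a: str, b: str) -> bool:
    """Stand-in for a _ci collation: compare after case folding."""
    return a.casefold() == b.casefold()


print(equal_binary("MySQL", "mysql"))            # False
print(equal_case_insensitive("MySQL", "mysql"))  # True

# Collation also decides sort order.
words = ["banana", "apple", "Cherry"]
print(sorted(words))                    # ['Cherry', 'apple', 'banana']
print(sorted(words, key=str.casefold))  # ['apple', 'banana', 'Cherry']
```

The encoding step answers "which bytes represent this character", while the comparison step answers "are these two strings equal, and which comes first"; the first is the character set's job, the second the collation's.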

At each of the four levels above, a character set and a collation are set by default. The defaults for the server level are latin1 and latin1_swedish_ci (ci: case insensitive). When creating an entity at any level, you can use the corresponding clause or option to explicitly declare the character set and collation that entity should use.

Often, even if the default character set of the table is set to utf8 and the query is sent in UTF-8 encoding, you will find that the data stored in the database is still garbled. The problem lies in the connection layer. The solution is to execute the following statement before sending the query: SET NAMES 'utf8';

It is equivalent to the following three commands:

SET character_set_client = utf8;
SET character_set_results = utf8;
SET character_set_connection = utf8;

  • character_set_client: character set of the text sent from the client
  • character_set_results: character set used for the results sent to the client
  • character_set_connection: character set for the connection
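
Why the connection layer produces garbled data can be reproduced with plain byte handling. The sketch below assumes a simplified model in which the server decodes incoming bytes according to character_set_client; server_reads and set_names_equivalent are hypothetical helpers written for this illustration, and Python's latin-1 codec stands in for MySQL's latin1.

```python
def server_reads(raw: bytes, character_set_client: str) -> str:
    """Decode client bytes the way character_set_client tells the server to."""
    return raw.decode(character_set_client)


def set_names_equivalent(charset: str) -> list:
    """Hypothetical helper: the three statements the text lists for SET NAMES."""
    return [
        f"SET character_set_client = {charset};",
        f"SET character_set_results = {charset};",
        f"SET character_set_connection = {charset};",
    ]


query_bytes = "caf\u00e9".encode("utf-8")  # what a UTF-8 program actually sends

# Connection left at latin1: each UTF-8 byte is read as its own character.
garbled = server_reads(query_bytes, "latin-1")
# Connection declared as utf8 (after SET NAMES 'utf8'): text is read correctly.
intact = server_reads(query_bytes, "utf-8")

print(len(garbled), len(intact))  # 5 4
for statement in set_names_equivalent("utf8"):
    print(statement)
```

The two-byte UTF-8 sequence for the accented letter becomes two separate latin1 characters, so the garbled string is one character longer than the original; this is the damage that ends up stored in the table when the connection character set is wrong.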

character_set_client and character_set_connection only need to be consistent with the encoding in character_set_database, while character_set_results ensures that the results returned by SELECT match the encoding of the program.

For example, if your database (character_set_database) uses the utf8 character set, you must ensure that character_set_client and character_set_connection are also the utf8 character set. However, your program may not use utf8. For example, if your program uses gbk, you may encounter garbled characters if you set character_set_results to utf8. In this case, set character_set_results to gbk. This ensures that the database returns the same result as the encoding of your program.
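
The gbk case can be sketched the same way. This assumes a simplified model in which the server encodes result text using character_set_results and the program decodes it with its own encoding; fetch and PROGRAM_ENCODING are hypothetical names used only for this illustration.

```python
PROGRAM_ENCODING = "gbk"  # assumption: the application works in gbk


def fetch(stored_text: str, character_set_results: str) -> bytes:
    """Model the server encoding a result using character_set_results."""
    return stored_text.encode(character_set_results)


stored = "\u4e2d\u6587"  # two Chinese characters held in a utf8 column

# character_set_results = utf8, but the program decodes as gbk: garbled.
# errors="replace" keeps the demo running if some bytes are invalid in gbk.
garbled = fetch(stored, "utf-8").decode(PROGRAM_ENCODING, errors="replace")

# character_set_results = gbk matches the program: text survives intact.
correct = fetch(stored, "gbk").decode(PROGRAM_ENCODING)

print(garbled == stored)  # False
print(correct == stored)  # True
```

The stored data is the same in both cases; only the encoding chosen for the result bytes decides whether the program can read it.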

Note the following:

  • Ensure that the data stored in the database is consistent with the database encoding, that is, the data encoding is consistent with character_set_database;
  • Ensure that the character sets used for communication are consistent with the database, that is, character_set_client and character_set_connection are consistent with character_set_database;
  • Ensure that the results returned by SELECT match the encoding of the program, that is, character_set_results is consistent with the program's encoding;
  • Ensure that the program's encoding is consistent with the encoding used by the browser.

Summary

So which database version is used, whether 3.x, 4.0.x, or 4.1.x, is not what matters. Two points are important:

  1. Set the database encoding correctly. In versions earlier than MySQL 4.0 the character set is always the default ISO8859-1, while MySQL 4.1 lets you choose during installation. If you plan to use UTF-8, specify UTF-8 when creating the database (you can also change it afterwards; in 4.1 and later you can also specify the character set of each table separately).
  2. Set the database connection encoding correctly. Once the database encoding is set, specify the connection encoding when connecting to the database. For example, with a JDBC connection, specify utf8 as the connection encoding.
