A thorough study of garbled database characters in MySQL 4.1 and later versions

Source: Internet
Author: User
Many users see garbled characters after conversion. The main cause is that these users are running MySQL 4.1 or later. The explanation follows; I hope everyone who has this problem will read this document patiently.
Principle
Note: This document applies only to MySQL 4.1 and later versions. Earlier MySQL versions do not provide complete character set support, so this problem does not arise there.

Since MySQL 4.1, multi-language support has changed considerably (which is what leads to the problems). Although MySQL 3 and 4.0 still dominate in most places (both personal use and hosting providers), MySQL 4.1 is the version officially recommended by MySQL; some hosting providers already offer it, and more will follow. Because latin1 is used as the default character set in many places (detailed below), things appear to work, and this apparent success has blinded the developers and users of many PHP programs and concealed problems that can occur in Chinese and other language environments.

MySQL 4.1 lets the character set be specified at a fine granularity: for the MySQL server installed on a machine, for a database, for a table, and for a column of a table. However, traditional web programs do not use such detailed configuration when creating databases and tables; they rely on the defaults. So where do the defaults come from?

A default character set is specified when MySQL is compiled; this character set is latin1;
When installing MySQL, you can specify a default character set in the configuration file (my.ini). If none is specified, the value is inherited from the one set at compile time;
When you start mysqld, you can specify a default character set in the command-line parameters. If none is specified, the value is inherited from the configuration file;
character_set_server is then set to this default character set;
When a new database is created, unless explicitly specified, its character set defaults to character_set_server;
When a database is selected, character_set_database is set to the default character set of that database;
When a table is created in this database, the default character set of the table is set to character_set_database, that is, the default character set of the database;
When a column is defined in the table, unless explicitly specified, the column's default character set is the default character set of the table;
This character set is the one actually used to store data in the database, and the output of mysqldump is in this character set;
To store "correct" information and get "correct" results back, the most convenient approach is to execute the following before any other query:

SET NAMES 'gbk';
Here gbk is the character set of the database.
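
The chain of defaults described above can be sketched in a few lines of Python. This is only an illustration of the inheritance order, not MySQL's actual implementation: the function name and its parameters are hypothetical, and the real server fixes each value at the moment the database, table, or column is created rather than looking it up later.

```python
# Illustrative model of the default character set chain.
# All names here are hypothetical; they are not MySQL APIs.
COMPILED_DEFAULT = "latin1"  # compile-time default named in the text


def effective_charset(config_file=None, command_line=None,
                      database=None, table=None, column=None):
    """Return the character set a column ends up with.

    Each level inherits from the level above it unless a value
    is given explicitly (None means "not specified").
    """
    # Command line overrides my.ini, which overrides the compiled default.
    server = command_line or config_file or COMPILED_DEFAULT  # character_set_server
    db = database or server   # character_set_database
    tbl = table or db         # table default
    return column or tbl      # column default


if __name__ == "__main__":
    # Nothing configured anywhere: everything falls back to latin1.
    print(effective_charset())
    # Database created explicitly as gbk: tables and columns inherit it.
    print(effective_charset(database="gbk"))
```

The first call shows why so many installations silently store everything as latin1: unless someone sets a value at some level, the compile-time default wins.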

FAQs
My data stores Chinese text under latin1 or some other encoding, and the Chinese characters are garbled in phpMyAdmin
This happens because newer versions of phpMyAdmin always use the correct character set for the database connection and for display. If the declared storage encoding does not match the actual encoding of the data, phpMyAdmin cannot interpret it. phpMyAdmin can recognize gbk/utf8 for Simplified Chinese and big5/utf8 for Traditional Chinese. If you are sure you want to keep using this "incorrect" character set to store the data of a Chinese forum (in fact, before MySQL 4.1 we usually stored data under an "incorrect" character set), use phpMyAdmin 2.5.x to connect to the database.
My forum displayed correctly with Discuz! 4.0.0 RC + MySQL 4.1, but garbled characters appeared after upgrading to the official release
Before reading this answer, please read the answer to the previous question; your situation is similar. The RC version connects to the database in the "oldest and most common way", so storing data under an "incorrect" character set causes no visible problem. The official Discuz! 4.0.0 release, however, connects with the "correct" database character set, just like newer versions of phpMyAdmin. The original "incorrect" storage then conflicts with the "correct" connection, which produces garbled characters.
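
The conflict between "incorrect" storage and a "correct" connection can be reproduced outside MySQL. The sketch below is a simplified model under stated assumptions: the sample string is made up, and Python's latin-1 codec stands in for MySQL's latin1 (the two are not identical for every byte, but they agree on the bytes used here).

```python
# Simplified model of why a "correct" connection garbles
# GBK text that was stored in a latin1 column.
text = "中文"              # sample text (an assumption for the demo)
raw = text.encode("gbk")   # the bytes the forum actually wrote

# The column is declared latin1, so the server treats the
# four bytes as four Latin-1 characters.
seen_by_server = raw.decode("latin-1")

# Old-style connection: no conversion takes place, the bytes come
# back unchanged, and a GBK web page displays them correctly.
old_style = seen_by_server.encode("latin-1").decode("gbk")

# "Correct" connection requesting gbk: the server converts those
# Latin-1 characters to gbk. They have no gbk equivalent, so the
# original bytes are lost.
new_style = seen_by_server.encode("gbk", errors="replace")

if __name__ == "__main__":
    print(old_style == text)   # round trip is intact
    print(new_style == raw)    # converted bytes differ from the originals
```

The data itself is never damaged in storage; the damage happens when a conversion is applied to bytes whose declared character set is wrong.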

There are two ways to solve this problem:

Change the storage character set
The main idea is to change the database character set from latin1 to gbk, big5, or utf8. The following operations require host-level permissions. Assume that the database being operated on is named: database

Export
First, export the data in MySQL 4.0 format. The command is as follows: mysqldump -uroot -p --default-character-set=latin1 --set-charset=gbk --skip-opt database > test.sql

--default-character-set: the character set the database used before, usually latin1.
--set-charset: the character set of the exported data. It can be set to gbk, utf8, or big5.
Import
First, use the following statement to create a database (test) with the gbk character set:

CREATE DATABASE `test` DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;
Then import the exported data into this database:

mysql -uroot -p --default-character-set=gbk -f test < test.sql
After the export and import steps above, the data is stored under the correct character set.

Summary: This solution takes more work, but from then on storage and connections both use the "correct" MySQL character set, and newer versions of phpMyAdmin no longer show garbled characters.

Change the connection mode
Discuz! 4.0.0
For the official Discuz! 4.0.0 release, find the following line in ./include/db_mysql.class.php:

mysql_query("SET NAMES '".str_replace('-', '', $GLOBALS['charset'])."'");
Put "//" in front of it to comment it out.

Versions after Discuz! 4.0.0
Versions after Discuz! 4.0.0 support a separate $dbcharset variable in config.inc.php for setting the database character set. Depending on your situation, you can leave it blank (in which case it is the same as the $charset setting) or specify a particular database character set (such as latin1).

Summary: This is a compromise. The data is still stored under an "incorrect" encoding, but display and use are normal. Newer versions of phpMyAdmin show garbled characters, while the old version works. Pay special attention to character set issues during backup and recovery.

How should I upgrade MySQL 4.0 data to MySQL 4.1 or later?
If the data files contain Chinese characters, do not copy the MySQL 4.0 data files directly to MySQL 4.1, even if default-character-set is set to the correct character set in my.ini. Everything may appear to work, but the way MySQL 4.1 handles character sets causes a subtle problem. Taking gbk as an example, the declared lengths of MySQL 4.0 columns such as varchar and char become half of their original values. The capacity for Chinese text is unchanged, but the capacity for English text is halved. This is the biggest problem caused by copying data files directly.
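
The halving can be worked through with a little arithmetic. The sketch below assumes the usual explanation, which the text does not state explicitly: MySQL 4.0 counts column lengths in bytes, MySQL 4.1 counts them in characters, and a Chinese character takes 2 bytes in gbk. The function name and the example length of 100 are hypothetical.

```python
# Worked example of the column-length change described above.
# Assumption: 4.0 lengths are bytes, 4.1 lengths are characters,
# and one Chinese character occupies 2 bytes in gbk.
BYTES_PER_CHINESE_CHAR = 2


def capacities(declared_length_40):
    """Return (before, after) character capacities for one column."""
    before = {
        "english": declared_length_40,  # 1 byte per character
        "chinese": declared_length_40 // BYTES_PER_CHINESE_CHAR,
    }
    # After a direct file copy, 4.1 reports half the declared length
    # and counts it in characters, whatever the script.
    new_length = declared_length_40 // BYTES_PER_CHINESE_CHAR
    after = {"english": new_length, "chinese": new_length}
    return before, after


if __name__ == "__main__":
    before, after = capacities(100)
    print("MySQL 4.0:", before)
    print("MySQL 4.1:", after)
```

For a column declared as varchar(100) in MySQL 4.0, this gives room for 50 Chinese characters both before and after the copy, while the room for English text drops from 100 characters to 50.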

Therefore, if you want to use the correct character set, export the data with mysqldump and then import it.

If you want to keep using the "incorrect" latin1 storage under MySQL 4.1, you only need to set default-character-set to latin1 and change the connection mode in the forum as described earlier. In this case, the data files can be copied directly.
