MySQL database character set conversion and version upgrade/downgrade tutorial

A detailed tutorial on MySQL database character set conversion and version upgrade/downgrade. This article was originally written by Mu Yifeng; the original post is at http://club.muzone.cn/viewthread.php?tid=28605
Please credit the source when reposting. Thank you very much!

Discuz recently released a new version and became free of charge, so its user base has grown. Many people have switched to Discuz 4.0 from other forum programs or upgraded from Discuz 2.5/3.0, which shows that Discuz, a PHP forum program developed by Chinese developers, is indeed very good. Amid the excitement, however, people have also run into some problems.

Many users see garbled characters after the conversion. The main reason is that these users are running MySQL 4.1 or later. The explanation follows; I hope everyone who has this problem will read it patiently!

Since MySQL 4.1, multi-language support has changed a great deal (which is what causes the problems). MySQL 3 and 4.0 are still dominant in most places (both for personal use and at hosting providers), but MySQL 4.1 and even 5.0 are the versions officially recommended by MySQL, and hosting providers have started to offer them; more will follow. Because latin1 is used as the default character set in many places (detailed below), things appear to work, and that apparent success has blinded the developers and users of many PHP programs to problems that can occur in Chinese and other language environments.

MySQL 4.1 began to classify multi-language character sets in more detail. This is why garbled text appears after a database migration, or after a dz forum is upgraded to 4.0 (dz4.0 began to use gbk or UTF-8 encoding).

Character set support in MySQL 4.1 has two aspects: the character set and the collation (sort order). Character set support is refined to four levels: server, database, table, and connection.

You can run the following two commands to view the system's character set and collation settings:

QUOTE:
mysql> show variables like 'character_set_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
7 rows in set (0.00 sec)

mysql> show variables like 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
MySQL 4.1 lets you specify which character set to use at every level: for the MySQL installation on a machine, for each database, for each table, and for each column of a table. However, traditional web programs do not use such detailed configuration when creating databases and tables; they rely on the defaults. So where do the defaults come from?

A default character set is specified when MySQL is compiled; this character set is latin1;
When installing MySQL, you can specify a default character set in the configuration file (my.ini); if none is specified, the value is inherited from the compile-time setting;
When starting mysqld, you can specify a default character set in the command-line parameters; if none is specified, the value is inherited from the configuration file;
At this point, character_set_server is set to this default character set;
When a new database is created, its character set defaults to character_set_server unless explicitly specified;
When a database is selected, character_set_database is set to that database's default character set;
When a table is created in this database, the table's default character set is set to character_set_database, that is, the database's default character set;
When a column is defined in the table, its default character set is the table's default character set unless explicitly specified;
This character set is the one actually used to store data in the database, and mysqldump output is in this character set.
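
The inheritance chain in the list above can be modelled as a simple fallback lookup. The following Python sketch is only an illustration of that chain, not MySQL's implementation; the function name `effective_charset` and the level names are made up for the example, and it models the default that applies at the moment an object is created.

```python
from typing import Optional

# Most specific level first. Each level falls back to the next one when it
# has no explicit setting (hypothetical model, not MySQL code).
LEVELS = ["column", "table", "database", "command_line", "config_file", "compiled"]


def effective_charset(settings: dict, level: str = "column") -> Optional[str]:
    """Return the character set that applies at `level`, following the
    fallback chain described in the list above."""
    start = LEVELS.index(level)
    for name in LEVELS[start:]:
        value = settings.get(name)
        if value:
            return value
    return None


# Nothing configured anywhere except the compile-time default:
print(effective_charset({"compiled": "latin1"}))  # latin1

# The database was created with an explicit character set:
print(effective_charset({"compiled": "latin1", "database": "gbk"}))  # gbk

# A table-level lookup is not affected by a column-level setting:
print(effective_charset({"compiled": "latin1", "column": "utf8"}, level="table"))  # latin1
```

This is why a program that never mentions a character set ends up with latin1 everywhere: every level silently falls through to the compile-time default.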
When we access a MySQL database from PHP in the usual way, the stored data still turns out garbled even if the table's default character set is utf8 and the queries are sent in UTF-8 encoding. The problem lies in the connection layer.
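
To see why the connection layer matters, here is a small Python sketch (no MySQL involved) of what happens when UTF-8 bytes arrive over a connection that the server believes is latin1. The sample string and variable names are assumptions for illustration, and Python's `latin-1` codec is used as a stand-in for MySQL's latin1.

```python
text = "中文"

# Bytes the client actually sends: the query is encoded as UTF-8.
sent = text.encode("utf-8")

# The server believes the connection is latin1, so it reads every byte
# as one separate latin1 character ...
misread = sent.decode("latin-1")

# ... and then converts those characters to the table's character set (utf8).
stored = misread.encode("utf-8")

print(len(text), len(misread))  # 2 6
print(stored == sent)           # False

# Reading the same bytes with the character set they were written in
# recovers the text.
print(sent.decode("utf-8") == text)  # True
```

The two Chinese characters are stored as six unrelated accented characters, which is the garbling users see.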
To store "correct" information and get "correct" results back, the most convenient approach is to execute the following before any other query:

SET NAMES 'gbk';
Here gbk is the database character set.

It is equivalent to the following three commands:
SET character_set_client = gbk;
SET character_set_results = gbk;
SET character_set_connection = gbk;
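
In application code this usually means issuing the statement once, right after connecting. The sketch below assumes a DB-API style cursor object with an `execute(sql)` method; the helper name `set_connection_charset` and the `RecordingCursor` stand-in are hypothetical and exist only so the example runs without a database.

```python
import re


def set_connection_charset(cursor, charset: str) -> None:
    """Run SET NAMES on a DB-API style cursor before any other query.

    `cursor` is assumed to be any object with an execute(sql) method,
    for example a cursor from a MySQL driver; no specific driver is used here.
    """
    # Character set names are plain identifiers; reject anything else so the
    # value can be placed in the statement safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", charset):
        raise ValueError(f"unexpected character set name: {charset!r}")
    cursor.execute(f"SET NAMES '{charset}'")


class RecordingCursor:
    """Stand-in cursor that only records statements (for demonstration)."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


cur = RecordingCursor()
set_connection_charset(cur, "gbk")
print(cur.statements)  # ["SET NAMES 'gbk'"]
```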

MySQL 4.1 and 5.0 use the latin1 character set by default (author's aside: these foreigners really are domineering; do they want the whole world to use the Swedish character set?).
If we only want to store and retrieve data in a single character set such as gbk,
then when configuring MySQL 4.1 or 5.0 we must add two parameters to my.ini or my.cnf (the example uses utf8; substitute the character set you need):

CODE:
[mysqld]
default-character-set = utf8


CODE:
# Settings for clients (connection, results, clients)
[mysql]
default-character-set = utf8
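
A quick way to check that both sections agree is to read the option file programmatically. The following is a minimal sketch using Python's standard `configparser`; the sample text mirrors the two fragments above, and the function name `charset_settings` is made up. Real option files can contain directives such as `!include` that this parser does not understand, so treat it as a check for simple files only.

```python
import configparser

SAMPLE = """
[mysqld]
default-character-set = utf8

# Settings for clients (connection, results, clients)
[mysql]
default-character-set = utf8
"""


def charset_settings(text: str) -> dict:
    """Return {section: default-character-set} for the sections that set it."""
    # allow_no_value: option files may contain bare flags without "= value".
    # strict=False: tolerate repeated sections or options.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(text)
    return {
        section: parser.get(section, "default-character-set")
        for section in parser.sections()
        if parser.has_option(section, "default-character-set")
    }


print(charset_settings(SAMPLE))  # {'mysqld': 'utf8', 'mysql': 'utf8'}
```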
Next, let's talk about how to convert the database character set.
There are two methods.

QUOTE:
First method: change the storage character set
The main idea is to change the database's character set from latin1 to gbk, big5, or utf8. The following operations require host-level privileges. Assume that the database being operated on is named: database

Export
First, export the data in MySQL 4.0 format. The command is:
mysqldump -uroot -p --default-character-set=latin1 --set-charset=gbk --skip-opt database > d4.sql

--default-character-set: the character set the database used before; generally latin1.
--set-charset: the character set of the exported data; this can be set to gbk, utf8, or big5. (Note that the mysqldump documentation describes --set-charset as an on/off option that controls whether a SET NAMES statement is written to the dump, so check how your mysqldump version handles the =gbk value.)
Import
First, use the following statement to create a database with the GBK character set (named d4 here):

CREATE DATABASE `d4` DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;
Then import the exported data into the new database:

mysql -uroot -p --default-character-set=gbk -f d4 < d4.sql

The export and import operations above change the database character set to the correct storage format.

d4 is the name of the new database, and d4.sql is the name of the exported file.

However, the frustrating part of this method is that the stored data volume grows by 30% for no apparent reason.
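
Why exporting as latin1 and importing as gbk works can be shown at the byte level. The sketch below is a simplified model in Python with no MySQL involved: it assumes the forum wrote GBK bytes into latin1 columns, and it uses Python's `latin-1` codec as a stand-in for MySQL's latin1. The sample string is arbitrary.

```python
original = "数据库"

# The forum stored GBK bytes in a latin1 column: the bytes were kept
# untouched, MySQL merely believed each byte was one latin1 character.
gbk_bytes = original.encode("gbk")
as_latin1_text = gbk_bytes.decode("latin-1")

# Export with the connection set to latin1: the characters are written
# back out as the very same bytes.
dump_bytes = as_latin1_text.encode("latin-1")

# Import with the connection set to gbk: the bytes are finally read as GBK.
restored = dump_bytes.decode("gbk")

print(dump_bytes == gbk_bytes)  # True
print(restored == original)     # True
```

The export step must not convert anything; it only has to hand the raw bytes over so that the import can label them correctly.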


QUOTE:
The second method works on the same principle but requires manual editing. It is generally the fallback when the first method fails.
However, it is impractical if the database is large, because opening the file in an editor can bring your machine to a halt.

First, export a .sql file with phpMyAdmin or mysqldump.

Then open each xxxx.sql backup file in UltraEdit and find:

CODE:
DEFAULT CHARSET=latin1
latin1 may be something else in your file; it is whatever character set you do not want, when the data should really be in gbk or big5.
Replace this with nothing (that is, delete it).
Then find:

CODE:
CREATE TABLE cdb_sessions (
  sid char(6) character set latin1 collate latin1_bin NOT NULL default '',
  ip1 tinyint(3) unsigned NOT NULL default '0',
  ip2 tinyint(3) unsigned NOT NULL default '0',
  ip3 tinyint(3) unsigned NOT NULL default '0',
  ip4 tinyint(3) unsigned NOT NULL default '0',
  uid mediumint(8) unsigned NOT NULL default '0',
  username char(15) NOT NULL default '',
  groupid smallint(6) unsigned NOT NULL default '0',
  styleid smallint(6) unsigned NOT NULL default '0',
  invisible tinyint(1) NOT NULL default '0',
  `action` tinyint(1) unsigned NOT NULL default '0',
  lastactivity int(10) unsigned NOT NULL default '0',
  fid smallint(6) unsigned NOT NULL default '0',
  tid mediumint(8) unsigned NOT NULL default '0',
  nickname char(15) NOT NULL default '',
  UNIQUE KEY sid (sid)
) ENGINE=HEAP MAX_ROWS=1000;
Replace it with:

CODE:
CREATE TABLE `cdb_sessions` (
  `sid` char(6) binary NOT NULL default '',
  `ip1` tinyint(3) unsigned NOT NULL default '0',
  `ip2` tinyint(3) unsigned NOT NULL default '0',
  `ip3` tinyint(3) unsigned NOT NULL default '0',
  `ip4` tinyint(3) unsigned NOT NULL default '0',
  `uid` mediumint(8) unsigned NOT NULL default '0',
  `username` char(15) NOT NULL default '',
  `groupid` smallint(6) unsigned NOT NULL default '0',
  `styleid` smallint(6) unsigned NOT NULL default '0',
  `invisible` tinyint(1) NOT NULL default '0',
  `action` tinyint(1) unsigned NOT NULL default '0',
  `lastactivity` int(10) unsigned NOT NULL default '0',
  `fid` smallint(6) unsigned NOT NULL default '0',
  `tid` mediumint(8) unsigned NOT NULL default '0',
  `nickname` char(15) NOT NULL default '',
  UNIQUE KEY `sid` (`sid`)
) TYPE=HEAP MAX_ROWS=2000;
A simpler way to handle this step is to delete the cdb_sessions section from the file altogether. Later, create a fresh d4, export this table from it,
and paste the result at the end of the SQL file.

Save the SQL file and import it to your database.

OK.
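
For large dumps, the same search-and-replace can be scripted so that the file is processed line by line and never has to be opened in an editor. The following Python sketch is one possible way to do it, not the author's procedure: the function names and regular expressions are assumptions, it only strips the character set clauses shown above (it does not rewrite ENGINE= to TYPE=), and because it does not distinguish schema lines from data lines, it could also alter row data that happens to contain the same phrases.

```python
import re

# Table-level default, e.g. "DEFAULT CHARSET=latin1"
TABLE_CHARSET = re.compile(r"\s*DEFAULT\s+CHARSET\s*=\s*\w+", re.IGNORECASE)
# Column-level clause, e.g. "character set latin1 collate latin1_bin"
COLUMN_CHARSET = re.compile(
    r"\s+character\s+set\s+\w+(\s+collate\s+(\w+))?", re.IGNORECASE
)


def _column_replacement(match):
    # A *_bin collation becomes the 4.0-style "binary" attribute;
    # any other collation is simply dropped.
    collation = match.group(2) or ""
    return " binary" if collation.lower().endswith("_bin") else ""


def strip_charsets(line: str) -> str:
    """Remove 4.1-style character set clauses from one line of a dump."""
    line = TABLE_CHARSET.sub("", line)
    return COLUMN_CHARSET.sub(_column_replacement, line)


def convert_dump(src_path: str, dst_path: str, encoding: str = "latin-1") -> None:
    """Rewrite a dump line by line so the whole file never sits in memory."""
    # latin-1 maps every byte to one character, so the data bytes pass
    # through unchanged whatever encoding they are really in.
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_charsets(line))


print(strip_charsets(
    "sid char(6) character set latin1 collate latin1_bin NOT NULL default '',"
))
print(strip_charsets(") ENGINE=HEAP DEFAULT CHARSET=latin1 MAX_ROWS=1000;"))
```

Run it on a copy of the dump and compare the result against the original before importing.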
Using these two methods, you can easily downgrade MySQL 4.1 and 5.0 databases to 4.0.
A simple process is:
A. export the 4.1/5.0 database
B. convert it to the gbk character set
C. completely uninstall 4.1 or 5.0
D. install 4.0.26
E. import the processed database

When downgrading, you can export the data like this:
mysqldump -uroot -p --default-character-set=latin1 --set-charset=gbk --skip-opt database --compatible=mysql40 > d4.sql
The resulting dump is compatible with 4.0.

For MySQL version upgrades:
If the data files contain Chinese characters, you cannot simply copy MySQL 4.0 data files into MySQL 4.1, even if default-character-set is set to the correct character set in my.ini. It may look as if nothing is wrong, but MySQL 4.1's character set handling causes an annoying problem: 4.0 measures char and varchar lengths in bytes, whereas 4.1 measures them in characters. Taking gbk as an example, the length of a varchar or char column carried over from MySQL 4.0 becomes half of the original, so the capacity for Chinese text stays the same while the capacity for English text is halved. This is the biggest problem caused by copying data files directly.

Therefore, if you want the correct character set, export with mysqldump and then import.
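
The halving effect can be checked with a little arithmetic. This Python sketch assumes, as described above, that a column holding 20 bytes in 4.0 is seen as 10 characters wide after a direct file copy into 4.1 with gbk; the helper names and sample strings are made up for the illustration.

```python
def fits_bytes(text: str, limit: int, encoding: str = "gbk") -> bool:
    """MySQL 4.0 style: the column length counts bytes."""
    return len(text.encode(encoding)) <= limit


def fits_chars(text: str, limit: int) -> bool:
    """MySQL 4.1 style: the column length counts characters."""
    return len(text) <= limit


chinese = "中" * 10  # 10 characters, 20 bytes in gbk
english = "a" * 20   # 20 characters, 20 bytes in gbk

# A 20-byte column in 4.0 holds both strings.
print(fits_bytes(chinese, 20), fits_bytes(english, 20))  # True True

# After a direct file copy the same column is seen as 10 characters wide.
print(fits_chars(chinese, 10), fits_chars(english, 10))  # True False
```

The Chinese value still fits, while the 20-letter English value no longer does.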


Incidentally, a friend of mine wrote a tool
for tidying up the character sets of a forum database on MySQL 4.1.

Some of the code is newly written and may not be very tidy, but that does not affect its use.


It is mainly used to process and tidy the character sets of specified databases, tables, and fields in MySQL 4.1.

It applies to data structures (structure only, not data!!) whose character sets fall outside the permitted range, and brings them into the character set range permitted by Discuz!.
