http://blog.donews.com/yllr/archive/2007/01/03/1107388.aspx
MySQL 4.1 and later versions support multiple character sets, but the default character set at installation is latin1 (its default collation is latin1_swedish_ci, which is why it is often called the "Swedish" character set). This causes trouble for many people who want to install Discuz and other forums that use GBK, UTF-8, or other character sets. I recently found some material on the Internet and spent two days testing on a Linux server, which basically solved my character set problems. I have written up what I learned here for other users' reference.
I. Principles:
A MySQL server uses character sets in six key places: client, connection, database, results, server, and system.
A. client is the character set used by the client. It corresponds to the character set declared in a web page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
B. connection is the character set used for the connection to the database. If PHP does not specify a character set when it connects, the server's default character set is used.
C. database is the character set used by a particular database on the server. If none is specified when the database is created, the server's character set is used.
D. results is the character set in which the database returns data to the client. If none is specified, the server's default character set is used.
E. server is the default character set specified when the server was installed.
F. system is the character set used by the database system itself.
system is generally utf8 by default, and server sits at the top level: if a database is not given its own character set, it uses the server's, and the other settings likewise follow the server's. In addition, every table and every field in a database has its own character set. If none is specified, it is inherited from the level above. For example, a table takes its character set from the database settings, and a field takes its character set from its table.
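The inheritance rule described above can be sketched in a few lines of Python. This is only an illustration of the fallback order (field, then table, then database, then server), not how MySQL implements it; the function name effective_charset and its parameters are made up for this example.

```python
from typing import Optional


def effective_charset(server: str,
                      database: Optional[str] = None,
                      table: Optional[str] = None,
                      column: Optional[str] = None) -> str:
    """Return the character set a field ends up with.

    None means "not specified at this level" (hypothetical model).
    """
    # Check the most specific level first and fall back level by level.
    for level in (column, table, database):
        if level is not None:
            return level
    return server


# Nothing specified anywhere: the server default wins.
print(effective_charset("latin1"))
# Database created with utf8: tables and fields inherit it.
print(effective_charset("latin1", database="utf8"))
# A table-level setting overrides the database.
print(effective_charset("latin1", database="utf8", table="gbk"))
```

This is why a database created with an explicit character set is unaffected by a latin1 server default.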
II. Unifying the character set (using UTF-8 as the example, because it is currently the most widely supported character set)
1. Complete solution to the character set problem:
To solve the character set problem completely, have MySQL use UTF-8 from installation onward, so that all six key places listed above use UTF-8.
A. The MySQL installer on Windows prompts you to choose a character set. You can select UTF-8 there.
B. There are three ways to install on Linux. The first is the RPM package. I cannot comment on it because I have never used it.
The second is the precompiled binary distribution. It has already been compiled with latin1 as the default character set, so the character set problem cannot be solved completely at install time. How to solve it for this kind of installation is discussed later.
The third is to compile and install from source. This method lets you set the character set at compile time, and it is the one mainly described here.
When compiling MySQL, we can use the following command: ./configure --with-charset=utf8
This way the MySQL server is compiled with utf8 as its default, so every database created in MySQL is stored in utf8 and everything involved uses UTF-8.
2. Checking the character set:
If your database was installed with the default latin1 character set (it is strange that MySQL uses this as its default), the problem can still be solved.
A. With a default installation, run the status command in the mysql command-line client to check. The following information is displayed:
mysql> status
--------------
mysql  Ver 14.7 Distrib 4.1.9, for pc-linux-gnu (i686)
Connection id:          62
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.1.11-beta-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /tmp/mysql.sock
Uptime:                 58 min 23 sec
Threads: 2  Questions: 1067  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 75  Queries per second avg: 0.305
--------------
mysql> show variables like 'character_set_%';
+--------------------------+---------------------------------------------------------------------------+
| Variable_name            | Value                                                                     |
+--------------------------+---------------------------------------------------------------------------+
| character_set_client     | latin1                                                                    |
| character_set_connection | latin1                                                                    |
| character_set_database   | latin1                                                                    |
| character_set_filesystem | binary                                                                    |
| character_set_results    | latin1                                                                    |
| character_set_server     | latin1                                                                    |
| character_set_system     | utf8                                                                      |
| character_sets_dir       | /usr/local/src/mysql-5.1.11-beta-linux-i686-glibc23/share/mysql/charsets/ |
+--------------------------+---------------------------------------------------------------------------+
8 rows in set (0.00 sec)
This shows that every character set except system (which is utf8) is latin1. How can we switch to UTF-8?
Some people say that the option default-character-set=utf8 can be added to the [mysqld] section of my.cnf, but my experiments show that this is completely unnecessary. Here is the result after adding the option:
mysql> status;
--------------
mysql  Ver 14.7 Distrib 4.1.9, for pc-linux-gnu (i686)
Connection id:          62
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.1.11-beta-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    latin1
UNIX socket:            /tmp/mysql.sock
Uptime:                 1 hour 6 min 18 sec
Threads: 2  Questions: 1071  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 75  Queries per second avg: 0.269
--------------
mysql> show variables like 'character_set_%';
+--------------------------+---------------------------------------------------------------------------+
| Variable_name            | Value                                                                     |
+--------------------------+---------------------------------------------------------------------------+
| character_set_client     | latin1                                                                    |
| character_set_connection | latin1                                                                    |
| character_set_database   | utf8                                                                      |
| character_set_filesystem | binary                                                                    |
| character_set_results    | latin1                                                                    |
| character_set_server     | utf8                                                                      |
| character_set_system     | utf8                                                                      |
| character_sets_dir       | /usr/local/src/mysql-5.1.11-beta-linux-i686-glibc23/share/mysql/charsets/ |
+--------------------------+---------------------------------------------------------------------------+
8 rows in set (0.00 sec)
mysql> show variables like 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | utf8_general_ci   |
+----------------------+-------------------+
3 rows in set (0.01 sec)
As the output above shows, changing the default character set in the configuration file does not change client, connection, or results to utf8. Besides, we can set a database's character set when creating it with this command:
CREATE DATABASE `database` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
This way every table and field in the database uses UTF-8, so the change in the configuration file becomes pointless.
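Why the connection character set matters even when the database itself is utf8 can be shown without a server. The Python sketch below is only an illustration: it uses Python's latin-1 codec (ISO-8859-1) as a stand-in for MySQL's latin1, which is an assumption, and the sample text is made up.

```python
text = "中文"

# The bytes a UTF-8 client actually sends.
utf8_bytes = text.encode("utf-8")

# If the connection is still treated as latin1, every byte is read as one character.
misread = utf8_bytes.decode("latin-1")

print(len(text), len(misread))   # 2 characters are seen as 6
print(misread == text)           # False: the text is garbled

# Storing the misread string in a utf8 column encodes it a second time.
double_encoded = misread.encode("utf-8")
print(len(utf8_bytes), len(double_encoded))
```

The data is only handled correctly when the client, the connection, and the results all agree with the bytes the client really sends.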
B. My solution.
We can ignore what the default character set of the database server is. We only need to care about whether a character set was chosen when the database was created.
(1) Create the database with the following command:
CREATE DATABASE `database` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
(2) The PHP program on the client side sets the character set used for the connection as follows:
Before querying the database, the PHP program runs mysql_query("SET NAMES utf8;");
Example:
<?php

// The mysql_* functions were removed in PHP 7; current PHP needs mysqli or PDO instead.
mysql_connect('localhost', 'user', 'password');
mysql_select_db('my_db');

// Note that this step is critical. Without it, all data reads and writes will be incorrect.
// It sets the default character set for data transfer over this database connection.
mysql_query("SET NAMES utf8;");

// gb2312 (the local encoding) must be converted to UTF-8. The iconv() function can also be used.
mysql_query(mb_convert_encoding("INSERT INTO my_table VALUES ('test');", "UTF-8", "GB2312"));

?>
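The conversion step in the example can be tried on its own. The following Python sketch does roughly what mb_convert_encoding($s, "UTF-8", "GB2312") does in the PHP code above. The helper name gb2312_to_utf8 and the sample statement are made up for illustration.

```python
def gb2312_to_utf8(raw: bytes) -> bytes:
    """Re-encode GB2312 bytes as UTF-8 bytes."""
    return raw.decode("gb2312").encode("utf-8")


# A statement as it would exist in a locally GB2312-encoded script.
local_bytes = "INSERT INTO my_table VALUES ('测试');".encode("gb2312")

converted = gb2312_to_utf8(local_bytes)
print(converted.decode("utf-8"))
# Each Chinese character takes 2 bytes in GB2312 and 3 bytes in UTF-8.
print(len(local_bytes), len(converted))
```

If the script file is already saved as UTF-8, this conversion is not needed, and applying it anyway would corrupt the text.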
Note: cy163 adopts this method (3 ).
(3) If you want to use gb2312 encoding, it is recommended to use latin1 as the default character set of the data table. That way you can insert Chinese text directly in the command-line tool and it is displayed correctly. Do not use the gb2312 or gbk character sets. If you are worried about query sorting and similar issues, you can use the BINARY attribute, for example:
CREATE TABLE my_table (name VARCHAR(20) BINARY NOT NULL DEFAULT '') ENGINE=MyISAM DEFAULT CHARSET latin1;
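The reason a latin1 table can hold gb2312 data is that a single-byte character set accepts the bytes unchanged. A rough Python illustration follows. It assumes Python's latin-1 codec (ISO-8859-1) behaves like MySQL's latin1 for this purpose, which is an approximation.

```python
original = "中文"
gb_bytes = original.encode("gb2312")

# A single-byte charset maps every byte to one character, so nothing is rejected or altered.
stored = gb_bytes.decode("latin-1")
returned = stored.encode("latin-1")

print(returned == gb_bytes)         # the bytes come back unchanged
print(returned.decode("gb2312"))    # a gb2312 client can still read them
print(len(original), len(stored))   # but the server counts 4 "characters", not 2
```

One consequence is that the server counts bytes as characters, so a VARCHAR(20) latin1 column holds at most 10 two-byte gb2312 characters, and string functions work on bytes rather than on Chinese characters.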
This concludes the complete example of using the utf8 character set.
III. Upgrading old data
(1) Export the database:
mysqldump -uroot -p123456 --default-character-set=latin1 --set-charset=utf8 --opt olddatabase > newdatabase.sql
(2) Edit newdatabase.sql and add one SQL statement at the beginning of the file: "SET NAMES utf8;", then save it.
(3) Import the file: mysql -hlocalhost -uroot my_db < newdatabase.sql
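Step (2) can be done by hand in an editor, but for a large dump a small script may be more convenient. The sketch below is one possible way to do it in Python. The helper name prepend_set_names is made up, and the file is handled as raw bytes so that no assumption is made about the dump's encoding.

```python
import os
import tempfile


def prepend_set_names(path: str, charset: str = "utf8") -> None:
    """Insert 'SET NAMES <charset>;' as the first line of a dump file, once."""
    header = f"SET NAMES {charset};\n".encode("ascii")
    with open(path, "rb") as f:
        body = f.read()
    if body.startswith(header):
        return  # already present, nothing to do
    with open(path, "wb") as f:
        f.write(header + body)


# Demonstration on a throwaway file standing in for newdatabase.sql.
with tempfile.TemporaryDirectory() as tmp:
    dump = os.path.join(tmp, "newdatabase.sql")
    with open(dump, "wb") as f:
        f.write(b"CREATE TABLE t (id INT);\n")
    prepend_set_names(dump)
    prepend_set_names(dump)  # a second call leaves the file unchanged
    with open(dump, "rb") as f:
        result = f.read()

print(result.decode("ascii"))
```

This version reads the whole file into memory, which is fine for small dumps. A very large dump would be better copied to a new file in chunks.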