Summary of MySQL character set adjustment

Source: Internet
Author: User

(Reposted from http://blog.donews.com/yllr/archive/2007/01/03/1107388.aspx)

MySQL 4.1 and later versions support multiple character sets, but a default installation uses latin1 with the Swedish collation (latin1_swedish_ci). This causes trouble for the many people who want to install Discuz and other forum software that uses GBK, UTF-8, or other character sets. I recently found some material on the Internet and spent two days testing on a Linux server, which essentially solved the character set problem for me. I have written up my experience here for other users' reference.

I. Principles:

MySQL servers use character sets in six key locations: client, connection, database, results, server, and system.

A. client is the character set used by the client, comparable to the character set declared in a web page:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

B. connection is the character set used for the connection to the database. If PHP does not specify a character set when it connects, the server's default character set is used.
C. database is the character set used by a particular database on the server. If none is specified when the database is created, the character set specified when the server was installed is used.
D. results is the character set the database uses when returning data to the client. If not specified, the server's default character set is used.
E. server is the default character set specified when the server was installed.
F. system is the character set used by the database system itself.

system is generally utf8 by default, and server sits at the top of the hierarchy. If a database is not given its own character set, it uses the server character set, and the other settings follow the server setting as well. In addition, every table and column in a database has its own character set, which is determined by the level above it: a table takes its character set from the database, and a column takes its character set from the table.
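
The fallback chain described above can be modeled in a few lines of Python. This is only an illustrative sketch, not MySQL code: the function name effective_charset and its parameters are made up for this example.

```python
from typing import Optional


def effective_charset(server: str,
                      database: Optional[str] = None,
                      table: Optional[str] = None,
                      column: Optional[str] = None) -> str:
    """Return the character set a column ends up with.

    Each level falls back to the level above it when nothing is set
    explicitly: column -> table -> database -> server.
    """
    for explicit in (column, table, database):
        if explicit is not None:
            return explicit
    return server


# A latin1 server, but the database was created with utf8:
print(effective_charset("latin1", database="utf8"))  # utf8
# Nothing set below the server level:
print(effective_charset("latin1"))                   # latin1
```

The model also shows why a database created with an explicit character set is unaffected by the server default.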

II. Unifying the Character Set (using UTF-8 as the example, because it is currently the most widely supported character set)

1. A complete solution:

To solve the character set problem completely, choose UTF-8 as the character set when installing MySQL, so that all six key locations listed above use UTF-8.

A. On Windows, the MySQL installer prompts you to select a character set; choose UTF-8.

B. There are three ways to install MySQL on Linux. The first is to install the RPM package; I cannot comment on it because I have never used it.

The second is to install the precompiled binary distribution. This method cannot solve the character set problem completely, because the binaries were compiled with latin1 as the default character set. How to handle the character set in this case is discussed later.
The third is to compile and install from source. With this method the character set can be chosen at compile time, and it is the method described here.

When compiling MySQL, run configure as follows: ./configure --with-charset=utf8
The MySQL server is then built with utf8 as its default character set, so every database created in it is stored in utf8 and all related settings are UTF-8 as well.


2. Checking and setting the character set:

If your MySQL installation uses the default latin1 character set (it is odd that MySQL chose this as its default), the problem can still be solved as follows.

A. On a default installation, run the status command in the mysql command-line client. It displays the following information:

mysql> status
--------------
mysql  Ver 14.7 Distrib 4.1.9, for pc-linux-gnu (i686)

Connection id:          62
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.1.11-beta-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    latin1
Conn.  characterset:    latin1
UNIX socket:            /tmp/mysql.sock
Uptime:                 58 min 23 sec

Threads: 2  Questions: 1067  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 75  Queries per second avg: 0.305
--------------

You can also run show variables like 'character_set_%'; to see the following:

+--------------------------+---------------------------------------------------------------------------+
| Variable_name            | Value                                                                     |
+--------------------------+---------------------------------------------------------------------------+
| character_set_client     | latin1                                                                    |
| character_set_connection | latin1                                                                    |
| character_set_database   | latin1                                                                    |
| character_set_filesystem | binary                                                                    |
| character_set_results    | latin1                                                                    |
| character_set_server     | latin1                                                                    |
| character_set_system     | utf8                                                                      |
| character_sets_dir       | /usr/local/src/mysql-5.1.11-beta-linux-i686-glibc23/share/mysql/charsets/ |
+--------------------------+---------------------------------------------------------------------------+
8 rows in set (0.00 sec)

This shows that every character set except system, which is utf8, is latin1. How can we switch to UTF-8?
Some people suggest adding default-character-set=utf8 to the [mysqld] section of my.cnf, but my experiments showed this to be entirely unnecessary. Here is the result after adding that option:

mysql> status;
--------------
mysql  Ver 14.7 Distrib 4.1.9, for pc-linux-gnu (i686)

Connection id:          62
Current database:
Current user:           root@localhost
SSL:                    Not in use
Current pager:          stdout
Using outfile:          ''
Using delimiter:        ;
Server version:         5.1.11-beta-log
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    latin1
UNIX socket:            /tmp/mysql.sock
Uptime:                 1 hour 6 min 18 sec

Threads: 2  Questions: 1071  Slow queries: 0  Opens: 0  Flush tables: 1  Open tables: 75  Queries per second avg: 0.269
--------------

mysql> show variables like 'character_set_%';
+--------------------------+---------------------------------------------------------------------------+
| Variable_name            | Value                                                                     |
+--------------------------+---------------------------------------------------------------------------+
| character_set_client     | latin1                                                                    |
| character_set_connection | latin1                                                                    |
| character_set_database   | utf8                                                                      |
| character_set_filesystem | binary                                                                    |
| character_set_results    | latin1                                                                    |
| character_set_server     | utf8                                                                      |
| character_set_system     | utf8                                                                      |
| character_sets_dir       | /usr/local/src/mysql-5.1.11-beta-linux-i686-glibc23/share/mysql/charsets/ |
+--------------------------+---------------------------------------------------------------------------+
8 rows in set (0.00 sec)


mysql> show variables like 'collation_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | utf8_general_ci   |
+----------------------+-------------------+
3 rows in set (0.01 sec)
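
As a rough cross-check of the second character set listing, the following Python sketch lists which settings are still not utf8. The values are copied from that listing; the function name not_utf8 is made up for this example, and character_set_filesystem is skipped on the assumption that its binary value is intentional.

```python
# Values copied from the second "show variables" listing.
variables = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "utf8",
    "character_set_system": "utf8",
}


def not_utf8(settings, wanted="utf8", ignore=("character_set_filesystem",)):
    """Return the names of the settings whose value differs from `wanted`."""
    return sorted(name for name, value in settings.items()
                  if name not in ignore and value != wanted)


mismatched = not_utf8(variables)
print(mismatched)
# ['character_set_client', 'character_set_connection', 'character_set_results']
```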


From the above we can see that changing the default character set in the configuration file does not change client, connection, or results to utf8. Meanwhile, the character set of a database can be set directly with this command when the database is created:

 CREATE DATABASE `database` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

All tables and columns in that database will then use the utf8 character set by default, so the change in the configuration file becomes pointless.


B. My solution:

We can ignore the server's default character set entirely. All that matters is whether a character set was specified when the database was created.

(1) Use the following command to create the database:


CREATE DATABASE `database` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

(2) In the PHP program on the client side, set the character set used for the connection as follows:
Before the PHP program queries the database, execute mysql_query("SET NAMES utf8");

Example:
  <?php
  mysql_connect('localhost', 'user', 'password');
  mysql_select_db('my_db');

  // Note: this step is critical. Without it, data will not be read or written correctly.
  // It sets the character set used for data transfer over this database connection.
  mysql_query("SET NAMES utf8");

  // gb2312 (the local encoding) must be converted to UTF-8. The iconv() function can also be used.
  mysql_query(mb_convert_encoding("insert into my_table values ('test');", "UTF-8", "gb2312"));
  ?>
(3) If you want to use gb2312 encoding, it is recommended to use latin1 as the default character set of the table, so that data can be inserted and displayed directly in the command-line tool. Do not use the gb2312 or gbk character sets.
If you are concerned about sorting and similar issues, use the binary attribute, for example:
create table my_table (name varchar(20) binary not null default '') engine=MyISAM default charset latin1;
This concludes the complete example of using the utf8 character set.
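
The conversion in step (2), and the reason SET NAMES matters, can be illustrated without a database. The Python sketch below is only an analogy for what happens to the bytes on the wire. The sample string and the variable names are made up for this example.

```python
# Sample Chinese text (U+6D4B U+8BD5, "test"), chosen only for illustration.
text = "\u6d4b\u8bd5"

# Bytes as a gb2312 editor or terminal would produce them.
gb2312_bytes = text.encode("gb2312")

# Rough equivalent of mb_convert_encoding(..., "UTF-8", "gb2312"):
# decode from gb2312, then encode as UTF-8 before sending.
utf8_bytes = gb2312_bytes.decode("gb2312").encode("utf-8")

# If the receiving side assumes latin1 instead of utf8, every byte is
# taken as one character and the text no longer matches the original.
misread = utf8_bytes.decode("latin-1")

print(len(gb2312_bytes), len(utf8_bytes), len(misread))  # 4 6 6
print(misread == text)                                   # False
```

The two characters turn into six unrelated ones when the declared character set does not match the bytes actually sent, which is the kind of corruption that declaring the connection as utf8 avoids.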

III. Upgrading Old Data

(1) Export the database:

mysqldump -uroot -p123456 --default-character-set=latin1 --set-charset=utf8 --opt olddatabase > newdatabase.sql

(2) Edit newdatabase.sql, add the SQL statement "SET NAMES utf8;" at the beginning of the file, and save it.

(3) Import the file into the new database:

mysql -hlocalhost -uroot my_db < newdatabase.sql
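
Step (2) can also be scripted instead of being done in an editor. The following is a minimal Python sketch, assuming the dump file is small enough to read into memory. The function name prepend_set_names is made up for this example.

```python
from pathlib import Path


def prepend_set_names(dump_path, charset="utf8"):
    """Insert 'SET NAMES <charset>;' as the first line of an SQL dump file."""
    path = Path(dump_path)
    header = f"SET NAMES {charset};\n".encode("ascii")
    body = path.read_bytes()         # work on raw bytes so the dump is not re-encoded
    if not body.startswith(header):  # avoid adding the line twice
        path.write_bytes(header + body)
```

It would be called as prepend_set_names("newdatabase.sql") between the export and the import. Working on bytes rather than text is a deliberate choice here, so that the script never has to guess the encoding of the dump.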


 
