A site that migrates from gb2312 (gbk, big5) to utf8 can run into many problems. What if the site is too large to convert all at once? Then it has to be done step by step. If the data can be converted first while the front-end code is left largely unchanged, the whole process becomes much easier. After several days of testing, I found that MySQL can store data as utf8 and still output it as gbk. This relies on a MySQL feature: you can specify the character set used for the current client connection. By default the connection uses latin1, or whatever character set the MySQL server is configured with. I create the fields with the utf8_general_ci collation.
DB:
SQL code:
CREATE TABLE `table` (
`id` INT(10) NOT NULL,
`name` VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
INDEX (`id`)
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci;
PHP:
Write operations set the connection character set to utf8, and read operations set the connection character set to gbk.
PHP code:
<?php
// Select DB and set the connection to UTF-8
function _select_db_utf()
{
    mysql_select_db($this->db_name, $this->db_link);
    // Init character set
    mysql_query("SET NAMES utf8", $this->db_link);
    mysql_query("SET CHARACTER SET utf8", $this->db_link);
    mysql_query("SET COLLATION_CONNECTION = 'utf8_general_ci'", $this->db_link);
    return true;
}
// Select DB and set the connection to GBK
function _select_db_gb()
{
    mysql_select_db($this->db_name, $this->db_link);
    // Init character set
    mysql_query("SET NAMES gbk", $this->db_link);
    mysql_query("SET CHARACTER SET gbk", $this->db_link);
    mysql_query("SET COLLATION_CONNECTION = 'gbk_chinese_ci'", $this->db_link);
    return true;
}
?>
Notes:
1. MySQL must be compiled with the gbk, gb2312, utf8, and any other required character sets.
2. Make sure the imported data is correctly encoded as UTF-8.
3. For both write and read operations, set the correct connection character set.
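As a rough sketch of note 3, the read/write split could also be expressed with mysqli rather than the older mysql_* functions. The helper names `charset_for` and `use_charset_for` are made up for illustration and are not part of the original code; `mysqli::set_charset` is the real mysqli method for changing the connection character set.

```php
<?php
// Hypothetical helper: map an operation to the connection character set
// described in this article (write as utf8, read as gbk).
function charset_for(string $operation): string
{
    $map = ['write' => 'utf8', 'read' => 'gbk'];
    if (!isset($map[$operation])) {
        throw new InvalidArgumentException("Unknown operation: $operation");
    }
    return $map[$operation];
}

// Hypothetical helper: switch the character set of an already open
// mysqli connection before running queries for the given operation.
function use_charset_for(mysqli $link, string $operation): bool
{
    return $link->set_charset(charset_for($operation));
}
```

With an open connection `$link`, a page that still renders GBK would call `use_charset_for($link, 'read')` before its SELECT queries, and an importer would call `use_charset_for($link, 'write')` before inserting UTF-8 data.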
If the front-end code cannot write its data to the database as UTF-8, you need to transcode the characters first. (For example, if the data submitted via AJAX is already valid UTF-8, no conversion is required.)
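One possible way to decide whether submitted data needs transcoding is to check whether it is already valid UTF-8. The function name `ensure_utf8` and the assumption that any non-UTF-8 input is GBK are illustrative only. A short GBK string can occasionally also be valid UTF-8 by coincidence, so treat this check as a heuristic rather than a guarantee.

```php
<?php
// Hypothetical helper: return $input unchanged when it is already valid
// UTF-8, otherwise transcode it from the assumed legacy encoding.
function ensure_utf8(string $input, string $fallback = 'GBK'): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        // Already valid UTF-8 (for example, data submitted via AJAX).
        return $input;
    }
    // Assumption: anything that is not valid UTF-8 is in $fallback.
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}
```

Form handlers on legacy GBK pages could pass each submitted field through `ensure_utf8()` before writing it over the utf8 connection.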
mbstring has the broadest character support in PHP, and iconv is slightly behind it. Even mbstring does not fully handle the transcoding of some special characters, so there is no perfect transcoding method so far.
A closer comparison of mbstring and iconv:
mbstring:
1. Supports the widest range of characters.
2. Can detect the source encoding automatically, so you do not have to know it in advance, but it runs much more slowly than iconv.
3. $content = mb_convert_encoding($content, "UTF-8", "GBK,GB2312,BIG5"); (the result varies depending on the order of the encodings)
iconv:
1. Character support is slightly narrower than that of mbstring.
2. The source encoding must be known in advance, but when it is known, iconv runs faster than mb_convert_encoding.
3. $content = iconv("GBK", "UTF-8", $content);
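The comparison above suggests one possible division of labor: use iconv when the source encoding is known, and fall back to mbstring's detection only when it is not. The sketch below assumes both extensions are loaded; the function name `convert_to_utf8` and the default candidate list are hypothetical.

```php
<?php
// Hypothetical helper combining both approaches.
// $from:       the known source encoding, or null when it is unknown.
// $candidates: ordered list that mbstring tries when $from is null;
//              as noted above, the order affects the result.
function convert_to_utf8(string $content, ?string $from = null, string $candidates = 'UTF-8,GBK,BIG5')
{
    if ($from !== null) {
        // Source encoding is known: use the faster iconv.
        // iconv() returns false if the input is not valid in $from.
        return iconv($from, 'UTF-8', $content);
    }
    // Source encoding is unknown: let mbstring pick from the candidates.
    return mb_convert_encoding($content, 'UTF-8', $candidates);
}
```

For a bulk import where every row is known to be GBK, `convert_to_utf8($row, 'GBK')` takes the iconv path. For mixed user input, omitting the second argument takes the slower detection path.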