Original: http://blog.csdn.net/woslx/article/details/49685111
The UTF-8 encoding may be 2 bytes, 3 bytes, 4 bytes of characters, but the UTF8 encoding of MySQL only supports 3 bytes of data, and the expression data on the mobile side is a 4-byte character. If you insert emoticons directly into a UTF-8 encoded database, the SQL exception will be reported in the Java program:
Java.sql.SQLException: Incorrect string value: ' \xf0\x9f\x92\x94 ' for column ' name ' at row 1
At Com.mysql.jdbc.SQLError.createSQLException (sqlerror.java:1073)
At Com.mysql.jdbc.MysqlIO.checkErrorPacket (mysqlio.java:3593)
At Com.mysql.jdbc.MysqlIO.checkErrorPacket (mysqlio.java:3525)
At Com.mysql.jdbc.MysqlIO.sendCommand (mysqlio.java:1986)
At Com.mysql.jdbc.MysqlIO.sqlQueryDirect (mysqlio.java:2140)
At Com.mysql.jdbc.ConnectionImpl.execSQL (connectionimpl.java:2620)
At Com.mysql.jdbc.StatementImpl.executeUpdate (statementimpl.java:1662)
At Com.mysql.jdbc.StatementImpl.executeUpdate (statementimpl.java:1581)
The 4-byte characters can be encoded and stored, and then decoded when they are taken out. But doing so will encode and decode any place where the character is used.
UTF8MB4 encoding is a UTF8 encoded superset, compatible with UTF8, and can store 4-byte emoji characters.
The advantage of using UTF8MB4 encoding is: when storing and retrieving data, it is no longer necessary to consider the encoding and decoding of the expression characters.
Change the encoding of the database to UTF8MB4:
1. mysql version
The minimum MySQL version of UTF8MB4 supports version 5.5.3+, if not, upgrade to a newer version.
2. mysql Driver
5.1.34 available, minimum not less than 5.1.13
3. Modify the MySQL configuration file
Modify MySQL configuration file my.cnf (Windows My.ini)
MY.CNF generally in the etc/mysql/my.cnf position. When you find it, add the following three sections:
[Client]
Default-character-set = Utf8mb4
[MySQL]
Default-character-set = Utf8mb4
[Mysqld]
Character-set-client-handshake = FALSE
Character-set-server = Utf8mb4
Collation-server = Utf8mb4_unicode_ci
init_connect= ' SET NAMES utf8mb4 '
4. Restart the database and check the variables
SHOW VARIABLES WHERE variable_name like ' character_set_% ' OR variable_name like ' collation% ';
variable_name |
Value |
Character_set_client |
Utf8mb4 |
Character_set_connection |
Utf8mb4 |
Character_set_database |
Utf8mb4 |
Character_set_filesystem |
Binary |
Character_set_results |
Utf8mb4 |
Character_set_server |
Utf8mb4 |
Character_set_system |
Utf8 |
Collation_connection |
Utf8mb4_unicode_ci |
Collation_database |
Utf8mb4_unicode_ci |
Collation_server |
Utf8mb4_unicode_ci |
Collation_connection, Collation_database, Collation_server is okay.
But it must be ensured
System Variables |
Description |
Character_set_client |
(the character set used by the client source data) |
Character_set_connection |
(Connection layer Character set) |
Character_set_database |
(the default character set for the currently selected database) |
Character_set_results |
(query result character set) |
Character_set_server |
(The default internal operation character set) |
These variables must be UTF8MB4.
5. Configuration of database connections
Database connection parameters:
Characterencoding=utf8 will be automatically recognized as UTF8MB4, or can not add this parameter, will be automatically detected.
And the autoreconnect=true must be added.
6. Convert the database and the already built tables to UTF8MB4
Change database encoding: Alter DB caitu99 CHARACTER SET utf8mb4
COLLATE utf8mb4_general_ci
;
Change table encoding: alter-table TABLE_NAME
CONVERT to CHARACTER SET utf8mb4
COLLATE utf8mb4_general_ci
;
If necessary, you can also change the encoding of the column
Change the MySQL database encoding to UTF8MB4