Understanding garbled characters in Java and MySQL

Source: Internet
Author: User

Recently, we used Tomcat and MySQL to build a Java web server and deployed our game server logic on it.

 

Shortly after the game launched, we found a large number of garbled characters in the database. This was a serious problem that had to be fixed immediately, but where was it coming from? By our analysis, the text could only be corrupted at one of two points:

1. When data is transmitted from the client to the server.

2. When the server stores data in the database.

 

After debugging, we found that the data printed by the server was correct, which rules out the first point. The corruption had to occur when the server stored the data in the database.
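One way to confirm that text is intact on the server is to log its bytes rather than trusting how a console or log viewer renders it. The sketch below is only an illustration: the class and method names (HexDump, utf8Hex) are hypothetical and not part of the original project.

```java
import java.nio.charset.StandardCharsets;

final class HexDump {
    // Returns the UTF-8 bytes of the text as space-separated lowercase hex.
    static String utf8Hex(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u4e2d is the Chinese character U+4E2D; intact text prints e4 b8 ad.
        System.out.println(utf8Hex("\u4e2d"));
        // Text that has already been degraded to a question mark prints 3f.
        System.out.println(utf8Hex("?"));
    }
}
```

If the bytes logged on the server match the expected UTF-8 sequence, the client-to-server hop can be ruled out.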

 

Since this pointed to a MySQL encoding problem, it seemed relatively easy to track down. First, run this command:

mysql> SHOW VARIABLES LIKE '%char%';

+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | utf8   |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+

This command shows the character sets MySQL is using. Seeing so many of them was overwhelming at first, so I worked through the parameters one by one.

• character_set_server: if no character set is specified when a database is created, the database uses the value of character_set_server as its default.

• character_set_database: if no character set is specified when a table is created, the table uses the value of character_set_database as its default.

• character_set_client: the encoding of the data sent by the MySQL client.

• character_set_connection: after the MySQL server receives data from the client, it converts the data to the encoding given by character_set_connection.

• character_set_results: the encoding the MySQL server uses to return query results.
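The defaulting rules for character_set_server and character_set_database form a simple fallback chain, which can be modelled as below. This is a toy model of the rules as described here, not MySQL code; the class and method names are hypothetical, and null stands for "not specified in the CREATE statement".

```java
final class CharsetDefaults {
    // Database default: the charset given at CREATE DATABASE time,
    // else the value of character_set_server.
    static String databaseCharset(String explicitDatabase, String characterSetServer) {
        return explicitDatabase != null ? explicitDatabase : characterSetServer;
    }

    // Table default: the charset given at CREATE TABLE time,
    // else the database default (reported as character_set_database).
    static String tableCharset(String explicitTable, String databaseCharset) {
        return explicitTable != null ? explicitTable : databaseCharset;
    }

    public static void main(String[] args) {
        // Nothing specified anywhere: the table inherits the server default.
        String db = databaseCharset(null, "latin1");
        System.out.println(tableCharset(null, db));
        // An explicit table charset overrides the inherited one.
        System.out.println(tableCharset("utf8", db));
    }
}
```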

 

With these parameters understood, the situation looked less hopeless, so I walked through how data gets written to the database.

 

An insert through the MySQL client works as follows. The client sends the data the user entered, encoded as character_set_client. On receipt, the server converts it to character_set_connection, and then converts it again to the character set of the table for storage. (The table's character set is not among the variables listed above; if none was specified when the table was created, it defaults to character_set_database.)
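The three stages can be imitated in plain Java to see where data is lost. This is a simplified simulation, not what MySQL does internally. The names InsertPipeline, store and read are hypothetical, and ISO-8859-1 is used as a stand-in for MySQL's latin1, which is an assumption (MySQL's latin1 is closer to windows-1252, though neither can represent Chinese characters).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class InsertPipeline {
    // Re-encodes bytes from one charset to another, as the server does between stages.
    static byte[] convert(byte[] data, Charset from, Charset to) {
        return new String(data, from).getBytes(to);
    }

    // Simulates an insert: client encoding, connection conversion, table storage.
    static byte[] store(String input, Charset client, Charset connection, Charset table) {
        byte[] sent = input.getBytes(client);                 // encoded as character_set_client
        byte[] received = convert(sent, client, connection);  // converted to character_set_connection
        return convert(received, connection, table);          // converted to the table's charset
    }

    // Decodes stored bytes using the table's charset.
    static String read(byte[] stored, Charset table) {
        return new String(stored, table);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        Charset utf8 = StandardCharsets.UTF_8;
        Charset latin1 = StandardCharsets.ISO_8859_1; // stand-in for MySQL latin1

        // utf8 client and connection, utf8 table: the text survives.
        System.out.println(text.equals(read(store(text, utf8, utf8, utf8), utf8)));
        // utf8 client and connection, latin1 table: each character is stored as '?'.
        System.out.println(read(store(text, utf8, utf8, latin1), latin1));
    }
}
```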

 

Recall that when we created our tables, we did not specify a character set in the CREATE TABLE statements. By the rules above, the tables therefore use the character set given by character_set_database, that is, latin1. Since UTF-8 encoded Chinese text cannot be represented in latin1, the database replaces each character it cannot represent with '?', and this conversion is irreversible.
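The irreversibility is easy to see in Java: once two different strings have both been reduced to question marks, nothing distinguishes them. The sketch below also shows a pre-check with CharsetEncoder.canEncode that an application could run before writing. The class and method names are hypothetical, and ISO-8859-1 is again assumed as a stand-in for MySQL's latin1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossCheck {
    // True if every character of the text can be represented in the charset.
    static boolean fitsIn(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes and decodes with the same charset; unmappable characters come back as '?'.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1; // stand-in for MySQL latin1
        String chinese = "\u4e2d\u6587";
        String japan = "\u65e5\u672c";

        System.out.println(fitsIn(chinese, latin1)); // false: would be corrupted
        System.out.println(fitsIn("abc", latin1));   // true: safe to store
        // Two different inputs collapse to the same stored value.
        System.out.println(roundTrip(chinese, latin1).equals(roundTrip(japan, latin1)));
    }
}
```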

 

To verify this, I tried inserting Chinese characters into the table through the MySQL client, and the insert failed. I then changed the table's character set to UTF-8 to see whether Chinese text could be inserted, by executing the following statement:

mysql> ALTER TABLE `tablename` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

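With the table's character set changed to UTF-8, Chinese characters could now be inserted into the table. (Note that DEFAULT CHARACTER SET only changes the default for columns added later; existing columns keep their character set unless converted with ALTER TABLE ... CONVERT TO CHARACTER SET.)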

 

At that point I thought the problem was solved. But when we started Tomcat and tested again, the data in the database was still garbled. If the MySQL client could insert Chinese characters, why couldn't JDBC? Most likely, JDBC was not using character_set_client to encode the client's data.

 

After reading through a lot of documentation, we found that if the connection encoding is not specified through the characterEncoding property in the JDBC URL, the JDBC driver uses character_set_server as the connection encoding, which in our case was latin1. With the cause identified, the fix was straightforward.

 

Change the JDBC URL to jdbc:mysql://localhost/some_db?useUnicode=yes&characterEncoding=UTF-8. If you are using a Tomcat data source, replace '&' with '&amp;', because the URL is then written inside an XML configuration file.
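As a small illustration, the URL and its XML-escaped form can be built like this. The helper names (JdbcUrl, build, escapeForXml) are hypothetical; only the useUnicode and characterEncoding parameters come from the text above, and the escaping handles just the ampersand, which is all this URL needs.

```java
final class JdbcUrl {
    // Builds a MySQL JDBC URL that states the connection encoding explicitly.
    static String build(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=yes&characterEncoding=UTF-8";
    }

    // Escapes ampersands so the URL can be placed in an XML attribute,
    // for example in a Tomcat data source definition.
    static String escapeForXml(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = build("localhost", "some_db");
        System.out.println(url);               // form passed to the driver from Java code
        System.out.println(escapeForXml(url)); // form written into the XML file
    }
}
```

The unescaped form is what would be passed to DriverManager.getConnection from Java code; the escaped form belongs only in the XML configuration.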

 

Finally, after restarting the server, the database contained the correct characters.
