How to make a MySQL database support emoji storage

Source: Internet
Author: User

From: http://www.cnblogs.com/janehoo/archive/2016/04/06/5359800.html


Solution: change the character set from utf8 to utf8mb4.

Last week a developer reported a problem: a front-end application that crawls Weibo posts failed to insert a few rows every day. The application log showed:

java.sql.SQLException: Incorrect string value: '\xf0\x9f\x92\xaa',...' for column 'raw_json' at row 1, exception: org.springframework.jdbc.UncategorizedSQLException:

The content of the raw_json field showed that the inserted value contained emoji, so this was almost certainly a character-encoding problem. Our database used the utf8 character set. Ordinary text, including Chinese characters, takes at most 3 bytes per character, so MySQL's utf8 is normally enough. Emoji sent from mobile clients, however, take 4 bytes per character, which utf8 cannot store. To avoid emoji problems, we recommend that MySQL databases serving mobile applications adopt the utf8mb4 character set from the start; it is an important technology choice in the mobile internet industry.
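As a quick check outside the database, the bytes quoted in the error message can be decoded directly. The following is a minimal Python sketch (Python is used here only for illustration, and the variable names are not from the original application) showing that \xf0\x9f\x92\xaa is a single character that occupies four bytes in UTF-8:

```python
# Bytes taken from the "Incorrect string value" error message above.
raw = b"\xf0\x9f\x92\xaa"

# Decode them as UTF-8 to see which character MySQL rejected.
text = raw.decode("utf-8")

print(len(text))       # number of characters
print(len(raw))        # number of bytes in UTF-8
print(hex(ord(text)))  # Unicode code point of the character
```

The code point is above U+FFFF, which is beyond what a 3-byte UTF-8 sequence can represent.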

About utf8 and utf8mb4:

UTF-8: Unicode Transformation Format, 8-bit. It permits a BOM but usually does not include one. It is a multi-byte encoding designed for international text: English characters take 8 bits (one byte) and Chinese characters typically take 24 bits (three bytes). UTF-8 can represent the characters used in every country, which makes it a highly versatile international encoding. UTF-8 text displays correctly in any browser that supports UTF-8. For example, a UTF-8 page can show Chinese in an English-language IE without the user downloading IE's Chinese language pack.

utf8mb4: MySQL added the utf8mb4 character set in version 5.5.3. "mb4" stands for "most bytes 4", meaning up to four bytes per character; it exists specifically to support four-byte Unicode characters.
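The difference between the two character sets can be illustrated with a small sketch. The helper names `utf8_length` and `needs_utf8mb4` are hypothetical, introduced here for illustration only. The check relies on the fact that UTF-8 uses four bytes exactly for code points above U+FFFF:

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """Return True if text contains a character that takes 4 bytes in UTF-8.

    Such characters have code points above U+FFFF and do not fit in
    MySQL's 3-byte utf8 character set.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


# A Latin letter, a Chinese character, and an emoji.
for ch in ["A", "\u4e2d", "\U0001F4AA"]:
    print(repr(ch), utf8_length(ch), needs_utf8mb4(ch))
```

An application could run a check like this before inserting, for example to log which values a utf8 column would reject.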

With the cause identified, on to the fix. Since only the raw_json column contains emoji, first try changing the character set of that column alone:

ALTER TABLE xx_pnl_weibo_usershow
  MODIFY `raw_json` varchar(3000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

After this change, the test insert still failed.

Next, try changing the character set of the table:

ALTER TABLE xx_pnl_weibo_usershow CHARSET=utf8mb4;  -- change the table's default character set
ALTER TABLE xx_pnl_weibo_usershow CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;  -- change the character set of the table and its columns

After this change, the test insert still failed.

Finally, change the character set at the database server level:

vi /etc/my.cnf

[client]
default-character-set=utf8mb4

[mysqld]
character-set-server = utf8mb4

service mysqld restart
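Before restarting, it can help to confirm that both settings are actually present in the file. The sketch below is one possible way to do that with Python's standard `configparser`; the function name `charset_settings` and the embedded sample text are assumptions made for illustration. A real my.cnf may contain directives (such as include lines) that `configparser` does not understand, so treat this as a rough check only:

```python
import configparser

# Sample content mirroring the settings above; in practice the text would be
# read from /etc/my.cnf (the path used in this article).
SAMPLE_MY_CNF = """
[client]
default-character-set=utf8mb4

[mysqld]
character-set-server = utf8mb4
"""


def charset_settings(cnf_text: str) -> dict:
    """Extract the client and server character sets from my.cnf-style text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare options such as "skip-name-resolve"
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat "%" in values literally
    )
    parser.read_string(cnf_text)
    return {
        "client": parser.get("client", "default-character-set", fallback=None),
        "server": parser.get("mysqld", "character-set-server", fallback=None),
    }


settings = charset_settings(SAMPLE_MY_CNF)
print(settings)
```

A value of `None` for either key would indicate that the corresponding setting is missing from the file.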

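After the restart, the test insert succeeded. This suggests the earlier failures came from the connection character set rather than the column definition: the client connection must also use utf8mb4 for 4-byte characters to reach the server intact. To view the inserted data: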

zx_prod > select id, raw_json, hex(raw_json), length(raw_json) from xx_pnl_weibo_usershow where id=2;

+----+----------+----------------------------------+------------------+
| id | raw_json | hex(raw_json)                    | length(raw_json) |
+----+----------+----------------------------------+------------------+
|  2 | ????     | f09f9894f09f9882f09f9898f09f9894 |               16 |
+----+----------+----------------------------------+------------------+
1 row in set (0.00 sec)
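The hex value in the result can be decoded to confirm that the emoji were stored intact even though the client displays them as question marks. A minimal Python sketch follows (variable names are illustrative). MySQL's LENGTH() counts bytes, so 16 bytes should correspond to four 4-byte characters:

```python
# Hex string copied from the hex(raw_json) column in the query result above.
stored_hex = "f09f9894f09f9882f09f9898f09f9894"

stored_bytes = bytes.fromhex(stored_hex)
text = stored_bytes.decode("utf-8")

print(len(stored_bytes))              # byte length, comparable to length(raw_json)
print(len(text))                      # number of characters
print([hex(ord(ch)) for ch in text])  # code point of each character
```

All four code points lie above U+FFFF, so each one is a 4-byte character that only utf8mb4 can hold.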

Because this is a live production system, test the change in a test environment first. Then apply it to production during an off-peak maintenance window.

Emoji code reference:

http://punchdrunker.github.io/iOSEmoji/table_html/flower.html
