Incorrect string value: ' \xf0\x9f\x98\x84\xf0\x9f

Source: Internet
Author: User
Tags: mysql, version

Problem description: Messages crawled from Sina Weibo are saved to a MySQL database. The target column is a varchar with character encoding utf-8. Some inserts succeed and others fail with the error shown in the title.


Searching online, some people say it is an encoding problem and suggest switching the encoding or column type to GBK, UTF-8, BLOB and so on, but few give a detailed answer. The real cause of the error turned up on an English-language website (Link 1, Link 2).


Cause of the error: the error message contains the bytes 0xF0 0x9F 0x98 0x84, which form a 4-byte sequence in UTF-8 (see the UTF-8 encoding specification). Ordinary Chinese characters generally take no more than 3 bytes, so where do 4-byte characters come from? They are the emoji typed with smartphone input methods. Why does this cause an error? Because MySQL's utf8 is not full UTF-8: it can only store UTF-8 sequences that are 1 to 3 bytes long. To store 4-byte characters, the column must use utf8mb4. Before switching to utf8mb4, first make sure the MySQL version is not lower than 5.5.3.
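To make the byte counts concrete, here is a minimal Java sketch that prints the UTF-8 bytes of one Chinese character and of the emoji from the error message. The class and method names (Utf8WidthDemo, utf8Hex, needsUtf8mb4) are made up for this illustration and do not come from the original post.

```java
import java.nio.charset.StandardCharsets;

class Utf8WidthDemo {
    /** Returns the UTF-8 bytes of s as space-separated upper-case hex. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF because Java bytes are signed.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** True if s contains a code point above U+FFFF, which takes 4 bytes in UTF-8. */
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String chinese = "\u4E2D";      // U+4E2D, a common Chinese character
        String emoji = "\uD83D\uDE04";  // U+1F604, the emoji from the error message
        System.out.println(utf8Hex(chinese)); // E4 B8 AD
        System.out.println(utf8Hex(emoji));   // F0 9F 98 84
        System.out.println(needsUtf8mb4(chinese) + " " + needsUtf8mb4(emoji)); // false true
    }
}
```

A check like needsUtf8mb4 can also be run on crawled text before inserting it, to find out which rows would fail on a utf8 column.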


Solutions:

1 Use the utf8mb4 character set

To use this approach, upgrade MySQL first if the version is lower than 5.5.3, then change the character set of the affected columns to utf8mb4. If you connect to the database through Connector/J, the connection has to use utf8mb4 as well. Note that character_set_server is a MySQL server variable, so set character_set_server=utf8mb4 in the server configuration so that the connector can pick it up.

2 Define custom filtering rules that strip the 4-byte UTF-8 characters from the text or convert them to a representation of your choice.

Here is a test example that replaces each 4-byte character with 0000.


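// b_text holds the UTF-8 bytes of the text (a byte[]).
for (int i = 0; i < b_text.length; i++)
{
    // A lead byte of the form 11110xxx starts a 4-byte sequence.
    if ((b_text[i] & 0xF8) == 0xF0) {
        // Overwrite the sequence with ASCII '0' (0x30), staying in bounds.
        for (int j = 0; j < 4 && i + j < b_text.length; j++) {
            b_text[i + j] = 0x30;
        }
        i += 3; // skip the bytes that were just replaced
    }
}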
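The same filtering can also be done on the string before it is encoded, which avoids working with raw bytes. The following is only a sketch of that alternative, not the method from the original post. The class and method names (FourByteFilter, replaceSupplementary) are hypothetical, and the code assumes that every code point above U+FFFF should be replaced, since those are exactly the ones that take 4 bytes in UTF-8.

```java
class FourByteFilter {
    /**
     * Returns a copy of text in which every code point above U+FFFF
     * (4 bytes in UTF-8) is replaced by the given replacement string.
     */
    static String replaceSupplementary(String text, String replacement) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (Character.isSupplementaryCodePoint(cp)) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String weibo = "hi\uD83D\uDE04!"; // "hi", U+1F604, "!"
        System.out.println(replaceSupplementary(weibo, ""));     // hi!
        System.out.println(replaceSupplementary(weibo, "0000")); // hi0000!
    }
}
```

Passing an empty replacement drops the characters entirely, while passing "0000" gives the same result as the byte-level loop.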

