How to store emoji in MySQL

Source: Internet
Author: User
Tags mysql client

Saving emoji to a database that uses the utf8 character set fails with an error:

Incorrect string value: '\xF0\x9F\x98\x84\xF0\x9F...' for column 'content'
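
The bytes in that message are worth decoding. The following is a small Python sketch, used here only for illustration (the helper name `needs_utf8mb4` is made up for this example). It shows that the first four bytes form a single emoji whose UTF-8 encoding is four bytes long, which is more than MySQL's legacy utf8 character set (at most three bytes per character) can store:

```python
# The first four bytes from the error message above.
raw = b"\xf0\x9f\x98\x84"

# Decoding them as UTF-8 yields exactly one character.
char = raw.decode("utf-8")
print(len(char), hex(ord(char)))  # 1 0x1f604


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character needs 4 bytes in UTF-8.

    Hypothetical helper: such characters lie above U+FFFF and do not
    fit into MySQL's 3-byte utf8 character set.
    """
    return any(ord(ch) > 0xFFFF for ch in text)
```

A check like this could be run on user input before an INSERT to find out whether a given string would trigger the error on a utf8 column.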




Solution to the error:


4-byte Unicode characters aren't yet widely used, so not every application out there fully supports them. MySQL 5.5 works fine with 4-byte characters when properly configured; check whether your other components can work with them as well.

Here are a few other things to check:

Make sure all your tables' default character sets and text fields are converted to utf8mb4, in addition to setting the client and server character sets, e.g.

ALTER TABLE mytable charset=utf8mb4, MODIFY COLUMN textfield1 VARCHAR(255) CHARACTER SET utf8mb4, MODIFY COLUMN textfield2 VARCHAR(255) CHARACTER SET utf8mb4;

and so on.

If your data is already in the utf8 character set, it should convert to utf8mb4 in place without any problems. As always, back up your data before trying!

Also make sure your app layer sets its database connections' character set to utf8mb4. Double-check that this is actually happening: if you're running an older version of your chosen framework's MySQL client library, it may not have been compiled with utf8mb4 support and it won't set the charset properly. If not, you may have to update it or compile it yourself.

When viewing your data through the mysql client, make sure you're on a machine that can display emoji, and run SET NAMES utf8mb4 before running any queries.

Once every level of your application can support the new characters, you should be able to use them without any corruption.




In summary: change the table structure to support 4-byte Unicode characters and use the same character set (utf8mb4) for the database connection. This has proven to work.
If other parts of the system do not support these characters, consider removing them:

Since 4-byte UTF-8 sequences always start with the bytes 0xF0-0xF7, the following should work:

$str = preg_replace('/[\xF0-\xF7].../s', '', $str);

Alternatively, you could use preg_replace in UTF-8 mode, but this will probably be slower:

$str = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $str);

This works because 4-byte UTF-8 sequences are used for code points in the supplementary Unicode planes, starting from 0x10000.
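
For readers working outside PHP, the same two approaches can be sketched in Python. This is only an illustration under stated assumptions: the function names `strip_4byte_bytes` and `strip_4byte_chars` are made up for this example, and the byte-level variant assumes its input is valid UTF-8, just as the byte-level pattern above does.

```python
import re

# Byte-level variant: a lead byte 0xF0-0xF7 followed by any three bytes.
# DOTALL lets "." match every byte value, including a newline byte.
_FOUR_BYTE_SEQ = re.compile(rb"[\xf0-\xf7]...", re.DOTALL)

# Character-level variant: code points in the supplementary planes,
# U+10000 through U+10FFFF.
_SUPPLEMENTARY = re.compile("[\U00010000-\U0010FFFF]")


def strip_4byte_bytes(data: bytes) -> bytes:
    """Remove 4-byte sequences from UTF-8 encoded bytes."""
    return _FOUR_BYTE_SEQ.sub(b"", data)


def strip_4byte_chars(text: str) -> str:
    """Remove supplementary-plane characters from a decoded string."""
    return _SUPPLEMENTARY.sub("", text)
```

Characters that need three bytes or fewer, such as most CJK text, pass through both functions unchanged; only emoji and other supplementary-plane characters are dropped.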