Saving emoji to a MySQL database that uses the utf8 character set raises an error:
Incorrect string value: '\xF0\x9F\x98\x84\xF0\x9F...' for column 'content'
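The bytes in that message explain the failure: they are the UTF-8 encoding of an emoji, which takes 4 bytes, while MySQL's legacy utf8 character set (utf8mb3) stores at most 3 bytes per character. The following Python sketch decodes the bytes from the error message and checks whether a string would fit in a 3-byte column. The helper names `utf8_width` and `fits_legacy_utf8` are made up for this illustration.

```python
# The first four bytes quoted in the error message, decoded as UTF-8.
raw = b"\xf0\x9f\x98\x84"
char = raw.decode("utf-8")


def utf8_width(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_legacy_utf8(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8,
    which is the limit of MySQL's legacy utf8 (utf8mb3) character set."""
    return all(utf8_width(ch) <= 3 for ch in text)


if __name__ == "__main__":
    print(hex(ord(char)), utf8_width(char))   # 0x1f604 4
    print(fits_legacy_utf8("hello 中文"))      # True
    print(fits_legacy_utf8("hello " + char))  # False
```

A check like this could run in the application before an insert, to decide whether a value is safe for a column that has not been converted yet.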
Solution to the error:
4-byte Unicode characters aren't yet widely used, so not every application out there fully supports them. MySQL 5.5 works fine with 4-byte characters when properly configured; check whether your other components can work with them as well.

Here are a few other things to check:

Make sure all your tables' default character sets and text fields are converted to utf8mb4, in addition to setting the client and server character sets, e.g.

    ALTER TABLE mytable charset=utf8mb4,
        MODIFY COLUMN textfield1 VARCHAR(255) CHARACTER SET utf8mb4,
        MODIFY COLUMN textfield2 VARCHAR(255) CHARACTER SET utf8mb4;

and so on. If your data is already in the utf8 character set, it should convert to utf8mb4 in place without any problems. As always, back up your data before trying!

Also make sure your app layer sets its database connections' character set to utf8mb4. Double-check that this is actually happening: if you're running an older version of your chosen framework's MySQL client library, it may not have been compiled with utf8mb4 support and it won't set the charset properly. If not, you may have to update it or compile it yourself.

When viewing your data through the mysql client, make sure you're on a machine that can display emoji, and run SET NAMES utf8mb4 before running any queries.

Once every level of your application can support the new characters, you should be able to use them without any corruption.
In summary: change the table structure to a character set that supports 4-byte Unicode characters (utf8mb4), and make the database connection use the same character set. This approach has proven to work.
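When a table has many text columns, the ALTER TABLE statement quoted above can be generated rather than typed by hand. The sketch below is one way to do that in Python. The function name `build_utf8mb4_alter` and its arguments are hypothetical, and it only builds the SQL string, it does not connect to a database. Note that MODIFY COLUMN restates the whole column definition, so each type string should carry any attributes (such as NOT NULL or a default) that the column already has.

```python
import re
from typing import Mapping

# Conservative whitelist for table and column names used in this sketch.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_utf8mb4_alter(table: str, columns: Mapping[str, str]) -> str:
    """Build one ALTER TABLE statement that switches the table default and
    the given text columns to utf8mb4.

    `columns` maps a column name to its type, e.g.
    {"textfield1": "VARCHAR(255)"}. The type strings are inserted as given
    and must come from trusted input.
    """
    for name in (table, *columns):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    clauses = ["CHARSET=utf8mb4"]
    clauses += [
        f"MODIFY COLUMN `{name}` {col_type} CHARACTER SET utf8mb4"
        for name, col_type in columns.items()
    ]
    return f"ALTER TABLE `{table}` " + ", ".join(clauses) + ";"


if __name__ == "__main__":
    print(build_utf8mb4_alter(
        "mytable",
        {"textfield1": "VARCHAR(255)", "textfield2": "VARCHAR(255)"},
    ))
```

As the quoted answer says, back up the data before running the generated statement.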
If other parts of the stack cannot support these characters, consider removing them instead:
Since 4-byte UTF-8 sequences always start with the bytes 0xF0-0xF7, the following should work:

    $str = preg_replace('/[\xF0-\xF7].../s', '', $str);

Alternatively, you could use preg_replace in UTF-8 mode, but this will probably be slower:

    $str = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $str);

This works because 4-byte UTF-8 sequences are used for code points in the supplementary Unicode planes, starting from 0x10000.
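The same two approaches can be sketched in Python for readers whose application layer is not PHP. This is an illustration under assumptions, not a port of any library code: the function names are made up, and the byte-level pattern is slightly stricter than the PHP one because it requires three continuation bytes (0x80-0xBF) after the lead byte instead of any three bytes.

```python
import re

# Code points in the supplementary planes (U+10000 and above); these are
# exactly the characters that need 4 bytes in UTF-8.
_SUPPLEMENTARY = re.compile("[\U00010000-\U0010FFFF]")

# Byte-level equivalent: a lead byte 0xF0-0xF7 followed by three
# continuation bytes.
_FOUR_BYTE_SEQ = re.compile(rb"[\xF0-\xF7][\x80-\xBF]{3}")


def strip_supplementary(text: str) -> str:
    """Remove every character outside the Basic Multilingual Plane."""
    return _SUPPLEMENTARY.sub("", text)


def strip_four_byte_sequences(data: bytes) -> bytes:
    """Remove 4-byte sequences from UTF-8 encoded bytes."""
    return _FOUR_BYTE_SEQ.sub(b"", data)


if __name__ == "__main__":
    smiley = "\U0001F604"
    print(strip_supplementary("hi" + smiley + "!"))  # hi!
    cleaned = strip_four_byte_sequences(("中文" + smiley).encode("utf-8"))
    print(cleaned.decode("utf-8"))  # 中文
```

Both functions leave 1-, 2- and 3-byte characters untouched, so ordinary text, including Chinese, passes through unchanged.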