Handling Emoji in Strings and in MySQL

Source: Internet
Author: User

I learned this the hard way: a bug caused by emoji hit our project twice. There are always some mischievous users who like to put emoji into their input. The first time, we fixed it by changing the table to utf8mb4; the second time, that was not enough and the problem was hard to solve. After searching the Internet for half a day, I found that many people copy other people's code and post it online without testing it, which cost me time when I used it directly. My last attempt finally solved the problem and prevented further errors. Let's summarize.

Our MySQL databases commonly use the utf8 character set, whose default collation is utf8_general_ci. This character set only supports encodings of 1 to 3 bytes per character, which is fine for letters and Chinese characters. Most emoji sent from mobile phones, however, cannot be stored, because they take 4 bytes in UTF-8.
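The byte counts above can be checked directly in Java. The following is only a minimal sketch; the class name Utf8ByteLength and its two helper methods are hypothetical names chosen for illustration, not part of any library.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8ByteLength {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the string contains a code point above U+FFFF,
    // which needs 4 bytes in UTF-8 and does not fit MySQL's 3-byte utf8.
    static boolean needsFourBytes(String s) {
        return s.codePoints().anyMatch(cp -> cp > 0xFFFF);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));                 // Latin letter: 1
        System.out.println(utf8Length("\u4E2D"));            // Chinese character U+4E2D: 3
        System.out.println(utf8Length("\uD83D\uDE00"));      // emoji U+1F600: 4
        System.out.println(needsFourBytes("\uD83D\uDE00"));  // true
    }
}
```

A check like needsFourBytes could also be used to reject or log such input before it reaches the database.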

Here are some of the solutions:

First, modify the database character set:

The hard requirement for this method is MySQL 5.5.3 or later, the first release that supports utf8mb4. If you have a database management tool, you just need to open it and make the change. For example, in HeidiSQL you can change the table's character set to utf8mb4, and you can also adjust the default character set.

This method is simple and easy to use, but you may need to restart the database. There is another problem: it does not always work. The first time I used it, it solved the problem perfectly; the second time, it did nothing. One possible cause is that the client connection or an individual column is still using utf8 rather than utf8mb4. Therefore, I do not recommend this method.
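For reference, the table change that a tool such as HeidiSQL performs corresponds to a MySQL ALTER TABLE ... CONVERT TO CHARACTER SET statement. The sketch below only builds that statement as a string; the class name Utf8mb4Sql, the method convertTable, the table name in the usage line, and the choice of the utf8mb4_unicode_ci collation are all assumptions made for illustration.

```java
// Hypothetical helper, for illustration only: builds the MySQL statement
// that converts an existing table and its text columns to utf8mb4.
final class Utf8mb4Sql {

    static String convertTable(String tableName) {
        // Allow only simple identifiers so the name can be embedded safely.
        if (tableName == null || !tableName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected table name: " + tableName);
        }
        // utf8mb4_unicode_ci is an assumed collation; pick the one your project needs.
        return "ALTER TABLE `" + tableName + "` "
                + "CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci";
    }

    public static void main(String[] args) {
        // "user_comment" is a made-up table name.
        System.out.println(convertTable("user_comment"));
    }
}
```

The resulting statement would still have to be executed against the database, and the client connection must also use utf8mb4 for 4-byte characters to arrive intact.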

Second, filter out the emoji.

Since the database cannot store them, simply filter the emoji out. This makes things more convenient for the service at the cost of the user's self-expression. Many websites currently do this; after all, efficiency is key. Even if you stored the emoji, you might never use them again.

There are many pitfalls in this kind of filtering. For example, I tried this code many times:

I once firmly believed that this was the code closest to a correct emoji filter, and that a small change would fix it if it did not work. However, after many attempts it still treated all letters and Chinese characters as emoji and filtered them out, so the problem remained unsolved. Ah, I was still too inexperienced.

With no other option, I went looking for other code. The one that works, and the answer I recommend most, is:

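// Assumes Apache Commons Lang 3 is on the classpath for StringUtils.
import org.apache.commons.lang3.StringUtils;

/**
 * Replace emoji.
 *
 * @param source  original String
 * @param slipStr the String that replaces each emoji
 * @return filtered String
 */
public static String filterEmoji(String source, String slipStr) {
    if (StringUtils.isNotBlank(source)) {
        // Matches supplementary code points (U+10000 to U+10FFFF) and any unpaired surrogates.
        return source.replaceAll("[\ud800\udc00-\udbff\udfff\ud800-\udfff]", slipStr);
    } else {
        return source;
    }
}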


I recommend wrapping this in a utility method. It is convenient and practical, and it passed my tests.
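One way such a utility method might look without the Commons Lang dependency is sketched below. It is a variation on the method above, not the original code: the class name EmojiFilter is hypothetical, the pattern is written with \x{...} code point escapes, and the replacement is passed through Matcher.quoteReplacement so that a "$" or "\" in it is taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical utility class, for illustration only.
final class EmojiFilter {

    // Supplementary code points (4 bytes in UTF-8) plus any unpaired surrogates.
    private static final Pattern FOUR_BYTE =
            Pattern.compile("[\\x{10000}-\\x{10FFFF}\\ud800-\\udfff]");

    private EmojiFilter() {
    }

    // Replaces every matched character in source with replacement.
    static String filter(String source, String replacement) {
        if (source == null || source.isEmpty()) {
            return source;
        }
        return FOUR_BYTE.matcher(source)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Letters and Chinese characters are kept; the emoji U+1F600 is removed.
        System.out.println(filter("Hello \u4F60\u597D \uD83D\uDE00", ""));
    }
}
```

Note that this approach, like the method above, only removes characters outside the Basic Multilingual Plane. Symbols such as U+263A, which some devices render as emoji, take 3 bytes in UTF-8 and are left in place.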

