Handling emoji in PHP: converting between Unicode codes and UTF-8 bytes

Source: Internet
Author: User

What is emoji? Here is the explanation from Baidu Encyclopedia:

Emoji are pictographic symbols. The word comes from the Japanese 絵文字 (written in kana as "えもじ" and pronounced "emoji").

The creator of emoji is Shigetaka Kurita of Japan, who drew on elements of his childhood for inspiration, such as Japanese manga and kanji. "There are many different symbols in Japanese manga," he said. A cartoonist draws a few marks to show a person sweating, or a light bulb appearing over someone's head when an idea strikes. From kanji he took the ability to express abstract concepts such as "secret" and "love" with simple characters.

In August 2014, the online edition of the Oxford Dictionary added "emoji" as a new word, which also meant that it had become a formal term.

Emoji in WeChat


The emoji that pass through the WeChat interface are raw UTF-8 byte sequences that have not been decoded. When a message containing emoji arrives from a WeChat client, the emoji are therefore displayed as a box, a "?", or some other undisplayable character unless you convert them. Likewise, when you send a text message containing emoji to the WeChat server, the emoji must be converted to this same format. (Earlier versions of WeChat could display an emoji from a Unicode code sent as text, but that is no longer supported.)

Each emoji has a corresponding Unicode code. When parsing emoji that a user sends to an official account, we can match or store the emoji in the message by that code. Likewise, when sending a text message that contains emoji to a user, we convert each code into its binary form before sending. For the Unicode conversion I recommend the SoftBank version of the emoji encoding, writing for example "U+E04A" as "\ue04a", because that is the version WeChat matches. At the end of the article I attach my converted emoji Unicode table, which can be combined with the style sheets and images from Github.com/iamcal/php-emoji.
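As a small illustration of the notation change mentioned above, the following sketch converts "U+E04A" style notation into the "\ue04a" escape form. The helper name codepointToEscape is hypothetical and not part of the original article, and the sketch only handles four-digit values.

```php
<?php
// Hypothetical helper (not from the original article): turn "U+E04A"
// notation into the lowercase "\ue04a" escape form used in this article.
function codepointToEscape(string $notation): string
{
    // Accept "U+E04A" or "u+e04a"; only four-digit values are handled here.
    if (!preg_match('/^u\+([0-9a-f]{4})$/i', $notation, $m)) {
        throw new InvalidArgumentException("Unexpected notation: $notation");
    }
    // '\\u' inside single quotes is a literal backslash followed by "u".
    return '\\u' . strtolower($m[1]);
}

echo codepointToEscape('U+E04A'), "\n"; // \ue04a
```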

Here's how I handle it. The first part is parsing a message when it is received:

When you receive text that may contain emoji, you can simply pass it through json_encode($str). The emoji, Chinese characters, and other non-ASCII characters in the message are converted to \uXXXX Unicode escapes. (The point of the JSON encoding here is to obtain the Unicode code of each character, so do not pass the optional flag that suppresses Unicode escaping, JSON_UNESCAPED_UNICODE.)

For example, "你好 [emoji] Hello 123", where [emoji] stands for the SoftBank character U+E415, is encoded as "\u4f60\u597d \ue415 Hello 123".
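The encoding step can be checked with a short sketch. The sample string is illustrative, and the bytes "\xEE\x90\x95" are simply U+E415 written out as UTF-8, standing in for an emoji received from a client.

```php
<?php
// "\xEE\x90\x95" is U+E415 written as raw UTF-8 bytes, standing in for an
// emoji received from the client; the sample string is illustrative.
$text = "你好 \xEE\x90\x95 Hello 123";

// Default flags: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($text);
echo $escaped, "\n"; // "\u4f60\u597d \ue415 Hello 123" (including the quotes)

// With JSON_UNESCAPED_UNICODE the characters are left as they are, so no
// \ue415 escape appears in the output.
$raw = json_encode($text, JSON_UNESCAPED_UNICODE);
var_dump(strpos($raw, '\\ue415') === false); // bool(true)
```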

The \ue415 in this string is an emoji, so at this point a regular expression can pick out which escapes are emoji. My approach is to escape the backslash of each emoji escape and then run json_decode on the string. This restores every character other than emoji (Chinese and other characters are unaffected) and leaves only the emoji as literal \uXXXX text.

Other methods work too, such as replacing each emoji with a marker, for example turning "\ue415" into "[em:ue415]", similar to how QQ handles its emoticons. When text and emoji need to be displayed together, the markers are easy to match and render. You could also replace each emoji directly with an HTML img tag that shows the picture, but that is harder to maintain.
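The marker idea might look roughly like the sketch below. The function name markEmoji is hypothetical, and the range \ue000 to \uefff is the same rough range this article treats as emoji.

```php
<?php
// Sketch of the marker idea: rewrite each emoji escape in the JSON-encoded
// text as "[em:ueXXX]" before decoding. The function name is hypothetical.
function markEmoji(string $text): string
{
    $json = json_encode($text);
    // Match a literal backslash, "u", then a code in the range e000-efff.
    $json = preg_replace('#\\\\u(e[0-9a-f]{3})#i', '[em:u$1]', $json);
    return json_decode($json);
}

echo markEmoji("你好 \xEE\x90\x95 Hello 123"), "\n"; // 你好 [em:ue415] Hello 123
```

Because the marker contains only ASCII characters, it survives json_decode unchanged and can be stored as ordinary text.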

My approach is crude but simple: treat every character between \ue000 and \uefff as an emoji. So far I have not found any false positives:

$str = preg_replace_callback('#\\\\ue[0-9a-f]{3}#i', function ($m) { return addslashes($m[0]); }, $str);

The complete code for the entire process is as follows:

$text = "你好 \xEE\x90\x95 Hello 123"; // incoming WeChat message that may contain emoji as raw UTF-8 bytes (here U+E415)
$tmpStr = json_encode($text); // expose the Unicode escapes
$tmpStr = preg_replace_callback('#\\\\ue[0-9a-f]{3}#i', function ($m) { return addslashes($m[0]); }, $tmpStr); // keep the emoji escapes, leave everything else alone
$text = json_decode($tmpStr);

echo $text; // 你好 \ue415 Hello 123

You can then store the message. When you read it back for display on a page, do the character substitution and template rendering.

For the emoji style sheets and images, you can refer to this project: Github.com/iamcal/php-emoji
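The substitution at render time could be sketched as follows. The function name renderEmoji and the class names "emoji" and "emoji-e415" are placeholders chosen for illustration; they are not the names used by any particular style sheet, so adjust them to whatever CSS you actually use.

```php
<?php
// Sketch: turn stored "\ueXXX" text into HTML spans at render time.
// The class names "emoji" and "emoji-e415" are placeholders, not the
// names used by any particular style sheet.
function renderEmoji(string $stored): string
{
    // Escape the user text first so it is safe to embed in HTML.
    $html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
    return preg_replace(
        '#\\\\u(e[0-9a-f]{3})#i',
        '<span class="emoji emoji-$1"></span>',
        $html
    );
}

echo renderEmoji('你好 \ue415 <b>Hello</b> 123'), "\n";
// 你好 <span class="emoji emoji-e415"></span> &lt;b&gt;Hello&lt;/b&gt; 123
```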


Next is the sending part, which is much simpler:

For a text message that contains emoji, first put the Unicode escape into the text. Taking the same example again: "你好 \ue415 Hello 123".

Then use a regular expression to find the emoji Unicode escapes in the text, pack each one into binary, convert it to UTF-8, and substitute it back into the original text. Do this as the very last step before sending, after the full message text has been prepared. The code is as follows:

$text = '你好 \ue415 Hello 123'; // outgoing WeChat message containing emoji as Unicode escapes, which must be converted to UTF-8 bytes
$text = preg_replace_callback('#\\\\u([0-9a-f]{4})#i', function ($m) { return iconv('UCS-2BE', 'UTF-8', pack('H4', $m[1])); }, $text); // pack each escape into binary and convert it to UTF-8

echo $text; // 你好 (emoji U+E415) Hello 123

The result can then be sent to the WeChat server.

The emoji data and CSS class names follow this project: Github.com/iamcal/php-emoji, so the two can be used together.
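If iconv is not available, the same conversion can probably be done by letting json_decode interpret each escape. This is an alternative sketch rather than the method used above; the function name escapesToUtf8 is hypothetical, and the pattern is limited to the \ue000 to \uefff range that this article treats as emoji.

```php
<?php
// Alternative sketch of the sending step that avoids iconv: let json_decode
// interpret each \ueXXX escape. The function name is hypothetical.
function escapesToUtf8(string $text): string
{
    return preg_replace_callback(
        '#\\\\u(e[0-9a-f]{3})#i',
        function (array $m): string {
            // '"\ue415"' is a valid JSON string literal for one character.
            return json_decode('"\\u' . $m[1] . '"');
        },
        $text
    );
}

$out = escapesToUtf8('你好 \ue415 Hello 123');
// "你好" is 6 bytes and the space is 1, so the emoji starts at byte 7.
echo bin2hex(substr($out, 7, 3)), "\n"; // ee9095, the UTF-8 bytes of U+E415
```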
