Problem background:
While crawling a website's data one night, I found strings of encoded data like this in the captured packets: "... \u65b0\u6d6a\u5fae\u535a ...". These are Chinese characters written as Unicode escape sequences, and all I wanted was to decode them back into Chinese. I searched online for half a day, tried a lot of approaches, and finally got it working.
Solution:
Credit where it is due: the approach below comes from a solution posted by an overseas developer.
Scheme A (stable version + recommended):
function replace_unicode_escape_sequence($match) {
    // Turn the four hex digits into two raw bytes (UCS-2BE), then convert them to UTF-8
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$name = '\u65b0\u6d6a\u5fae\u535a';
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $name);
echo $str; // Output: 新浪微博 (Sina Weibo)
I then wrapped Scheme A in a class (Scheme A, stable + upgraded + recommended):
class Helper_Tool
{
    static function unicodeDecode($data)
    {
        // A closure is used here because a named function declared inside the
        // method would be redeclared (fatal error) on the second call
        $rs = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', function ($match) {
            return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
        }, $data);
        return $rs;
    }
}
Usage:
$name = '\u65b0\u6d6a\u5fae\u535a';
$data = Helper_Tool::unicodeDecode($name); // $data now holds 新浪微博 (Sina Weibo)
Scheme B (recommended):
<?php
function unicodeDecode($name) {
    // Wrap the string in a JSON object so that json_decode() resolves the \uXXXX escapes
    $json = '{"str":"' . $name . '"}';
    $arr = json_decode($json, true);
    if (empty($arr)) return '';
    return $arr['str'];
}
$name = '\u65b0\u6d6a\u5fae\u535a';
echo unicodeDecode($name); // Output: 新浪微博 (Sina Weibo)
One important note on Scheme B, worked out with technical support from my friend Xar (Xar's blog): the string being processed (that is, the $name parameter passed to unicodeDecode()) must not contain quote characters that end the JSON string early, or parsing will fail. Because the function wraps $name in a double-quoted JSON string, the character that actually breaks it is an unescaped double quote; a single quote is harmless there. If necessary, use str_replace() to convert such illegal characters into acceptable ones before decoding.
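The str_replace() step mentioned above could look something like the following. This is only a minimal sketch: the helper name unicodeDecodeSafe is made up for illustration and is not from the original post, and it assumes any double quotes in the input are not already backslash-escaped.

```php
<?php
// Hypothetical helper; the name is illustrative, not part of the original post.
function unicodeDecodeSafe($name) {
    // Escape bare double quotes so they cannot terminate the JSON string early.
    // Assumption: quotes in $name are not already backslash-escaped.
    $escaped = str_replace('"', '\\"', $name);
    $decoded = json_decode('{"str":"' . $escaped . '"}', true);
    // json_decode() returns null when the JSON text is invalid.
    if (!is_array($decoded) || !isset($decoded['str'])) {
        return '';
    }
    return $decoded['str'];
}

// The input mixes \uXXXX escapes with literal double quotes.
echo unicodeDecodeSafe('\u65b0\u6d6a "\u5fae\u535a"'), "\n";
```

Only double quotes are escaped here. Backslashes are deliberately left alone, since they are what marks the \uXXXX sequences that json_decode() is supposed to resolve.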