Issue background:
While crawling a website's data one night, I found strings like this in the response: "... \u65b0\u6d6a\u5fae\u535a ...". This is Chinese text encoded as Unicode escape sequences, and all I wanted was to decode it back into Chinese. I searched Baidu for a long while and tried many approaches before finally getting it to work.
Solution:
The best answer turned out to come from an English-language source; the original post linked to that solution here.
Scenario A (Stable + recommended):
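function replace_unicode_escape_sequence($match) {
    // Turn the four hex digits into two bytes, then convert UCS-2BE to UTF-8.
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $name);

Let's wrap Scenario A in a class (Scenario A: stable + upgraded + recommended):

class Helper_Tool
{
    static function unicodeDecode($data)
    {
        // A closure is used instead of a nested named function, which would
        // trigger a "cannot redeclare" error on the second call.
        $rs = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', function ($match) {
            return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
        }, $data);
        return $rs;
    }
}

Call:
$name = '\u65b0\u6d6a\u5fae\u535a';
$data = Helper_Tool::unicodeDecode($name);

Output: 新浪微博 (Sina Weibo)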
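One limitation worth noting: Scenario A converts each \uXXXX escape on its own from UCS-2BE, so characters outside the Basic Multilingual Plane, which arrive as two escapes (a surrogate pair), are not decoded correctly. The following is only a sketch of one way around that, assuming the mbstring extension is available; the function name decodeUnicodeEscapes is made up for illustration and is not from the original solution. It decodes each run of consecutive escapes as UTF-16BE instead.

```php
<?php
// Hypothetical variant of Scenario A (not from the original post):
// decode whole runs of \uXXXX escapes as UTF-16BE so surrogate pairs survive.
function decodeUnicodeEscapes(string $data): string
{
    $result = preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            // Drop every "\u" prefix, leaving one continuous hex string.
            $hex = str_replace('\\u', '', $m[0]);
            // Hex digits -> raw bytes -> UTF-8, reading the bytes as UTF-16BE.
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $data
    );
    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $data;
}

echo decodeUnicodeEscapes('\u65b0\u6d6a\u5fae\u535a'), "\n";
```

Text between escapes is left untouched, because only the matched runs are passed to the callback.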
Scenario B (Recommended):
<?php
function unicodeDecode($name) {
    // Wrap the escaped text in a JSON string and let json_decode do the decoding.
    $json = '{"str":"' . $name . '"}';
    $arr = json_decode($json, true);
    if (empty($arr)) return '';
    return $arr['str'];
}
For Scenario B, I would like to highlight one caveat, worked out with technical help from my friend XAR (the original post linked to XAR's blog): the string to be processed, that is, the $name argument passed to unicodeDecode, must not contain quotation marks that would terminate the JSON string it is wrapped in (for a JSON string delimited by double quotes, that means an unescaped double quote). Otherwise parsing fails. If necessary, use str_replace() to turn such characters into valid ones first.
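The str_replace() step mentioned for Scenario B could look roughly like the sketch below. The helper name unicodeDecodeSafe is hypothetical, and the sketch assumes the input contains only \uXXXX escapes and plain text, with no quotes that are already backslash-escaped and no raw control characters such as newlines.

```php
<?php
// Hypothetical helper (not from the original post): Scenario B plus quote escaping.
function unicodeDecodeSafe(string $name): string
{
    // Assumption: no double quote in $name is already preceded by a backslash.
    $escaped = str_replace('"', '\\"', $name);
    $arr = json_decode('{"str":"' . $escaped . '"}', true);
    // json_decode returns null when the wrapped text is not valid JSON.
    if (!is_array($arr) || !isset($arr['str'])) {
        return '';
    }
    return $arr['str'];
}

echo unicodeDecodeSafe('\u65b0\u6d6a "\u5fae\u535a"'), "\n";
```

As in Scenario B, an empty string comes back when decoding fails, so callers cannot tell a failure apart from a genuinely empty input without an extra check.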