Decoding Unicode-escaped Chinese characters in PHP: code sharing
Problem background:
While capturing data from a website one night, I found a string of encoded data in the packets: "......\u65b0\u6d6a\u5fae\u535a......". This is actually Chinese text written as Unicode escape sequences. I wanted to decode it back into Chinese characters. I worked on this for a long time, tried many approaches, and finally got it solved.
Solution:
Developers abroad are impressive. Let's look at the solutions they provide.
Solution A (stable version, recommended):
// Callback: convert one \uXXXX match into the corresponding UTF-8 character
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$name = '\u65b0\u6d6a\u5fae\u535a';
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $name);
echo $str; // output: 新浪微博 (Sina Weibo)
// Encapsulated version (Solution A, stable version + upgraded, recommended)
class Helper_Tool
{
    public static function unicodeDecode($data)
    {
        // An anonymous callback is used so that calling this method
        // more than once does not redeclare a named function.
        return preg_replace_callback(
            '/\\\\u([0-9a-f]{4})/i',
            function ($match) {
                return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
            },
            $data
        );
    }
}
// call
$name = '\u65b0\u6d6a\u5fae\u535a';
$data = Helper_Tool::unicodeDecode($name); // 新浪微博 (Sina Weibo)
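The callback in Solution A does its work in two steps, which can be easier to follow for a single escape sequence. The following is a minimal sketch, assuming the mbstring extension is loaded; the function name decodeOneEscape is hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper for illustration; requires the mbstring extension.
function decodeOneEscape(string $hex): string
{
    // Step 1: turn the four hex digits into two raw bytes (big-endian order).
    $bytes = pack('H*', $hex);
    // Step 2: read those two bytes as UCS-2BE and convert them to UTF-8.
    return mb_convert_encoding($bytes, 'UTF-8', 'UCS-2BE');
}

$char = decodeOneEscape('65b0');
// Print the character followed by the hex form of its UTF-8 bytes.
echo $char, ' ', bin2hex($char), PHP_EOL;
```

For the input 65b0 this prints the character 新 followed by e696b0, its three-byte UTF-8 encoding. Because each match is treated as one UCS-2 code unit, this approach is aimed at characters that fit in a single \uXXXX escape.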
Solution B (recommended):
<?php
function unicodeDecode($name)
{
    // Wrap the escaped text in a JSON object and let json_decode() resolve the \uXXXX escapes
    $json = '{"str":"' . $name . '"}';
    $arr = json_decode($json, true);
    if (empty($arr)) return '';
    return $arr['str'];
}
$name = '\u65b0\u6d6a\u5fae\u535a';
echo unicodeDecode($name); // output: 新浪微博 (Sina Weibo)
Solution B needs one special precaution, summarized here with technical help from my friend XAR (the XAR blog): the string to be processed (that is, the $name parameter passed to the unicodeDecode function) must not contain quotation marks that break the JSON string it is wrapped in, in particular an unescaped double quotation mark; otherwise parsing fails. If necessary, use the str_replace() function to convert the invalid characters into valid ones first.
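A raw double quote inside $name would end the JSON string early and make json_decode() fail. One possible way to apply the str_replace() idea is sketched below. It is a variant under stated assumptions, not the original author's code: the function name unicodeDecodeQuoted is hypothetical, and the sketch assumes that any double quotes in the input are raw, not already backslash-escaped.

```php
<?php
// Hypothetical variant of Solution B for illustration.
// Assumption: double quotes in $name are raw, not already backslash-escaped.
function unicodeDecodeQuoted(string $name): string
{
    // Replace each raw double quote with its JSON escape so the wrapper stays valid.
    $safe = str_replace('"', '\u0022', $name);
    // Decode the text as a bare JSON string.
    $decoded = json_decode('"' . $safe . '"');
    // json_decode() returns null when the input is not valid JSON.
    return is_string($decoded) ? $decoded : '';
}

echo unicodeDecodeQuoted('say "\u4f60\u597d"'), PHP_EOL;
```

Here the quotes survive decoding and the two escapes become 你好. Input that is still not valid JSON after the replacement yields an empty string, matching the behavior of Solution B.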