PHP: collected data is garbled

I use file_get_contents() to collect data from a page, but the data I get back is garbled. I ran encoding detection on it and it reports UTF-8.
My own page is encoded as UTF-8 too, yet the output still shows garbled characters. I don't know why.

$url = "xxx";
$opts = array(
    'http' => array(
        'user_agent' => "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",
    ),
);
$context = stream_context_create($opts);
$neirong = file_get_contents($url, false, $context);
header("Content-Type: text/html; charset=UTF-8");
ob_end_flush();

// Detect the encoding of the fetched data and print it
$encode = mb_detect_encoding($neirong, array("ASCII", "UTF-8", "GB2312", "GBK", "BIG5"));
echo $encode . "\n";

// Convert to UTF-8 only if the data is in some other encoding
if ($encode != "UTF-8") {
    $neirong = mb_convert_encoding($neirong, "UTF-8", $encode);
}
echo $neirong;


$encode: UTF-8
$neirong: the output is garbled
My page encoding is UTF-8
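
A note on why detection can report UTF-8 for text that still looks garbled: an encoding check only tells you whether the bytes form valid sequences in that encoding, not whether the text is meaningful. A minimal sketch, assuming the mbstring extension is loaded; the sample bytes are chosen purely for illustration:

```php
<?php
// Sample byte strings, chosen only for illustration.
$valid     = "\xE7\x96\x94"; // a well-formed 3-byte UTF-8 sequence (U+7594)
$truncated = "\xE7\x96";     // the same sequence with its last byte missing

// mb_check_encoding() reports whether the bytes are valid in the given encoding.
var_dump(mb_check_encoding($valid, 'UTF-8'));     // bool(true)
var_dump(mb_check_encoding($truncated, 'UTF-8')); // bool(false)
```

So a "UTF-8" result rules out a broken byte stream, but text that was scrambled at the source and then encoded as valid UTF-8 will still pass the check.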


Reply to discussion (solution)

...
$neirong = file_get_contents($url, false, $context);
echo base64_encode($neirong);

Post the result.




It's an article, so the result is too long. I'll post just part of it.

77u/ICAgIOiwjeeWlOahteaYlOaZhOixkeebqOaak++8jOmAveS6juWHkeS+k+WyqeeGmueeg+ebqOa0headremAv

$c = '77u/ICAgIOiwjeeWlOahteaYlOaZhOixkeebqOaak++8jOmAveS6juWHkeS+k+WyqeeGmueeg+ebqOa0headremAv';
echo base64_decode($c);

That decodes to unreadable text. Your base64 is incomplete.
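
One thing the partial base64 does show: its first four characters, 77u/, decode to the bytes EF BB BF, which is the UTF-8 byte order mark (BOM). A small sketch of checking for and removing it; stripUtf8Bom is a hypothetical helper name, not something from this thread:

```php
<?php
// Hypothetical helper: remove a leading UTF-8 byte order mark, if present.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($data, $bom, 3) === 0) {
        return substr($data, 3);
    }
    return $data;
}

// The first four base64 characters of the posted sample decode to the BOM.
echo bin2hex(base64_decode('77u/')), "\n";           // efbbbf

// A string with a BOM loses it; a string without one is returned unchanged.
echo bin2hex(stripUtf8Bom("\xEF\xBB\xBFabc")), "\n"; // 616263
echo bin2hex(stripUtf8Bom("abc")), "\n";             // 616263
```

Checking before cutting avoids chopping three real bytes off a response that happens to arrive without a BOM.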



The correct output should read: "What leaves Tian Shuxin speechless is that this Liu Bo really doesn't mean it that way."
Instead it comes out garbled.

Post the collection address.




This is the address I collect the data from:
http://www.ziyouge.com/conbdhekbefiab

This is the page on their site that displays it:
http://www.ziyouge.com/zy/4/4980/1333249.html

The data returned by the collection address is garbled, but their page displays it normally.

Yes, the site does some processing on the data it serves to collectors.

// ... (the start of this post, up to the loop header, did not survive; it resumes inside the main loop)
    if ($ascnum >= 224) {
        $result .= change(mb_substr($content, $i, 3));
        $i = $i + 3;
    } else {
        $result .= mb_substr($content, $i, 1);
        $i = $i + 1;
    }
}
echo $result;

// convert one 3-byte character
function change($str)
{
    // punctuation passed through unchanged (list as it survived extraction;
    // the originals were probably full-width characters)
    $ignore = array("'", '"', '!', '...', ':', ',');
    if (in_array($str, $ignore)) {
        return $str;
    }
    $prefix = "%u";
    $postfix = "";
    $str = iconv('utf-8', 'ucs-2', $str);
    $arrstr = str_split($str, 2);
    $unistr = '';
    for ($i = 0, $len = count($arrstr); $i < $len; $i++) {
        $tmp = hexdec(bin2hex($arrstr[$i]));
        $tmp = str_pad(dechex($tmp), 4, '0', STR_PAD_LEFT);
        // swap the two bytes, then decrypt
        $tmp = decrypt(substr($tmp, 2, 2) . substr($tmp, 0, 2));
        $unistr .= $prefix . $tmp . $postfix;
    }
    return unescape($unistr);
}

// decrypt: subtract 100 from the code point
function decrypt($d)
{
    $result = str_pad(dechex(hexdec($d) - 100), 4, '0', STR_PAD_LEFT);
    return $result;
}

// convert %uXXXX escapes to Chinese (UTF-8)
function unescape($str)
{
    $ret = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        if ($str[$i] == '%' && $str[$i + 1] == 'u') {
            $val = hexdec(substr($str, $i + 2, 4));
            if ($val < 0x7f) {
                $ret .= chr($val);
            } else if ($val < 0x800) {
                $ret .= chr(0xc0 | ($val >> 6)) . chr(0x80 | ($val & 0x3f));
            } else {
                $ret .= chr(0xe0 | ($val >> 12)) . chr(0x80 | (($val >> 6) & 0x3f)) . chr(0x80 | ($val & 0x3f));
            }
            $i += 5;
        } else if ($str[$i] == '%') {
            $ret .= urldecode(substr($str, $i, 3));
            $i += 2;
        } else {
            $ret .= $str[$i];
        }
    }
    return $ret;
}
?>



It's already around 11 o'clock in the evening. what about the path? Who is it, but there are three items that are still in high spirits, and they do not mean to give up until dawn,

fdipzone: with your method the output is still garbled for me, and I'm not familiar with decryption.

Did you add the charset meta tag to the HTML?

Its source is encrypted; my program already decrypts it.

I'll post the collection code. You can use it directly.

<?php
// (The start of this post, which set $url and the request headers, did not survive;
//  $header is a placeholder name for the lost header array.)
foreach ($header as $n => $v) {
    $headerArr[] = $n . ':' . $v;
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headerArr);             // forge the IP
curl_setopt($ch, CURLOPT_REFERER, 'http://www.ziyouge.com/'); // forge the referer
$content = curl_exec($ch);
$content = substr($content, 3); // drop the first 3 bytes (the UTF-8 BOM seen in the base64 sample)
if ($error = curl_error($ch)) {
    die($error);
}
curl_close($ch);

// analysis program
// Note: this loop assumes mbstring's internal encoding is a single-byte one,
// so that mb_strlen() and mb_substr() count bytes rather than characters.
$result = '';
$str_length = mb_strlen($content);
$i = 0;
while ($i < $str_length) {
    $temp_str = mb_substr($content, $i, 1);
    $ascnum = ord($temp_str);
    if ($ascnum >= 224) {
        // lead byte of a 3-byte UTF-8 character: decrypt it
        $result .= change(mb_substr($content, $i, 3));
        $i = $i + 3;
    } else {
        $result .= mb_substr($content, $i, 1);
        $i = $i + 1;
    }
}
// (the original echoed some HTML here; the markup did not survive)
echo $result;

// convert one 3-byte character
function change($str)
{
    // punctuation passed through unchanged (list as it survived extraction;
    // the originals were probably full-width characters)
    $ignore = array("'", '"', '!', '...', ':', ',');
    if (in_array($str, $ignore)) {
        return $str;
    }
    $prefix = "%u";
    $postfix = "";
    $str = iconv('utf-8', 'ucs-2', $str);
    $arrstr = str_split($str, 2);
    $unistr = '';
    for ($i = 0, $len = count($arrstr); $i < $len; $i++) {
        $tmp = hexdec(bin2hex($arrstr[$i]));
        $tmp = str_pad(dechex($tmp), 4, '0', STR_PAD_LEFT);
        // swap the two bytes, then decrypt
        $tmp = decrypt(substr($tmp, 2, 2) . substr($tmp, 0, 2));
        $unistr .= $prefix . $tmp . $postfix;
    }
    return unescape($unistr);
}

// decryption: subtract 100 from the code point
function decrypt($d)
{
    $result = str_pad(dechex(hexdec($d) - 100), 4, '0', STR_PAD_LEFT);
    return $result;
}

// convert %uXXXX escapes to Chinese (UTF-8)
function unescape($str)
{
    $ret = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        if ($str[$i] == '%' && $str[$i + 1] == 'u') {
            $val = hexdec(substr($str, $i + 2, 4));
            if ($val < 0x7f) {
                $ret .= chr($val);
            } else if ($val < 0x800) {
                $ret .= chr(0xc0 | ($val >> 6)) . chr(0x80 | ($val & 0x3f));
            } else {
                $ret .= chr(0xe0 | ($val >> 12)) . chr(0x80 | (($val >> 6) & 0x3f)) . chr(0x80 | ($val & 0x3f));
            }
            $i += 5;
        } else if ($str[$i] == '%') {
            $ret .= urldecode(substr($str, $i, 3));
            $i += 2;
        } else {
            $ret .= $str[$i];
        }
    }
    return $ret;
}
?>
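
The change() and decrypt() functions above appear to undo a fixed shift: each 3-byte character's code point is reduced by 100. The same idea can be sketched without going through iconv at all. This is only an illustration, assuming PHP 7.2 or later with mbstring (for mb_ord() and mb_chr()); shiftChar is a hypothetical helper name:

```php
<?php
// Hypothetical helper: shift one character's code point by $offset.
// Assumes PHP 7.2+ with mbstring, for mb_ord() and mb_chr().
function shiftChar(string $char, int $offset): string
{
    return mb_chr(mb_ord($char, 'UTF-8') + $offset, 'UTF-8');
}

// The bytes E7 96 94 appear in the posted base64 sample; they encode U+7594.
$scrambled = "\xE7\x96\x94";
$plain = shiftChar($scrambled, -100);

echo bin2hex($plain), "\n"; // e794b0, i.e. U+7530
```

U+7530 is 田 (Tian), which fits the expected sentence quoted earlier in the thread. Because this works on code points directly, it does not depend on which byte order a conversion to UCS-2 happens to produce.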




The garbled output appears because the PHP versions differ. It tested fine for me on 5.3.28, but on PHP 6.0.0-dev the output is garbled. Is PHP 6.0.0-dev missing some component?

Maybe. It is a dev build, after all.



It works locally on 5.3.28, but after switching the server to 5.3.28 as well, the garbled characters are back...
Both environments are Linux: Ubuntu locally, Debian on the server.

Probably the PHP mbstring versions differ.
Environment problems you'll have to work out yourself, since I don't have that many environments to test on.



I've found the problem. On different platforms,
$str = iconv('utf-8', 'ucs-2', $str);
produces different output.
Example: with $str = "?"; $str = iconv('utf-8', 'ucs-2', $str); the normal result is "V^" and the abnormal result is "^V". How can I solve this?

Found the fix. Different platforms convert to different UCS-2 byte orders.
From what I found, plain UCS-2 defaults to UCS-2BE on Linux, so iconv with just UCS-2 produces big-endian output. To get the UCS-2 that a Windows platform produces, you need to specify UCS-2LE.

Right.

Change
$str = iconv('utf-8', 'ucs-2', $str);
to
$str = iconv('utf-8', 'ucs-2le', $str);

and it works.
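
To see the byte-order difference in isolation, here is a minimal sketch. It assumes the iconv extension is available; the sample character is an arbitrary choice for illustration:

```php
<?php
// One sample character, U+7530, as UTF-8 bytes.
$char = "\xE7\x94\xB0";

// Naming the byte order explicitly gives the same bytes on every platform.
echo bin2hex(iconv('UTF-8', 'UCS-2LE', $char)), "\n"; // 3075 (low byte first)
echo bin2hex(iconv('UTF-8', 'UCS-2BE', $char)), "\n"; // 7530 (high byte first)

// Plain 'UCS-2' leaves the byte order to the iconv implementation,
// so its output is deliberately not shown here.
```

Code that swaps the two bytes by hand, as change() does, only works when the input really is little-endian, which is why naming UCS-2LE makes the result consistent across machines.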
