A friend in our group was scraping a web page last night and found that the page fetched with file_get_contents came out garbled when saved locally, with Content-Encoding: gzip in the response headers, yet the same page displayed normally in the browser.
Having run into this before, I saw at once that the site has gzip enabled, so file_get_contents was returning the compressed page rather than the decompressed one. (I don't know whether file_get_contents can be passed parameters with the request so that it fetches the page without gzip compression in the first place.)
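To partially answer the parenthetical question: file_get_contents itself has no decompression switch, but once you have the compressed body you can decompress it with gzdecode() (PHP 5.4+ with the zlib extension); cURL can also decompress automatically if you set its CURLOPT_ENCODING option. A minimal sketch, simulating the compressed response body with gzencode() instead of making a live request:

```php
<?php
// Sketch: gzdecode() reverses gzip compression (PHP >= 5.4, zlib extension).
// The gzencode() call stands in for a compressed HTTP response body; in real
// use $body would come from file_get_contents on the gzip-enabled site.
$body = gzencode("<html>hello</html>");
$html = gzdecode($body);
echo $html; // prints <html>hello</html>
```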
I had recently read that the first two bytes of a file can be used to determine its type, and friends in the group added that the first two bytes of a gzip-compressed page are 1F 8B (the gzip magic number; the page content itself is GBK-encoded), so we can use them to tell whether a page is gzip-compressed.
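If you prefer the magic-byte test without the decimal string concatenation used in the script below, comparing the raw bytes directly is sturdier. A small sketch (the helper name looks_gzipped is mine, not from the original):

```php
<?php
// Alternative check: compare the first two bytes against the gzip
// magic number 0x1F 0x8B directly. looks_gzipped() is a hypothetical helper.
function looks_gzipped(string $bin): bool {
    return strncmp($bin, "\x1f\x8b", 2) === 0;
}

var_dump(looks_gzipped(gzencode("test"))); // bool(true)  - a gzip stream
var_dump(looks_gzipped("<html>"));         // bool(false) - plain HTML
```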
The code is as follows:

<?php
// miercn.com (Mier Military Network) serves its pages gzip-compressed,
// so the page fetched directly with file_get_contents is garbled.
header('Content-Type: text/html; charset=UTF-8');
$url = 'http://www.miercn.com';
$file = fopen($url, "rb");
// Read only 2 bytes. If they are 1F 8B in hex (31 139 in decimal), gzip is enabled.
$bin = fread($file, 2);
fclose($file);
$strInfo = @unpack("C2chars", $bin);
$typeCode = intval($strInfo['chars1'] . $strInfo['chars2']);
$isGzip = 0;
switch ($typeCode) {
    case 31139:
        // The site has gzip enabled
        $isGzip = 1;
        break;
    default:
        $isGzip = 0;
}
$url = $isGzip ? "compress.zlib://" . $url : $url; // ternary expression
$mierHtml = file_get_contents($url); // fetch the miercn.com page
$mierHtml = iconv("GBK", "UTF-8", $mierHtml);
echo $mierHtml;
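Why the compress.zlib:// prefix works: the wrapper transparently decompresses the gzip stream as PHP reads it, so file_get_contents returns plain HTML. A small sketch demonstrating this with a local temp file instead of a live URL (the file path and contents are illustrative; the wrapper handles both files and streams the same way):

```php
<?php
// The compress.zlib:// stream wrapper decompresses gzip data on read.
// We write a gzip-compressed "page" to a temp file, then read it back
// through the wrapper and get the original HTML.
$tmp = tempnam(sys_get_temp_dir(), "gz");
file_put_contents($tmp, gzencode("<html>mier</html>"));
$html = file_get_contents("compress.zlib://" . $tmp);
echo $html; // prints <html>mier</html>
unlink($tmp);
```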