This article describes how to use PHP to determine whether a webpage is gzip-compressed.
A page I fetched with file_get_contents came back garbled, even though the same page displays normally in the browser.
From earlier experience, I quickly realized that the website has gzip enabled and that file_get_contents was returning the compressed page rather than the decompressed one. (I don't know whether file_get_contents can be given request parameters so that it directly fetches a page that is not gzip-compressed.)
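One possible answer to that question, as a minimal sketch rather than a confirmed fix: file_get_contents accepts a stream context, and the context can carry an Accept-Encoding: identity request header that asks the server for an uncompressed body. The helper name identityContext and the 10-second timeout are assumptions made for illustration. A server that compresses regardless of this header will still return gzip data, so this does not replace the byte check described in this article.

```php
<?php
// Hypothetical helper (not from the original article): build a stream
// context that asks the server for an uncompressed response.
function identityContext(int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            // "identity" means "no content encoding"
            'header'  => "Accept-Encoding: identity\r\n",
            'timeout' => $timeoutSeconds,
        ],
    ]);
}

// Usage (requires network access):
// $html = file_get_contents('http://www.miercn.com', false, identityContext());
```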
I recently read that the first two bytes of a file can be used to determine its type. Friends in a chat group also pointed out that the first two bytes of a gzip-compressed webpage are 1F 8B, so those two bytes are enough to tell whether a page is gzip-compressed. (The page in question is GBK-encoded, but the gzip signature does not depend on the character encoding.)
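To see the signature without touching the network, the following small sketch compresses a string locally with gzencode (which needs the zlib extension) and prints its first two bytes in hexadecimal, next to the first two bytes of the uncompressed string. The sample HTML string is made up for illustration.

```php
<?php
$plain = '<html>hello</html>';       // made-up sample content
$compressed = gzencode($plain);      // gzip-format output (zlib extension)

// First two bytes of the gzip data, as hex
$magic = bin2hex(substr($compressed, 0, 2));
echo $magic, "\n";                   // prints 1f8b

// First two bytes of the uncompressed text, for comparison ('<' and 'h')
$plainStart = bin2hex(substr($plain, 0, 2));
echo $plainStart, "\n";              // prints 3c68
```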
The code is as follows:
// miercn.com serves its pages gzip-compressed,
// so the page fetched directly with file_get_contents is garbled.
header('Content-Type: text/html; charset=utf-8');
$url = 'http://www.miercn.com';
$file = fopen($url, 'rb');
// Read only the first 2 bytes. If they are 1F 8B (hexadecimal), i.e. 31 139 (decimal), gzip is enabled.
$bin = fread($file, 2);
fclose($file);
// Unpack the two bytes as unsigned chars into chars1 and chars2
$strInfo = @unpack('C2chars', $bin);
// Concatenate the two decimal values: 31 and 139 become 31139
$typeCode = intval($strInfo['chars1'] . $strInfo['chars2']);
$isGzip = 0;
switch ($typeCode) {
    case 31139:
        // The website has gzip enabled
        $isGzip = 1;
        break;
    default:
        $isGzip = 0;
}
// Prefix the URL with the compress.zlib:// wrapper so PHP decompresses the stream
$url = $isGzip ? 'compress.zlib://' . $url : $url; // ternary expression
$mierHtml = file_get_contents($url); // fetch the miercn.com page
$mierHtml = iconv('gbk', 'UTF-8', $mierHtml); // convert GBK to UTF-8
echo $mierHtml;
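The code above requests the page twice: once to read the first two bytes and once to fetch the content. As an alternative sketch, assuming the zlib and iconv extensions are available, the page can be fetched once and the same two-byte check applied to the downloaded string, decompressing with gzdecode when needed. The function name decodePage and its default charset argument are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper (not from the original article): take a raw HTTP body
// that may or may not be gzip-compressed and return it as UTF-8.
function decodePage(string $raw, string $fromCharset = 'GBK'): string
{
    // gzip data starts with the two bytes 1F 8B
    if (strncmp($raw, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($raw);
        if ($decoded !== false) {
            $raw = $decoded;
        }
    }
    // Convert to UTF-8; fall back to the unconverted bytes if iconv fails
    $utf8 = iconv($fromCharset, 'UTF-8', $raw);
    return $utf8 === false ? $raw : $utf8;
}

// Usage (requires network access):
// $raw = file_get_contents('http://www.miercn.com');
// if ($raw !== false) {
//     echo decodePage($raw);
// }
```

Working on the downloaded string avoids the second request, at the cost of holding the whole compressed body in memory before decoding it.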