Use PHP to determine whether a webpage is gzip-compressed

Source: Internet
Author: User

Last night, while scraping a webpage, some friends in my chat group found that the page fetched with file_get_contents was garbled when saved locally, and that the response headers contained Content-Encoding: gzip.
The same page displayed normally in a browser.
From past experience I quickly realized that the site had gzip enabled and that file_get_contents was returning the compressed page rather than the decompressed one. (I don't know whether file_get_contents can be given parameters so that the request returns the page without gzip compression.)
I had also recently read that a file's type can be determined from its first two bytes. Friends in the group pointed out that the first two bytes of the gzip-compressed page (the page itself is GBK-encoded) are 1F 8B. These two bytes are the signature of the gzip format and do not depend on the page's character encoding, so they can be used to tell whether a page is gzip-compressed.
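The two-byte check described above can be sketched on its own as a small function. This is only an illustration: the name looksGzipped is hypothetical and not part of the original code, and it assumes the response body is already held in a string.

```php
<?php
// Hypothetical helper: returns true when $data starts with the
// gzip signature bytes 0x1F 0x8B.
function looksGzipped(string $data): bool
{
    // strncmp compares at most the first 2 bytes and is binary safe,
    // so an empty or one-byte string simply fails the comparison.
    return strncmp($data, "\x1f\x8b", 2) === 0;
}
```

A string beginning with "\x1f\x8b" passes the check, while ordinary HTML such as "<html>" does not.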
The full code is as follows:

// miercn.com serves its pages gzip-compressed, so the page
// fetched directly with file_get_contents is garbled.
header('Content-Type: text/html; charset=UTF-8');
$url = 'http://www.miercn.com';
$file = fopen($url, "rb");
// Read only the first 2 bytes. If they are 1f 8b in hexadecimal
// (31 139 in decimal), the page is gzip-compressed.
$bin = fread($file, 2);
fclose($file);
// Unpack the two bytes as unsigned chars into 'chars1' and 'chars2'.
$strInfo = @unpack("C2chars", $bin);
// Concatenate the two decimal values, e.g. "31" . "139" gives 31139.
$typeCode = intval($strInfo['chars1'] . $strInfo['chars2']);
$isGzip = 0;
switch ($typeCode)
{
    case 31139:
        // The website has gzip enabled.
        $isGzip = 1;
        break;
    default:
        $isGzip = 0;
}
// For a gzip page, prefix the URL with the compress.zlib:// wrapper
// so that PHP decompresses the stream while reading it.
$url = $isGzip ? "compress.zlib://" . $url : $url; // ternary expression
$mierHtml = file_get_contents($url); // fetch the miercn.com page
$mierHtml = iconv("gbk", "UTF-8", $mierHtml); // the page is GBK-encoded
echo $mierHtml;
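The code above requests the page twice: once to read the first two bytes and once to fetch the content. A possible variation, which is not the original author's method, is to fetch the page once and decompress the string in memory with gzdecode when the signature is present. The function name decodeIfGzipped is hypothetical, and the sketch assumes the zlib extension is available.

```php
<?php
// Hypothetical helper: decompress $body if it carries the gzip
// signature (0x1F 0x8B), otherwise return it unchanged.
function decodeIfGzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        // gzdecode returns false (and raises a warning) on corrupt data.
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    // Not gzip, or decoding failed: hand back the original bytes.
    return $body;
}
```

It could then be used as decodeIfGzipped(file_get_contents($url)), followed by the same iconv conversion from GBK to UTF-8.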
