Using file_get_contents with a simulated browser header (user_agent) to obtain data. What is a User Agent? A User Agent, or UA for short, is a special string header that lets the server identify details about the client, such as its operating system and version, CPU type, and browser.
What is a User Agent?
A User Agent (UA) is a special string header that lets the server identify the client's operating system and version, CPU type, browser and version, browser rendering engine, browser language, and browser plug-ins.
Websites can serve different pages depending on the UA, for example one version for mobile phones and another for PCs.
When PHP uses the file_get_contents function to scrape some websites, a page that displays normally in a browser may return no content at all.
This is most likely because the server checks the User-Agent header to decide whether a request comes from a normal browser, and by default PHP's file_get_contents function does not send a UA.
To scrape such a website, PHP must simulate a browser by sending a UA, so that the website returns its normal content.
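As a rough illustration of how a server might branch on the UA, here is a minimal sketch. The helper name looks_like_mobile and its keyword list are assumptions made for this example; real sites use far more thorough detection.

```php
<?php
// Hypothetical helper: guess whether a User-Agent string belongs to a mobile browser.
// The keyword list is an illustrative assumption, not a complete detection rule.
function looks_like_mobile(string $userAgent): bool
{
    foreach (['Mobile', 'Android', 'iPhone', 'iPad'] as $keyword) {
        if (stripos($userAgent, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

// In a web request the UA arrives in $_SERVER['HTTP_USER_AGENT']; it may be absent.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
$layout = looks_like_mobile($ua) ? 'mobile' : 'desktop';
```

A request that sends no UA at all ends up with an empty string here, which is one reason a server may treat it differently from a browser request.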
The implementation is as follows:
ini_set('user_agent', 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; 4399Box.560; .NET4.0C; .NET4.0E)');
This simulates the UA of IE8. Of course, you can change it to another one, such as Firefox's.
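The same approach with a Firefox-style UA might look like the sketch below. The UA string is only an illustrative value, and fetch_page is a hypothetical helper that is not part of the original article.

```php
<?php
// Illustrative Firefox-style UA string (an assumption; substitute any current one).
$firefoxUa = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0';

// Set the UA that PHP's HTTP stream wrapper will send from now on.
ini_set('user_agent', $firefoxUa);

// Read the setting back to confirm it took effect before relying on it.
$uaIsActive = ini_get('user_agent') === $firefoxUa;

// Hypothetical fetch helper: returns the response body, or null if the request fails.
function fetch_page(string $url): ?string
{
    $body = @file_get_contents($url);
    return $body === false ? null : $body;
}
```

Because ini_set changes the setting for the rest of the script, every later file_get_contents call over HTTP sends this UA.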
Alternatively, the UA can be sent as one of the request headers in an HTTP stream context options array. The code is as follows:
// HTTP context options; every header line must end with \r\n
$opts = array(
    'http' => array(
        'method' => "GET",
        'header' => "Host: zh.wikipedia.org\r\n" .
                    "Accept-language: zh-cn\r\n" .
                    "User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; 4399Box.560; .NET4.0C; .NET4.0E)\r\n" .
                    "Accept: */*\r\n"
    )
);
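An options array like this only takes effect once it is turned into a stream context and passed to file_get_contents. The following is a minimal sketch of that step; the helper names build_ua_context and fetch_with_ua are assumptions for illustration, and the Host header is left out because PHP derives it from the URL.

```php
<?php
// Hypothetical helper: build a stream context whose HTTP requests carry $userAgent.
function build_ua_context(string $userAgent)
{
    $opts = [
        'http' => [
            'method' => 'GET',
            // Every header line ends with \r\n.
            'header' => "Accept-Language: zh-cn\r\n" .
                        "User-Agent: {$userAgent}\r\n" .
                        "Accept: */*\r\n",
        ],
    ];
    return stream_context_create($opts);
}

// Hypothetical helper: fetch $url with the given UA; returns null on failure.
function fetch_with_ua(string $url, string $userAgent): ?string
{
    // The third argument of file_get_contents() is the stream context.
    $body = @file_get_contents($url, false, build_ua_context($userAgent));
    return $body === false ? null : $body;
}
```

A call such as fetch_with_ua('https://zh.wikipedia.org/', $ua) would then send the chosen UA for that single request only, without changing the global user_agent setting.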