Analysis of get_meta_tags(), cURL, and User-Agent usage in PHP

Source: Internet
Author: User
This article introduces the usage of get_meta_tags(), cURL, and the User-Agent header in PHP, and analyzes their usage and caveats in detail through examples.

This document analyzes the usage of get_meta_tags(), cURL, and the User-Agent in PHP, shared here for your reference. The analysis is as follows:

The get_meta_tags() function extracts tags of the form <meta name="A" content="1"> and <meta name="B" content="2"> from a webpage and loads them into a one-dimensional array, where each name becomes an array key and each content becomes the corresponding value. The tags in this example yield array('a' => '1', 'b' => '2'); note that PHP converts the keys to lowercase. <meta> tags that do not follow this name/content form are not processed. The function stops parsing at the closing </head> tag, so any <meta> tags after it are ignored, although <meta> tags that appear before <head> are still processed.
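The behavior described above can be checked with a short sketch. The sample page below is made up for illustration and is written to a temporary file so the snippet does not depend on any server; the http-equiv tag and the tag placed in the body are there only to show which tags are skipped.

```php
<?php
// Hypothetical sample page, used only for this illustration.
$html = <<<HTML
<html>
<head>
<meta name="A" content="1">
<meta name="B" content="2">
<meta http-equiv="refresh" content="30">
</head>
<body>
<meta name="C" content="3">
</body>
</html>
HTML;

// get_meta_tags() takes a file name or URL, so store the page on disk first.
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

$tags = get_meta_tags($file);
unlink($file);

// Keys are the lowercased name attributes; the tag after </head> is absent.
var_dump($tags);
```

Passing an http:// URL instead of a local path works the same way, provided allow_url_fopen is enabled.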

The User-Agent (UA) is one of the request headers that a browser sends, invisibly to the user, when it requests a webpage from a server. The request headers carry several pieces of information, such as cookies and caching details. The User-Agent is the browser's statement of its type, for example IE, Chrome, or Firefox (FF).
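On the receiving side, PHP exposes this header as $_SERVER['HTTP_USER_AGENT']. The sketch below reads it defensively, because the key is simply absent when the client sends no UA. The helper name read_user_agent() and the fallback text are made up for this example.

```php
<?php
// Hypothetical helper: return the client's User-Agent, or a fallback
// string when the request carried no User-Agent header.
function read_user_agent(array $server, $fallback = '(no User-Agent sent)')
{
    return isset($server['HTTP_USER_AGENT']) && $server['HTTP_USER_AGENT'] !== ''
        ? $server['HTTP_USER_AGENT']
        : $fallback;
}

// In a web request this prints the browser's UA string; on the command
// line, where no such header exists, it prints the fallback.
echo read_user_agent($_SERVER), "\n";
```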

Today, while fetching the <meta> tags of a webpage, I kept getting an empty result, even though the page source looked normal when viewed directly. I therefore suspected that the server was varying its output based on the request headers. To test this, I first pointed get_meta_tags() at a local file and had that local file write the header information it received to a file. The result is as follows:

The Code is as follows:

array (
  'HTTP_HOST' => '192.168.30.205',
  'PATH' => 'C:/Program Files/Common Files/NetSarang;C:/Program Files/NVIDIA Corporation/PhysX/Common;C:/Program Files/Common Files/Microsoft Shared/Windows Live;C:/Program Files/Intel/iCLS Client/;C:/Windows/system32;C:/Windows;C:/Windows/System32/Wbem;C:/Windows/System32/WindowsPowerShell/v1.0/;C:/Program Files/Intel(R) Management Engine Components/DAL;C:/Program Files/Intel(R) Management Engine Components/EPT;C:/Program Files/Intel/OpenCL SDK/2.0/bin/x86;C:/Program Files/Common Files/Thunder Network/KanKan/Codecs;C:/Program Files/QuickTime Alternative/QTSystem;C:/Program Files/Windows Live/Shared;C:/Program Files/QuickTime Alternative/QTSystem/;%JAVA_HOME%/bin;%JAVA_HOME%/jre/bin;',
  'SystemRoot' => 'C:/Windows',
  'COMSPEC' => 'C:/Windows/system32/cmd.exe',
  'PATHEXT' => '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC',
  'WINDIR' => 'C:/Windows',
  'SERVER_SIGNATURE' => '',
  'SERVER_SOFTWARE' => 'Apache/2.2.11 (Win32) PHP/5.2.8',
  'SERVER_NAME' => '192.168.30.205',
  'SERVER_ADDR' => '192.168.30.205',
  'SERVER_PORT' => '80',
  'REMOTE_ADDR' => '192.168.30.205',
  'DOCUMENT_ROOT' => 'e:/wamp/www',
  'SERVER_ADMIN' => 'admin@admin.com',
  'SCRIPT_FILENAME' => 'e:/wamp/www/user-agent.php',
  'REMOTE_PORT' => '123',
  'GATEWAY_INTERFACE' => 'CGI/1.1',
  'SERVER_PROTOCOL' => 'HTTP/1.0',
  'REQUEST_METHOD' => 'GET',
  'QUERY_STRING' => '',
  'REQUEST_URI' => '/user-agent.php',
  'SCRIPT_NAME' => '/user-agent.php',
  'PHP_SELF' => '/user-agent.php',
  'REQUEST_TIME' => 1400747529,
)
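
The dump above has the shape of var_export() output for $_SERVER. The article does not show the script that produced it, so the following is only a guess at what such a script might look like. The function name dump_request_info() and the log file name are assumptions made for this sketch.

```php
<?php
// Hypothetical reconstruction: write the request information PHP received
// to a file, so it can be inspected after a non-browser client (such as
// get_meta_tags()) has requested this script.
function dump_request_info(array $server, $logFile)
{
    // With its second argument set to true, var_export() returns the
    // array as PHP source text instead of printing it.
    return file_put_contents($logFile, var_export($server, true)) !== false;
}

// Assumed log location: the system temporary directory.
dump_request_info($_SERVER, sys_get_temp_dir() . '/request-dump.txt');
```

Comparing a dump produced by a browser request with one produced by get_meta_tags() makes a missing HTTP_USER_AGENT entry easy to spot.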


Sure enough, there is no HTTP_USER_AGENT element in the array: PHP running under Apache sent no UA when it requested the page from the other server. After looking into it further, I found that get_meta_tags() has no parameter for setting the UA, so another solution was needed.

In the end I used cURL to fetch the webpage. It is a little more work to use: the UA has to be set explicitly first, and the returned HTML then has to be parsed with regular expressions.

The Code is as follows:

// Initialize a cURL session
$curl = curl_init();

// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://localhost/user-agent.php');

// Whether to include the response headers in the output. 0 means they are not included.
curl_setopt($curl, CURLOPT_HEADER, 0);

// Set the UA. Here the browser's UA is forwarded to the server; a value can also be specified manually.
curl_setopt($curl, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);

// Choose between returning the result as a string and printing it.
// 0: curl_exec() prints the response and returns a bool for the operation result. 1: it returns the response as a string.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Run cURL to request the webpage
$data = curl_exec($curl);

// Close the cURL session
curl_close($curl);

// Process the fetched data
var_dump($data);
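
The regular-expression step mentioned earlier is not shown in the article. As a rough sketch of how the string returned by curl_exec() might be parsed, the hypothetical helper extract_meta_tags() below collects name/content pairs. It is deliberately simple: it assumes quoted attributes, with name appearing directly before content, and values that contain no quotes or angle brackets.

```php
<?php
// Hypothetical helper: collect <meta name="..." content="..."> pairs from
// an HTML string. Limitations: attributes must be quoted, name must come
// directly before content, and values may not contain quotes or '>'.
function extract_meta_tags($html)
{
    $tags = array();
    $pattern = '/<meta\s+name=(["\'])([^"\'>]*)\1\s+content=(["\'])([^"\'>]*)\3[^>]*>/i';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Lowercase the key to mirror what get_meta_tags() does.
            $tags[strtolower($m[2])] = $m[4];
        }
    }
    return $tags;
}

// Sample input standing in for the $data string fetched with cURL.
$sample = '<head><meta name="A" content="1"><meta name="B" content="2"></head>';
var_dump(extract_meta_tags($sample));
```

For pages with less regular markup, a real HTML parser such as PHP's DOMDocument is likely to be more robust than a regular expression.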

I hope this article will help you with PHP programming.
