get_meta_tags(), cURL and User-Agent usage analysis in PHP

Source: Internet
Author: User

This article analyzes the usage of get_meta_tags(), cURL and the User-Agent header in PHP. It is shared here for your reference. The analysis is as follows:

The get_meta_tags() function extracts tags of the form <meta name="..." content="..."> from a web page and loads them into a one-dimensional array, where name becomes the array key (converted to lowercase) and content becomes the value. For example, the tags <meta name="a" content="1"> and <meta name="b" content="2"> produce the array array('a' => '1', 'b' => '2'). Other kinds of <meta> tags are not processed. The function also stops parsing when it reaches the </head> tag, so any <meta> tags after that point are ignored, while those that appear before <head> are still processed.
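The behavior described above can be tried on a local file. The following is a minimal sketch; the sample page, its tag names and the temporary file are assumptions made for the demonstration and do not come from the original article.

```php
<?php
// Hypothetical sample page, written to a temporary file for the demonstration.
$html = <<<HTML
<html>
<head>
<meta name="A" content="1">
<meta name="b" content="2">
</head>
<body>
<meta name="c" content="3">
</body>
</html>
HTML;

$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

// Keys come from the name attribute (lowercased), values from content.
// Parsing stops at </head>, so the tag inside <body> is not returned.
$tags = get_meta_tags($file);
unlink($file);

var_dump($tags);
```

The same call accepts a URL instead of a local path, which is the case examined in the rest of this article.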

User-Agent is one of the request headers that a browser sends, invisibly to the user, when it requests a web page from a server. The request headers carry several pieces of information, such as caching details and cookies, and User-Agent is the declaration of the client type, for example IE, Chrome or Firefox.
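On the server side, PHP exposes this header as $_SERVER['HTTP_USER_AGENT'], and the key is simply missing when the client sends no User-Agent. A minimal sketch of reading it safely follows; the helper name client_user_agent is hypothetical.

```php
<?php
// Hypothetical helper: return the User-Agent the client sent, or null if none was sent.
function client_user_agent(array $server): ?string
{
    return isset($server['HTTP_USER_AGENT'])
        ? (string) $server['HTTP_USER_AGENT']
        : null;
}

// In a web request this prints the browser's UA string; on the command line it prints NULL.
var_dump(client_user_agent($_SERVER));
```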

Today, while crawling the <meta> tags of a page, I kept getting empty values, even though the page source looked normal when viewed directly. I suspected that the server was deciding what to output based on the request headers. To check, I used get_meta_tags() to fetch a local file, and had that local file write the header information it received to a file. The result is as follows (path separators are shown as / to make it easier to read):
array (
  'HTTP_HOST' => '192.168.30.205',
  'PATH' => 'C:/Program Files/common files/netsarang;C:/Program Files/nvidia Corporation/physx/common;C:/Program Files/common files/microsoft shared/windows Live;C:/Program Files/intel/icls client/;C:/windows/system32;C:/windows;C:/windows/system32/wbem;c:/windows/system32/windowspowershell/v1.0/;C:/Program Files/intel/intel (R) Management Engine components/dal;C:/Program Files/intel/intel (R) Management Engine components/ipt;C:/Program Files/intel/opencl sdk/2.0/bin/x86;C:/Program Files/common Files/thunder network/kankan/codecs;C:/Program Files/quicktime Alternative/qtsystem;C:/Program Files/windows live/shared;C:/Program Files/quicktime alternative/qtsystem/;%java_home%/bin;%java_home%/jre/bin;',
  'SystemRoot' => 'c:/windows',
  'COMSPEC' => 'c:/windows/system32/cmd.exe',
  'PATHEXT' => '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC',
  'windir' => 'c:/windows',
  'SERVER_SIGNATURE' => '',
  'SERVER_SOFTWARE' => 'Apache/2.2.11 (Win32) PHP/5.2.8',
  'SERVER_NAME' => '192.168.30.205',
  'SERVER_ADDR' => '192.168.30.205',
  'SERVER_PORT' => '80',
  'REMOTE_ADDR' => '192.168.30.205',
  'DOCUMENT_ROOT' => 'e:/wamp/www',
  'SERVER_ADMIN' => 'admin@admin.com',
  'SCRIPT_FILENAME' => 'e:/wamp/www/user-agent.php',
  'REMOTE_PORT' => '59479',
  'GATEWAY_INTERFACE' => 'CGI/1.1',
  'SERVER_PROTOCOL' => 'HTTP/1.0',
  'REQUEST_METHOD' => 'GET',
  'QUERY_STRING' => '',
  'REQUEST_URI' => '/user-agent.php',
  'SCRIPT_NAME' => '/user-agent.php',
  'PHP_SELF' => '/user-agent.php',
  'REQUEST_TIME' => 1400747529,
)
Sure enough, the array contains no HTTP_USER_AGENT element: when PHP on the Apache server sends a request to another server through get_meta_tags(), it sends no UA. After checking the documentation, I found that get_meta_tags() itself offers no parameter for forging a UA, so another solution was needed.
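A dump like the one above can be produced by a small logging script. The article does not show the contents of user-agent.php, so the following is only a sketch of one way to do it; the function name dump_server_vars and the output file name are hypothetical.

```php
<?php
// Hypothetical logger: write the given server variables to $path in var_export() format.
function dump_server_vars(array $server, string $path): void
{
    file_put_contents($path, var_export($server, true) . PHP_EOL);
}

// Inside user-agent.php this would record what the requesting client actually sent.
dump_server_vars($_SERVER, sys_get_temp_dir() . '/server-dump.txt');

// True only if the client supplied a User-Agent header.
var_dump(array_key_exists('HTTP_USER_AGENT', $_SERVER));
```

Fetching the script once with a browser and once with get_meta_tags() and comparing the two dumps shows which headers each client sends.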

Later I fetched the page with cURL, which does retrieve it, although it takes a little more work: first forge the UA, then parse the fetched page with regular expressions.

The code for forging the UA is as follows:
// Initialize a cURL session
$curl = curl_init();

// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://localhost/user-agent.php');

// Whether to include the response headers in the output; 0 means do not include them
curl_setopt($curl, CURLOPT_HEADER, 0);

// Set the UA; here the browser's UA is forwarded to the server, but a value can also be specified manually
curl_setopt($curl, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);

// Choose how the result is delivered: with 0, curl_exec() prints the response and returns a bool; with 1, it returns the response as a string
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Run cURL and request the page
$data = curl_exec($curl);

// Close the cURL session
curl_close($curl);

// Process the fetched data
var_dump($data);
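The article mentions parsing the fetched page with regular expressions but does not show that step. The following is a minimal sketch of how it might look; the function name parse_meta_tags and the sample string are hypothetical, and the pattern assumes that name appears before content and that both values are quoted, which real pages do not always guarantee.

```php
<?php
// Hypothetical helper: extract <meta name="..." content="..."> pairs from an HTML string.
// Assumes name comes before content and both attribute values are quoted.
function parse_meta_tags(string $html): array
{
    $pattern = '/<meta\s+name=["\']([^"\']+)["\']\s+content=["\']([^"\']*)["\'][^>]*>/i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $tags = [];
    foreach ($matches as $m) {
        // Lowercase the key to mirror what get_meta_tags() does.
        $tags[strtolower($m[1])] = $m[2];
    }
    return $tags;
}

// Stand-in for the $data string returned by curl_exec().
$sample = '<head><meta name="A" content="1"><meta name="b" content="2"></head>';
var_dump(parse_meta_tags($sample));
```

For pages with irregular markup, an HTML parser such as DOMDocument is likely to be more robust than a regular expression.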

I hope this article helps with your PHP programming.
