Reasons to Choose Curl
Here is a plain comparison of curl and file_get_contents, excerpted from elsewhere:
file_get_contents is essentially a convenience wrapper around built-in file functions such as file_exists, fopen, fread, and fclose. It is intended mainly for local files; support for network files was added for the same reason of convenience.
curl is a library dedicated to network interaction. It offers many options for adapting to different environments, so it is generally more stable than file_get_contents for network requests.
How to use
1. Enable curl support
A default PHP installation does not enable curl support. Edit php.ini, find the line ;extension=php_curl.dll, remove the leading semicolon, and restart the web server.
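Before going further, it can help to confirm that the extension is actually loaded. The following is a minimal sketch that relies only on the built-in extension_loaded and curl_version functions; the helper name curl_status is made up for this illustration.

```php
<?php
// Hypothetical helper: reports whether the curl extension is available.
function curl_status(): string
{
    if (!extension_loaded('curl')) {
        return 'curl extension is not loaded';
    }
    $info = curl_version(); // associative array describing the libcurl build
    return 'curl extension is loaded, libcurl ' . $info['version'];
}

echo curl_status(), PHP_EOL;
```

If the message says the extension is not loaded, the php.ini change has probably not taken effect yet, for example because the server was not restarted.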
2. Fetch data with curl
// Initialize a curl handle
$curl = curl_init();
// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://www.cmx8.cn');
// Include the response headers in the output
curl_setopt($curl, CURLOPT_HEADER, 1);
// Return the result as a string instead of printing it to the screen
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Run curl and request the page
$data = curl_exec($curl);
// Close the curl handle
curl_close($curl);
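Because CURLOPT_HEADER is set to 1 in the listing above, $data holds the response headers followed by the body. One way to separate them is to ask curl for the header size with curl_getinfo($curl, CURLINFO_HEADER_SIZE) after curl_exec and before curl_close. The sketch below shows only the splitting step on a hand-written response so that it runs without network access; the helper name split_response is an assumption made for illustration.

```php
<?php
// Hypothetical helper: split a raw response into [headers, body].
// $headerSize is the value curl_getinfo($curl, CURLINFO_HEADER_SIZE) reports.
function split_response(string $raw, int $headerSize): array
{
    $headers = substr($raw, 0, $headerSize);
    $body    = substr($raw, $headerSize);
    return [rtrim($headers), $body];
}

// Offline demonstration with a made-up response.
$headerBlock = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$sample      = $headerBlock . '<html>hello</html>';

[$headers, $body] = split_response($sample, strlen($headerBlock));
echo $body, PHP_EOL;
```

If only the body is needed, setting CURLOPT_HEADER to 0 avoids the split entirely.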
3. Extract the key data with a regular expression
// $data is the value returned by curl_exec, that is, the fetched page content
preg_match_all("/<li class=\"item\">(.*?)<\/li>/", $data, $out, PREG_SET_ORDER);
foreach ($out as $key => $value) {
    // $value is an array: index 0 holds the whole matched text, index 1 the captured group
    echo 'Whole match: ' . $value[0] . PHP_EOL;
    echo 'Captured group: ' . $value[1] . PHP_EOL;
}
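To make the shape of $out easier to see, here is a small sketch that runs the same pattern against a hard-coded string instead of a fetched page. The sample markup is invented for the demonstration.

```php
<?php
// Made-up sample markup standing in for $data.
$data = '<ul><li class="item">first</li><li class="item">second</li></ul>';

preg_match_all('/<li class="item">(.*?)<\/li>/', $data, $out, PREG_SET_ORDER);

// With PREG_SET_ORDER, each element of $out is one complete match:
// index 0 is the whole matched text, index 1 is the first capture group.
foreach ($out as $match) {
    echo $match[0], ' => ', $match[1], PHP_EOL;
}
```

The lazy quantifier in (.*?) keeps each match inside a single list item rather than spanning from the first opening tag to the last closing tag.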
Tips
1. Timeout settings
curl_setopt($ch, option, value) can set several timeout options, including:
CURLOPT_TIMEOUT: the maximum number of seconds a curl operation is allowed to run.
CURLOPT_TIMEOUT_MS: the maximum number of milliseconds a curl operation is allowed to run. (Added in curl 7.16.2; available from PHP 5.2.3.)
CURLOPT_CONNECTTIMEOUT: the number of seconds to wait while trying to connect. If set to 0, wait indefinitely.
CURLOPT_CONNECTTIMEOUT_MS: the number of milliseconds to wait while trying to connect. If set to 0, wait indefinitely. (Added in curl 7.16.2; available from PHP 5.2.3.)
CURLOPT_DNS_CACHE_TIMEOUT: the number of seconds to keep DNS information in memory. The default is 120 seconds.
curl_setopt($ch, CURLOPT_TIMEOUT, 60);     // a timeout in whole seconds is all that is needed here
curl_setopt($ch, CURLOPT_NOSIGNAL, 1);     // note: this must be set when using a millisecond timeout
curl_setopt($ch, CURLOPT_TIMEOUT_MS, 200); // timeout in milliseconds; added in curl 7.16.2, available from PHP 5.2.3
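The same options can also be applied in one call with curl_setopt_array. The sketch below is one possible arrangement, not a recommended set of values: the helper name timeout_options and the 200 ms and 100 ms limits are assumptions chosen for illustration. No request is sent, so it runs without network access.

```php
<?php
// Hypothetical helper: build millisecond timeout options for curl_setopt_array().
function timeout_options(int $totalMs, int $connectMs): array
{
    return [
        CURLOPT_NOSIGNAL          => 1,          // set alongside millisecond timeouts
        CURLOPT_CONNECTTIMEOUT_MS => $connectMs, // limit for establishing the connection
        CURLOPT_TIMEOUT_MS        => $totalMs,   // limit for the whole operation
        CURLOPT_RETURNTRANSFER    => 1,          // return the result as a string
    ];
}

$ch = curl_init('http://www.cmx8.cn');

// curl_setopt_array() returns true only if every option was accepted.
$applied = curl_setopt_array($ch, timeout_options(200, 100));
var_dump($applied);

// After curl_exec($ch), a curl_errno($ch) value of CURLE_OPERATION_TIMEDOUT
// would indicate that one of the limits was reached.
```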
2. Submit data with POST and keep cookies
The following excerpted example is worth studying:
// Simulated login to a Discuz forum with curl, suitable for DZ7.0
!extension_loaded('curl') && die('The curl extension is not loaded.');
$discuz_url = 'http://www.lxvoip.com'; // forum address
$login_url = $discuz_url . '/logging.php?action=login'; // login page address
$get_url = $discuz_url . '/my.php?item=threads'; // my posts
$post_fields = array();
// The following two items do not need to be modified
$post_fields['loginfield'] = 'username';
$post_fields['loginsubmit'] = 'true';
// The user name and password must be filled in
$post_fields['username'] = 'lxvoip';
$post_fields['password'] = '88888888';
// Security question
$post_fields['questionid'] = 0;
$post_fields['answer'] = '';
// @todo verification code
$post_fields['seccodeverify'] = '';
// Fetch the login form and extract its formhash
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec($ch);
curl_close($ch);
preg_match('/<input\s*type="hidden"\s*name="formhash"\s*value="(.*?)"\s*\/>/i', $contents, $matches);
if (!empty($matches)) {
    $formhash = $matches[1];
    // Assumption: the login form expects the token back in a field named formhash
    $post_fields['formhash'] = $formhash;
} else {
    die('formhash not found');
}
// POST the data and save the cookies
$cookie_file = dirname(__FILE__) . '/cookie.txt';
// Alternative: $cookie_file = tempnam('/tmp', 'cookie');
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_exec($ch);
curl_close($ch);
// Use the cookies saved above to fetch a page that requires login
$ch = curl_init($get_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string so it can be dumped below
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
curl_close($ch);
var_dump($contents);
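When CURLOPT_POSTFIELDS receives an array, as in the example above, PHP sends the data as multipart/form-data; when it receives a URL-encoded string, the data is sent as application/x-www-form-urlencoded. Some login forms may expect the second form. The sketch below shows how such a string could be built with http_build_query; the field values are placeholders, not real credentials.

```php
<?php
// Placeholder fields mirroring the shape of $post_fields in the example above.
$fields = [
    'loginfield' => 'username',
    'username'   => 'some user',
    'password'   => 'p&ss=word',
];

// http_build_query() URL-encodes keys and values and joins them with "&".
$body = http_build_query($fields, '', '&');
echo $body, PHP_EOL;

// The string would then be passed in place of the array:
// curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
```

Encoding the fields this way also protects values that contain characters such as & or =, which would otherwise be read as separators.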