PHP's cURL library is a simple and effective way to fetch web pages. You run a script, parse the pages it retrieves, and extract the data you want programmatically. Whether you need part of the data behind a link or want to download an XML file and import it into a database, cURL gives you a straightforward way to retrieve web content. This article describes how to use this PHP library.
Enabling cURL
First, make sure the library is enabled in your PHP installation. You can check this with the phpinfo() function.
If the resulting page contains a cURL section showing that cURL support is enabled, the library is available.
If you do not see it, you need to configure PHP to enable the library. On Windows this is simple: open your php.ini file, find the line containing php_curl.dll, and remove the semicolon that comments it out, as shown below:
; Remove the leading semicolon from the following line
extension=php_curl.dll
On Linux, you need to recompile PHP with cURL support by passing the --with-curl option to the configure command.
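As a quicker check than reading the phpinfo() page, a script can ask PHP directly whether the extension is loaded. The following is a minimal sketch; the helper name curlIsAvailable is hypothetical, and the return type declaration assumes PHP 7 or later.

```php
<?php
// Returns true when the cURL extension is loaded in the running PHP build.
// curlIsAvailable is a hypothetical helper name used only for this sketch.
function curlIsAvailable(): bool
{
    return extension_loaded('curl') && function_exists('curl_init');
}

if (curlIsAvailable()) {
    // curl_version() returns an array; 'version' holds the libcurl version string.
    $info = curl_version();
    echo 'cURL is enabled, libcurl version ', $info['version'], PHP_EOL;
} else {
    echo 'cURL is not enabled; check php.ini or the build options', PHP_EOL;
}
```

Running this from the command line and from the web server can give different answers, because the two often load different php.ini files.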
A small example
If everything is ready, here is a small routine:
<?php
// Initialize a cURL handle
$curl = curl_init();
// Set the URL you want to fetch
curl_setopt($curl, CURLOPT_URL, 'http://cocre.com');
// Include the response headers in the output
curl_setopt($curl, CURLOPT_HEADER, 1);
// Return the result as a string instead of printing it to the screen
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Run cURL and request the web page
$data = curl_exec($curl);
// Close the cURL handle
curl_close($curl);
// Show the data you received
var_dump($data);
?>
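Because CURLOPT_HEADER is set in the example above, the returned string contains the response headers followed by the body. One possible way to separate the two, and to detect a failed request, is sketched below. The function names splitResponse and fetchWithHeaders are hypothetical; the sketch relies on curl_getinfo() with CURLINFO_HEADER_SIZE to find where the headers end.

```php
<?php
// Split a raw response (headers and body in one string) at the header boundary.
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => (string) substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

// Fetch a URL and return ['headers' => ..., 'body' => ...]; throws if the request fails.
function fetchWithHeaders(string $url): array
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_HEADER, 1);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $raw = curl_exec($curl);
    if ($raw === false) {
        // curl_exec() returns false on failure; curl_error() describes the cause.
        $message = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException('cURL request failed: ' . $message);
    }
    // Total size in bytes of the headers that were received.
    $headerSize = curl_getinfo($curl, CURLINFO_HEADER_SIZE);
    curl_close($curl);
    return splitResponse($raw, $headerSize);
}
```

Calling fetchWithHeaders('http://cocre.com') would then give the headers and the page body as separate strings, so the body can be parsed without first stripping the header lines by hand.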
How to post data
The code above fetches a page; the following code POSTs data to a web page. Suppose there is a URL, http://www.example.com/sendSMS.php, that handles a form with two fields: a phone number and a text message.
<?php
$phoneNumber = '13912345678';
$message = 'This is generated by curl and PHP';
// URL-encode each field value before building the POST body
$curlPost = 'pnumber=' . urlencode($phoneNumber) . '&message=' . urlencode($message) . '&submit=send';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/sendSMS.php');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Send the request with the POST method and attach the form data
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost);
// curl_exec() requires the handle as its argument
$data = curl_exec($ch);
curl_close($ch);
?>
As the program above shows, CURLOPT_POST switches the request from the HTTP GET method to POST, and CURLOPT_POSTFIELDS supplies the POST data.
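Concatenating and encoding each field by hand becomes error-prone as the number of fields grows. As an alternative sketch, PHP's built-in http_build_query() can produce the same URL-encoded string from an associative array. The wrapper name buildPostBody is hypothetical; the field names are the ones used in the example above.

```php
<?php
// Build a URL-encoded POST body from an associative array of form fields.
// The separator is passed explicitly so the result does not depend on php.ini.
function buildPostBody(array $fields): string
{
    return http_build_query($fields, '', '&');
}

$curlPost = buildPostBody([
    'pnumber' => '13912345678',
    'message' => 'This is generated by curl and PHP',
    'submit'  => 'send',
]);

echo $curlPost, PHP_EOL;
// Prints: pnumber=13912345678&message=This+is+generated+by+curl+and+PHP&submit=send
```

The resulting string can be passed to CURLOPT_POSTFIELDS exactly as before. CURLOPT_POSTFIELDS also accepts an array directly, but in that case cURL sends the data as multipart/form-data rather than as a URL-encoded body.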
About proxy servers
The following example shows how to use a proxy server. Note the three proxy-related options (CURLOPT_HTTPPROXYTUNNEL, CURLOPT_PROXY, and CURLOPT_PROXYUSERPWD); the code is simple enough that it needs little further explanation.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Tunnel the request through the HTTP proxy
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
// Proxy address and port
curl_setopt($ch, CURLOPT_PROXY, 'fakeproxy.com:1080');
// Proxy credentials in the form user:password
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'user:password');
// curl_exec() requires the handle as its argument
$data = curl_exec($ch);
curl_close($ch);
?>
About SSL and cookies
For SSL, that is, the HTTPS protocol, you only need to change http:// to https:// in the CURLOPT_URL value. There is also an option, CURLOPT_SSL_VERIFYHOST, that controls whether the host name in the site's certificate is verified.
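A minimal sketch of an HTTPS request with verification switched on explicitly might look like the following. The function names httpsOptions and fetchHttps are hypothetical. CURLOPT_SSL_VERIFYPEER is not mentioned above; it is the companion option that checks the certificate itself, while CURLOPT_SSL_VERIFYHOST set to 2 checks that the certificate matches the requested host name.

```php
<?php
// Options for an HTTPS request with certificate verification enabled explicitly.
function httpsOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true, // check the certificate against trusted CAs
        CURLOPT_SSL_VERIFYHOST => 2,    // check that the certificate names the requested host
    ];
}

// Fetch a page over HTTPS; returns the body, or false if the request or verification fails.
function fetchHttps(string $url)
{
    $ch = curl_init();
    curl_setopt_array($ch, httpsOptions($url));
    $data = curl_exec($ch);
    if ($data === false) {
        // Log the reason, for example a certificate that could not be verified.
        error_log('HTTPS request failed: ' . curl_error($ch));
    }
    curl_close($ch);
    return $data;
}
```

curl_setopt_array() is used here only to set several options in one call; setting them one by one with curl_setopt() has the same effect.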
For cookies, you need to know the following three parameters:
CURLOPT_COOKIE: sets a cookie to send in the current session
CURLOPT_COOKIEJAR: the file in which cookies are saved when the session ends
CURLOPT_COOKIEFILE: the file from which cookies are read
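The three cookie options can be combined so that cookies received in one request are stored and sent again in the next. The sketch below assumes a hypothetical cookie value (lang=en) and a caller-supplied file path; the function names cookieOptions and fetchWithCookies are also hypothetical.

```php
<?php
// Build the cookie-related options for a request.
function cookieOptions(string $cookieFile): array
{
    return [
        CURLOPT_COOKIE     => 'lang=en',   // hypothetical cookie sent with this request
        CURLOPT_COOKIEFILE => $cookieFile, // read cookies stored by an earlier session
        CURLOPT_COOKIEJAR  => $cookieFile, // store cookies here when the handle is released
    ];
}

// Fetch a URL while loading cookies from, and saving them to, $cookieFile.
function fetchWithCookies(string $url, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt_array($ch, cookieOptions($cookieFile));
    $data = curl_exec($ch);
    // The cookie jar is written once the handle is cleaned up.
    curl_close($ch);
    return $data;
}
```

Using the same path for CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR is a common way to keep a login session alive across several calls, for example by calling fetchWithCookies() first on a login page and then on a protected page with the same file.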
HTTP Server Authentication
Finally, let's look at HTTP server authentication.
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Use HTTP Basic authentication
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
// Replace the placeholders with real credentials
curl_setopt($ch, CURLOPT_USERPWD, '[username]:[password]');
$data = curl_exec($ch);
curl_close($ch);
?>
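To make clear what Basic authentication actually sends, the following sketch builds the equivalent request header by hand. cURL generates this header itself when CURLOPT_HTTPAUTH and CURLOPT_USERPWD are set, so the hypothetical helper basicAuthHeader is for illustration only and is not needed in real code.

```php
<?php
// Build the header that HTTP Basic authentication sends: the string
// "username:password", Base64-encoded, after the word "Basic".
function basicAuthHeader(string $username, string $password): string
{
    return 'Authorization: Basic ' . base64_encode($username . ':' . $password);
}

echo basicAuthHeader('user', 'password'), PHP_EOL;
// Prints: Authorization: Basic dXNlcjpwYXNzd29yZA==
```

Base64 is an encoding, not encryption, so the credentials can be read by anyone who sees the request; Basic authentication is therefore best combined with an https:// URL.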
For more information, please refer to the relevant cURL manual.