PHP can fetch web content with the file_get_contents() function, but that function is poorly suited to more complex tasks such as uploading or downloading files or handling cookies. PHP's cURL extension provides these features.
I. Introduction to cURL
cURL is a PHP extension built on the libcurl library. It can connect to and communicate with many kinds of servers over many kinds of protocols.
libcurl currently supports the HTTP, HTTPS, FTP, Gopher, Telnet, DICT, FILE, and LDAP protocols, as well as HTTPS certificates, HTTP POST, FTP uploading, proxies, cookies, and user name + password authentication.
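Which of these protocols are actually available depends on how the installed libcurl was built. As a quick check, the following sketch lists them with curl_version(); it assumes only that the cURL extension is loaded.

```php
<?php
// List the protocols supported by the libcurl build behind this PHP installation.
$info = curl_version();

echo 'libcurl version: ', $info['version'], "\n";
echo 'Protocols: ', implode(', ', $info['protocols']), "\n";

// Check for one protocol before relying on it
$hasHttps = in_array('https', $info['protocols'], true);
echo 'HTTPS supported: ', $hasHttps ? 'yes' : 'no', "\n";
```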
II. The cURL function library
Common functions
Function | Description
curl_init() | Initializes a cURL session
curl_setopt() | Sets a cURL option
curl_exec() | Executes a cURL session
curl_getinfo() | Gets information about the current session
curl_errno() | Returns the last error number
curl_error() | Returns a string describing the last error for the current session
curl_close() | Closes a cURL session
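As a small illustration of how curl_errno() and curl_error() from the table fit together, here is a sketch. The helper name try_request() and the scheme "nosuchscheme" are made up for the example; the scheme is deliberately unsupported so that the request fails without touching the network.

```php
<?php
// Hypothetical helper: run a request and return array(body, error message).
function try_request($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_errno() gives the numeric code, curl_error() the readable text
        $message = sprintf('cURL error %d: %s', curl_errno($ch), curl_error($ch));
        return array(null, $message);
    }
    return array($body, null);
}

// "nosuchscheme" is an invented, unsupported scheme, so this call fails locally
list($body, $error) = try_request('nosuchscheme://example.invalid/');
echo $error, "\n";
```

Checking the return value of curl_exec() against false with === matters here, because an empty response body is a valid result and must not be mistaken for a failure.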
Other functions
Function | Description
curl_copy_handle() | Copies a cURL handle along with all of its options
curl_escape() | URL-encodes the given string
curl_file_create() | Creates a CURLFile object
curl_multi_add_handle() | Adds a normal cURL handle to a cURL batch handle
curl_multi_close() | Closes a set of cURL handles
curl_multi_exec() | Runs the sub-connections of the current cURL batch handle
curl_multi_getcontent() | Returns the content of a cURL handle if CURLOPT_RETURNTRANSFER is set
curl_multi_info_read() | Gets information about the current transfers
curl_multi_init() | Returns a new cURL batch handle
curl_multi_remove_handle() | Removes a handle from a cURL batch handle
curl_multi_select() | Waits for activity on any connection in the cURL batch
curl_multi_setopt() | Sets an option on a cURL batch handle
curl_multi_strerror() | Returns a string describing the error code
curl_pause() | Pauses and resumes a connection
curl_reset() | Resets all options of a libcurl session handle
curl_setopt_array() | Sets multiple options for a cURL transfer at once
curl_share_close() | Closes a cURL share handle
curl_share_init() | Initializes a cURL share handle
curl_share_setopt() | Sets an option on a cURL share handle
curl_strerror() | Returns a string describing the error code
curl_unescape() | Decodes a URL-encoded string
curl_version() | Gets cURL version information
III. Workflow
1. Initialize a cURL session
2. Set cURL options
3. Execute the cURL session
4. Get cURL information and/or error messages (this step is optional)
5. Close the cURL handle
The second step is the most complex: cURL has a large number of options, which the following examples illustrate.
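When many options are needed, step 2 can also be done in one call with curl_setopt_array() from the function list. The sketch below is one possible arrangement, not a required pattern; the helper names build_options() and make_handle() are hypothetical, and the default values are simply the ones used in this article's examples.

```php
<?php
// Hypothetical helper: merge per-request options over a set of defaults.
function build_options($url, array $extra = array())
{
    $defaults = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => 1,   // return the response instead of printing it
        CURLOPT_HEADER         => 0,   // no response headers in the output
        CURLOPT_CONNECTTIMEOUT => 30,  // connection timeout, in seconds
    );
    // The "+" operator keeps keys from the left operand, so $extra wins
    return $extra + $defaults;
}

// Hypothetical helper: create a handle with all options applied at once.
function make_handle($url, array $extra = array())
{
    $ch = curl_init();
    if (!curl_setopt_array($ch, build_options($url, $extra))) {
        throw new RuntimeException('Could not apply cURL options');
    }
    return $ch;
}

$options = build_options('http://localserver.com/index.php', array(CURLOPT_CONNECTTIMEOUT => 5));
$ch = make_handle('http://localserver.com/index.php', array(CURLOPT_CONNECTTIMEOUT => 5));
```

The array union is used instead of array_merge() because the option constants are integers, and array_merge() would renumber integer keys.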
IV. Example 1: GET request
A GET request follows the general cURL workflow.
Prepare a test script index.php in the root directory of the local server localserver.com, as follows:
<?php
$url = 'http://www.baidu.com';

// Initialize and get a cURL handle
$ch = curl_init();

// Set options
curl_setopt($ch, CURLOPT_URL, $url);           // Request URL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);   // Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_HEADER, 0);           // Do not include response headers in the output
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);  // Connection timeout, in seconds

// Execute and get the response
$output = curl_exec($ch);
if ($output === false) {
    $output = 'CURL error: ' . curl_error($ch);
}

// Release the cURL handle
curl_close($ch);

print_r($output);
?>
Visiting localserver.com/index.php on the local server in a browser displays the Baidu home page.
V. Example 2: POST request
A POST request requires two additional options:
curl_setopt($ch, CURLOPT_POST, 1);                // Make this a POST request
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);  // Data to submit
Prepare a receiving script index.php in the root directory of the remote server remoteserver.com, as follows:
<?php
// Echo back the raw request body
$input = file_get_contents('php://input');
echo $input;
?>
Then write the script index.php that sends the POST request in the root directory of the local server localserver.com, as follows:
<?php
$url = 'http://remoteserver.com/index.php';
$data = array(
    'fname' => 'Daniel',
    'lname' => 'Stenberg',
);

// Initialize
$ch = curl_init();

// Set options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_POST, 1);  // POST request
// POST data; http_build_query() converts the array to a string joined with "&"
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));

// Execute and get the response
$output = curl_exec($ch);
if ($output === false) {
    $output = 'CURL error: ' . curl_error($ch);
}

// Release the cURL handle
curl_close($ch);

print_r($output);
?>
Visiting localserver.com/index.php in a browser displays the following:
fname=Daniel&lname=Stenberg
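The form of the value passed to CURLOPT_POSTFIELDS decides how the body is encoded. The sketch below shows what http_build_query() produces for the data used in this example; the second array with the key "q" is an invented value, included only to show how reserved characters are escaped.

```php
<?php
$data = array('fname' => 'Daniel', 'lname' => 'Stenberg');

// A string is sent as application/x-www-form-urlencoded
$encoded = http_build_query($data);
echo $encoded, "\n"; // fname=Daniel&lname=Stenberg

// Reserved characters are escaped, so a value cannot break the field structure
$tricky = http_build_query(array('q' => 'a b&c'));
echo $tricky, "\n"; // q=a+b%26c

$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $encoded);  // string: URL-encoded body
// curl_setopt($ch, CURLOPT_POSTFIELDS, $data);  // array: multipart/form-data body
```

Passing the array directly would switch the request to multipart/form-data, and a receiving script that reads php://input would then normally see an empty body, which is one reason to encode the data as a string here.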
VI. Example 3: Uploading files
The original way to upload a file with cURL was to prefix the file path with the "@" symbol and send it as a field of the POST request. The receiving script can then read the uploaded file information from $_FILES. However, the "@" prefix was deprecated in PHP 5.5, is disabled by default from PHP 5.6, and was removed in PHP 7.0; use the CURLFile class for uploads instead.
Prepare a receiving script index.php in the root directory of the remote server remoteserver.com, as follows:
<?php
$action = isset($_POST['action']) ? $_POST['action'] : '';
if ($action == 'uploadimage') {
    $error = $_FILES['file']['error'];
    switch ($error) {
        case 0:
            // basename() strips any path components from the client-supplied name
            $name = basename($_FILES['file']['name']);
            $tmpname = $_FILES['file']['tmp_name'];
            // Save to the directory where the current script is located
            move_uploaded_file($tmpname, dirname(__FILE__) . '/' . $name);
            echo 'upload succeeded';
            break;
        case 1:
            echo 'file size exceeds php.ini limit';
            break;
        case 2:
            echo 'file size exceeds form MAX_FILE_SIZE limit';
            break;
        case 3:
            echo 'file was only partially uploaded';
            break;
        case 4:
            echo 'no file was uploaded';
            break;
        case 6:
            echo 'unable to find temp folder';
            break;
        case 7:
            echo 'file write failed';
            break;
        default:
            echo 'unknown error';
    }
}
?>
Then place an image file test.jpg and the cURL upload script index.php in the root directory of the local server localserver.com. The script is as follows:
<?php
$url = 'http://remoteserver.com/index.php';
$file = realpath(getcwd() . '/test.jpg');
$data = array(
    'action' => 'uploadimage',
    'file'   => '@' . $file,  // Legacy "@" syntax, only for old PHP versions
);
// CURLFile is available from PHP 5.5 and replaces the "@" prefix
if (version_compare(PHP_VERSION, '5.5.0', '>=')) {
    $data['file'] = new CURLFile($file);
}

// Initialize
$ch = curl_init();

// Set options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);  // Array: sent as multipart/form-data

// Execute and get the response
$output = curl_exec($ch);
if ($output === false) {
    $output = 'CURL error: ' . curl_error($ch);
}

// Release the cURL handle
curl_close($ch);

print_r($output);
?>
Visiting localserver.com/index.php in a browser displays the following:
upload succeeded
Check the root directory of the remote server and you will find the image that was just uploaded.
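CURLFile also accepts an optional MIME type and the file name to report to the server. The following sketch shows both the constructor and the procedural curl_file_create(); the temporary file and the names 'image/jpeg' and 'photo.jpg' are placeholders chosen for illustration, not part of the upload example above.

```php
<?php
// Create a small temporary file to stand in for a real image (illustration only)
$path = tempnam(sys_get_temp_dir(), 'upl');
file_put_contents($path, 'fake image bytes');

// Arguments: local path, MIME type, file name reported to the server
$file = new CURLFile($path, 'image/jpeg', 'photo.jpg');

// curl_file_create() is the procedural equivalent of the constructor
$same = curl_file_create($path, 'image/jpeg', 'photo.jpg');

echo $file->getMimeType(), "\n";      // image/jpeg
echo $file->getPostFilename(), "\n";  // photo.jpg

// The object is passed as a field value, as in the upload script
$data = array('action' => 'uploadimage', 'file' => $file);

unlink($path);
```

On the receiving side, the posted file name is what appears in $_FILES['file']['name'], so setting it explicitly avoids exposing the local path or temporary name.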
VII. Example 4: Downloading files
One way to download a file with cURL is to set the CURLOPT_FILE option to a file pointer, typically the return value of fopen(), so that the response body is written directly to that file stream. Streaming the remote file to disk in this way avoids the memory errors that can occur when a large download is held in memory.
Write the test script index.php in the root directory of the local server localserver.com, with the following contents:
<?php
$url = 'http://remoteserver.com/test.jpg';
$file = './test.jpg';
$fp = fopen($file, 'wb');

// Initialize
$ch = curl_init();

// Set options (CURLOPT_RETURNTRANSFER is omitted: the body goes to the file instead)
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_FILE, $fp);  // File stream for the transfer; the default is STDOUT

// Execute; with CURLOPT_FILE set, curl_exec() returns true or false
$output = curl_exec($ch);
if ($output === false) {
    echo 'CURL error: ' . curl_error($ch) . "\n";
}

// Get the downloaded size
$size_download = curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD);

// Release resources
fclose($fp);
curl_close($ch);

if ($size_download && $size_download == filesize($file)) {
    echo "download succeeded";
} else {
    echo "download failed or incomplete";
}
?>
Visiting localserver.com/index.php in a browser displays the following:
download succeeded
Check the root directory of the local server and you will find the image downloaded from the remote server.
VIII. Example 5: Batch processing
cURL also has a batch handle that allows asynchronous batch processing, similar to "multithreading": open several cURL handles, bind them to one batch handle, and then drive all of the connections together in a loop.
Write the test script index.php in the root directory of the local server localserver.com, with the following contents:
<?php
$urls = array('http://www.baidu.com', 'http://www.qidian.com');
$count = count($urls);
$ch = array();

// Create a batch cURL handle
$mh = curl_multi_init();

// Initialize each cURL handle, set its options, and bind it to the batch handle
for ($i = 0; $i < $count; $i++) {
    $ch[$i] = curl_init();
    curl_setopt($ch[$i], CURLOPT_URL, $urls[$i]);
    curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch[$i], CURLOPT_HEADER, 0);
    curl_setopt($ch[$i], CURLOPT_CONNECTTIMEOUT, 30);
    curl_multi_add_handle($mh, $ch[$i]);
}

// Execute the batch
$running = null;
do {
    usleep(10000);                   // Pause 0.01 seconds (10,000 microseconds) between polls
    curl_multi_exec($mh, $running);  // Run the batch asynchronously, similar to "multithreading"
} while ($running > 0);

// Get the response of each cURL handle
$res = array();
for ($i = 0; $i < $count; $i++) {
    $res[$i] = curl_multi_getcontent($ch[$i]);
}

// Close all handles
for ($i = 0; $i < $count; $i++) {
    curl_multi_remove_handle($mh, $ch[$i]);
    curl_close($ch[$i]);
}
curl_multi_close($mh);

print_r($res);
?>
Visiting localserver.com/index.php in a browser displays the array of responses: the content of the Baidu home page followed by the content of the Qidian home page.
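The fixed usleep() delay in the loop above can be replaced by curl_multi_select(), listed in the function table, which blocks until one of the connections has activity. The following is a minimal sketch of that variant, assuming PHP 7 or later; the helper name fetch_all() and the one-second select timeout are choices made for this illustration.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently.
// Returns the response bodies under the same keys as $urls.
function fetch_all(array $urls, int $connectTimeout = 30): array
{
    $mh = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            // Wait for activity on any connection, at most 1 second
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = array();
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $results;
}

// With no URLs there is nothing to transfer and the result is an empty array
$empty = fetch_all(array());
```

A call such as fetch_all(array('baidu' => 'http://www.baidu.com', 'qidian' => 'http://www.qidian.com')) would return the two pages under the keys 'baidu' and 'qidian'.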