This article introduces PHP's curl_multi functions, the principles behind them, and a practical usage pattern.
Many people find the curl_multi family of functions frustrating: the PHP manual documents them only sparsely, the examples it gives are too simple to learn much from, and after searching many web pages I could not find a single complete application example.
curl_multi_add_handle
curl_multi_close
curl_multi_exec
curl_multi_getcontent
curl_multi_info_read
curl_multi_init
curl_multi_remove_handle
curl_multi_select
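Of these functions, curl_multi_info_read gets no example in this article, so here is a minimal sketch (the URL is a placeholder and the 10-second timeout is an arbitrary choice) of how it reports finished transfers:

```php
<?php
// Minimal sketch: curl_multi_info_read pops one message per finished
// transfer; $info['result'] is its CURLE_* result code (0 means success).
$mh = curl_multi_init();
$ch = curl_init("http://www.php.net/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10); // arbitrary timeout for the sketch
curl_multi_add_handle($mh, $ch);

$results = array();
$active = null;
do {
    curl_multi_exec($mh, $active);
    // drain the message queue: one entry per completed transfer
    while ($info = curl_multi_info_read($mh)) {
        $results[] = $info['result'];
    }
    if ($active) {
        curl_multi_select($mh); // sleep until there is activity
    }
} while ($active);

curl_multi_remove_handle($mh, $ch);
curl_close($ch);
curl_multi_close($mh);
```

Even a failed transfer (timeout, DNS error) produces a message here, which makes this the natural place to check per-request results.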
Generally, you reach for these functions when you need to request multiple URLs concurrently rather than one after another; otherwise you might as well call curl_exec in a loop.
The steps are summarized as follows:
Step 1: Call curl_multi_init
Step 2: Call curl_multi_add_handle in a loop
Note that the second parameter of curl_multi_add_handle is a sub-handle created by curl_init.
Step 3: Keep calling curl_multi_exec
Step 4: Call curl_multi_getcontent in a loop to collect the results as needed
Step 5: Call curl_multi_remove_handle for each handle, then curl_close each of them
Step 6: Call curl_multi_close
Here is an example from the PHP manual:
<?php
// create two cURL resources
$ch1 = curl_init();
$ch2 = curl_init();
// set the URL and corresponding options
curl_setopt($ch1, CURLOPT_URL, "http://www.jb51.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);
// create the cURL multi handle for batch processing
$mh = curl_multi_init();
// add the two handles
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);
$active = null;
// execute the batch handle
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($active);
// close all handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
?>
The overall flow is almost always like this. However, this simple code has a fatal weakness: the do loop busy-waits for the entire duration of the requests, which easily drives CPU usage to 100%.
Now let's improve it using curl_multi_select, a function with almost no documentation. Although the C curl library documents its select mechanism, the interface and usage in PHP do differ from C's.
Change the do section above to the following:
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}
Because $active only becomes false after all URL data has been received, the return value of curl_multi_exec tells us whether there is still data to process. While there is data, curl_multi_exec is called repeatedly; when no data is available for the moment, the code enters the select phase, which wakes up as soon as new data arrives. The advantage is that no CPU cycles are wasted on spinning.
In addition, there are a few details you may run into:
To control the timeout of each request, call curl_setopt before curl_multi_add_handle:
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
To determine whether a request timed out or failed with some other error, call curl_error($conn[$i]) before curl_multi_getcontent.
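Putting both details together, a hedged sketch (the URL and the 1-second timeout are arbitrary choices for illustration):

```php
<?php
// Sketch: set a per-request timeout before adding the handle,
// and check curl_error() before trusting curl_multi_getcontent().
$timeout = 1; // seconds; arbitrary for this example
$mh = curl_multi_init();
$ch = curl_init("http://www.php.net/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // must happen before add_handle
curl_multi_add_handle($mh, $ch);

$active = null;
do {
    curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh);
    }
} while ($active);

// inspect the error *before* using the content
$err = curl_error($ch);
if ($err === '') {
    $content = curl_multi_getcontent($ch);
    echo "got " . strlen($content) . " bytes\n";
} else {
    echo "request failed: $err\n"; // e.g. a timeout
}

curl_multi_remove_handle($mh, $ch);
curl_close($ch);
curl_multi_close($mh);
```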
Features of this class:
It runs very stably.
Setting a concurrency level always keeps exactly that number of tasks running, even when tasks are added through the callback function.
CPU usage is extremely low; most of the CPU time is spent in the user's callback function.
Memory usage is high when the task count is large (150,000 queued tasks can occupy more than 256 MB), but you can add tasks through the callback function and cap the queue size yourself.
Bandwidth is used to the fullest.
Tasks can be chained: for example, when one task needs to collect data from several different addresses, the callback can issue the follow-up requests in one go.
Failed cURL requests can be retried a configurable number of times (cURL errors occur easily when high concurrency first ramps up, and can also be caused by network conditions or the stability of the remote server).
The callback function is quite flexible, and several task types can run in the same PHP process at once (for example downloading files, scraping web pages, and checking for 404s simultaneously).
Task types are easy to customize, such as checking for 404s or obtaining the final URL after redirects.
Caching can be configured.
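The class itself is not shown in this article, but the fixed-concurrency behaviour described above can be sketched roughly as follows. The function name fetch_all, the window size, and the $on_done callback are all invented for illustration, not part of the author's class:

```php
<?php
// Rough sketch of a fixed-size rolling window over curl_multi:
// whenever a transfer finishes, a queued URL takes its slot, so the
// number of in-flight requests stays constant.
function fetch_all(array $urls, $window, callable $on_done)
{
    $mh = curl_multi_init();
    $queue = $urls;
    $running = 0;

    $add = function () use (&$queue, &$running, $mh) {
        $url = array_shift($queue);
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10); // arbitrary timeout
        curl_multi_add_handle($mh, $ch);
        $running++;
    };

    // fill the window
    while ($running < $window && $queue) {
        $add();
    }

    $active = null;
    do {
        curl_multi_exec($mh, $active);
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            // hand the finished transfer to the user's callback
            $on_done($ch, curl_multi_getcontent($ch), $info['result']);
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
            $running--;
            // top the window back up, keeping concurrency constant
            if ($queue) {
                $add();
            }
        }
        if ($active) {
            curl_multi_select($mh);
        }
    } while ($active || $queue || $running);

    curl_multi_close($mh);
}
```

A call such as fetch_all($urls, 10, function ($ch, $body, $result) { /* handle one result */ }); would keep ten requests in flight until the queue drains.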
Disadvantages:
It cannot take full advantage of multi-core CPUs (you can work around this with multiple processes, but then you must handle the task-splitting logic yourself).
The maximum concurrency is 500 (or possibly 512?); testing suggests this is an internal cURL limit, and exceeding it makes requests fail consistently.
Resumable transfers are not currently supported.
Tasks are currently atomic: a large file cannot be split into several parts for separate threads to download.