PHP cURL "Multithreading" (curl_multi): Principles and a Detailed Example

This article introduces how curl "multithreading" works in PHP, with examples. Corrections are welcome.
Many people struggle with the curl_multi family of functions, which the PHP manual describes only vaguely. The documentation is sparse, and the examples it gives are too simple to learn from. I searched many web pages and never found a complete, practical example.
curl_multi_add_handle
curl_multi_close
curl_multi_exec
curl_multi_getcontent
curl_multi_info_read
curl_multi_init
curl_multi_remove_handle
curl_multi_select
In general, the reason to reach for these functions is clear: you want to request multiple URLs at the same time rather than one after another. Otherwise it would be simpler to call curl_exec in a loop.
The steps are as follows:
Step 1: Call curl_multi_init.
Step 2: Call curl_multi_add_handle in a loop.
Note that the second parameter of curl_multi_add_handle is a sub-handle created by curl_init.
Step 3: Call curl_multi_exec repeatedly until all transfers have finished.
Step 4: Call curl_multi_getcontent in a loop, as needed, to get the results.
Step 5: Call curl_multi_remove_handle, and call curl_close for each sub-handle.
Step 6: Call curl_multi_close.
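The six steps can be sketched as one small function. This is a minimal sketch and not code from the original article: the function name fetch_all and the option values are assumptions chosen for illustration, and the waiting loop follows the simpler form shown in the current PHP manual.

```php
<?php
// Hypothetical helper (not from the original article): fetch several URLs
// concurrently and return their bodies, keyed like the input array.
function fetch_all(array $urls): array
{
    // Step 1: create the multi handle.
    $mh = curl_multi_init();

    // Step 2: create one sub-handle per URL with curl_init() and add it.
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, false);
        // Needed so curl_multi_getcontent() can return the body later.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Step 3: drive the transfers, waiting in curl_multi_select() while idle.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh, 1.0);
        }
    } while ($active && $status === CURLM_OK);

    // Steps 4 and 5: collect each result, then detach and close the sub-handle.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch); // has no effect on PHP 8.0 and later
    }

    // Step 6: close the multi handle.
    curl_multi_close($mh);

    return $results;
}
```

A call such as fetch_all(['home' => 'http://www.php.net/']) would return an array with the same keys and the response bodies as values. The sketch does no error checking, so a failed request simply yields an empty or partial body.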
Here is the example from the PHP manual:
<?php
// Create a pair of curl resources
$ch1 = curl_init();
$ch2 = curl_init();

// Set the URL and the appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);

// Create a batch curl handle
$mh = curl_multi_init();

// Add the 2 handles
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);

$active = null;
// Execute the batch handle
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Close all handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
?>

That is more or less the whole process. The fatal weakness to watch for is in the do loop: if curl_multi_exec is simply called over and over until every request has finished, the loop spins for the entire duration of the requests and can easily push CPU usage to 100%.
The remedy is curl_multi_select, a function with almost no documentation. The C curl library does describe select, but the PHP interface and usage differ from the C ones.
The do part should be written as follows, which is the same waiting loop the manual example above uses:
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active and $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

Because $active only becomes false once all the URL data has been received, the return value of curl_multi_exec is used to decide whether there is still data to process right now. While there is, curl_multi_exec is called again; when there is not, the loop enters the select phase and is woken up to continue as soon as new data arrives. The advantage is that no CPU time is wasted while waiting.
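The waiting loop can also be wrapped in a helper. The following is a sketch under two assumptions that are not stated in the original text. First, libcurl 7.20.0 and later no longer return CURLM_CALL_MULTI_PERFORM, so a single call to curl_multi_exec per iteration is enough. Second, on some PHP and libcurl combinations curl_multi_select is reported to return -1 immediately, which would make the loop above spin without ever calling curl_multi_exec again; the sketch pauses briefly in that case instead. The name run_multi and the 1 ms pause are illustrative choices.

```php
<?php
// Hypothetical helper: drive every transfer on $mh to completion without
// spinning. $selectTimeout is the longest single wait, in seconds.
function run_multi($mh, float $selectTimeout = 1.0): int
{
    do {
        // Let curl make whatever progress is possible right now.
        $status = curl_multi_exec($mh, $active);

        if ($active && $status === CURLM_OK) {
            // Block until a connection has activity or the timeout expires.
            if (curl_multi_select($mh, $selectTimeout) === -1) {
                // Nothing to wait on yet: pause briefly instead of busy-looping.
                usleep(1000);
            }
        }
    } while ($active && $status === CURLM_OK);

    // CURLM_OK unless the multi interface itself reported an error;
    // individual transfers may still have failed.
    return $status;
}
```

After curl_multi_add_handle has been called for each sub-handle, run_multi($mh) replaces both loops from the example.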
In addition, a few details come up from time to time:
To control the timeout of each request, call curl_setopt before curl_multi_add_handle:
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
To find out whether a request timed out or hit another error, call this before curl_multi_getcontent: curl_error($conn[$i]);
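Both details can be combined in one sketch. It assumes a hypothetical function name, fetch_checked, and a result layout invented for illustration. It also reads each transfer's result code through curl_multi_info_read and turns it into a message with curl_strerror, because on at least some PHP versions curl_error on a sub-handle appears to stay empty until curl_multi_info_read has reported that handle. Treat that as an assumption to verify on your own PHP version.

```php
<?php
// Hypothetical helper: fetch URLs concurrently with a per-request timeout and
// report, for each key, either the body or the curl error message.
function fetch_checked(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Set the timeout (in seconds) before the handle joins the multi handle.
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh, 1.0);
        }
    } while ($active && $status === CURLM_OK);

    // Each finished transfer leaves one message carrying its CURLE_* code.
    $codes = [];
    while ($info = curl_multi_info_read($mh)) {
        $key = array_search($info['handle'], $handles, true);
        $codes[$key] = $info['result'];
    }

    $results = [];
    foreach ($handles as $key => $ch) {
        $code = $codes[$key] ?? -1; // -1: no completion message was seen
        if ($code === CURLE_OK) {
            // Only read the content when the transfer succeeded.
            $results[$key] = ['error' => null, 'body' => curl_multi_getcontent($ch)];
        } else {
            $message = $code === -1 ? 'transfer did not finish' : curl_strerror($code);
            $results[$key] = ['error' => $message, 'body' => null];
        }
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch); // has no effect on PHP 8.0 and later
    }
    curl_multi_close($mh);

    return $results;
}
```

A request that exceeds $timeout ends up with a non-null 'error' entry and a null 'body', so the caller can decide whether to retry it.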

Features of this class:
Runs very stably.
Once a concurrency level is set, it is always maintained, even while tasks are being added through callback functions.
CPU consumption is very low; most CPU time is spent in the user's callback functions.
Memory is used efficiently. When the number of tasks is large (150,000 tasks occupy more than 256 MB of memory), tasks can be added through a callback instead, in batches of whatever size you choose.
Bandwidth can be used to the fullest.
Tasks can be chained: a task that needs to collect data from several different addresses can be completed in one go through callbacks.
curl errors can be retried a configurable number of times (high concurrency tends to produce curl errors at first, and network conditions or an unstable remote server can cause them too).
Callbacks are quite flexible, and several types of task can run at the same time (for example, downloading files, crawling web pages, and checking for 404s can all happen simultaneously in one PHP process).
Task types are easy to customize, for example checking for 404s or getting the final URL of a redirect.
A cache can be configured.
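The fixed concurrency level and the chained tasks described above can be illustrated with a rolling window. The following is only a sketch of that general technique, not the class the list describes: the function name fetch_rolling, the callback signature, and the convention that the callback returns further URLs to enqueue are all assumptions made for this example. Retries and caching are left out.

```php
<?php
// Hypothetical rolling-window fetcher (not the class described above).
// Keeps at most $concurrency transfers in flight. $onDone is called as
// $onDone(string $url, ?string $body, int $curlCode) and may return an
// array of further URLs to enqueue.
function fetch_rolling(array $urls, int $concurrency, callable $onDone): void
{
    $mh = curl_multi_init();
    $queue = array_values($urls);
    $limit = max(1, $concurrency);
    $inFlight = 0;

    // Start queued URLs until the window is full or the queue is empty.
    $fill = function () use ($mh, &$queue, &$inFlight, $limit): void {
        while ($inFlight < $limit && $queue) {
            $url = array_shift($queue);
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            // Remember the original URL on the handle itself.
            curl_setopt($ch, CURLOPT_PRIVATE, $url);
            curl_multi_add_handle($mh, $ch);
            $inFlight++;
        }
    };

    $fill();
    while ($inFlight > 0) {
        $status = curl_multi_exec($mh, $active);
        if ($status !== CURLM_OK) {
            break;
        }
        // Handle every transfer that has finished so far.
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $body = $info['result'] === CURLE_OK ? curl_multi_getcontent($ch) : null;
            $more = $onDone($url, $body, $info['result']);
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch); // has no effect on PHP 8.0 and later
            $inFlight--;
            // Chained tasks: enqueue whatever the callback asked for.
            foreach ((array) $more as $next) {
                $queue[] = $next;
            }
        }
        $fill(); // top the window back up to the concurrency limit
        if ($inFlight > 0) {
            curl_multi_select($mh, 1.0);
        }
    }
    curl_multi_close($mh);
}
```

Because new URLs are started as soon as earlier ones finish, only a window of handles exists at any moment, which is one way to keep memory use bounded when the task list is very large.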
Limitations:
It cannot take full advantage of multi-core CPUs (this can be worked around with multiple processes, at the cost of handling logic such as task splitting).
Maximum concurrency is 500 (or possibly 512). Testing suggests this is an internal limit of curl; beyond it, requests always return failures.
Resuming interrupted downloads is not currently supported.
Tasks are currently atomic: a large file cannot be split into several parts downloaded by separate threads.
