This article shows how PHP's curl multi functions can be used to issue requests concurrently and reduce back-end access time. It should be a useful reference for anyone who needs it.
First, let's look at the curl multi functions in PHP:
# curl_multi_add_handle
# curl_multi_close
# curl_multi_exec
# curl_multi_getcontent
# curl_multi_info_read
# curl_multi_init
# curl_multi_remove_handle
# curl_multi_select
In general, when you reach for these functions, the goal is clearly to request multiple URLs at the same time rather than one after another; otherwise you might as well just loop over curl_exec.
The steps are summarized as follows:
Step 1: Call curl_multi_init.
Step 2: Loop, calling curl_multi_add_handle.
Note that the second parameter of curl_multi_add_handle is a sub-handle obtained from curl_init.
Step 3: Keep calling curl_multi_exec.
Step 4: Loop, calling curl_multi_getcontent as needed to fetch the results.
Step 5: For each sub-handle, call curl_multi_remove_handle and then curl_close.
Step 6: Call curl_multi_close.
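The six steps above can be sketched as a small helper function (the name multi_get is illustrative, not from any library; this version deliberately keeps the naive busy-wait loop in step 3, which is exactly the weakness discussed next):

```php
<?php
// Sketch of the six-step curl_multi lifecycle described above.
// Takes an array of URL strings, returns an array of response bodies.
function multi_get(array $urls)
{
    // Step 1: create the multi handle
    $mh = curl_multi_init();

    // Step 2: create one sub-handle per URL and add it to the multi handle
    $handles = array();
    foreach ($urls as $i => $url) {
        $handles[$i] = curl_init($url);
        curl_setopt($handles[$i], CURLOPT_RETURNTRANSFER, 1);
        curl_multi_add_handle($mh, $handles[$i]);
    }

    // Step 3: drive all transfers until none are active
    // (a busy-wait; the select()-based loop below in the article avoids this)
    $active = 0;
    do {
        curl_multi_exec($mh, $active);
    } while ($active > 0);

    // Step 4: collect the results
    $results = array();
    foreach ($handles as $i => $ch) {
        $results[$i] = curl_multi_getcontent($ch);
    }

    // Steps 5 and 6: remove and close each sub-handle, then the multi handle
    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```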
Here is what the author calls a quick and dirty example (I'll explain why it's dirty later):
/* Here's a quick and dirty example for curl_multi in PHP, tested on PHP 5.0.0RC1 CLI / FreeBSD 5.2.1 */
$connomains = array(
    "http://www.cnn.com/",
    "http://www.canada.com/",
    "http://www.yahoo.com/"
);

$mh = curl_multi_init();
foreach ($connomains as $i => $url) {
    $conn[$i] = curl_init($url);
    curl_setopt($conn[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($mh, $conn[$i]);
}

do {
    $n = curl_multi_exec($mh, $active);
} while ($active);

foreach ($connomains as $i => $url) {
    $res[$i] = curl_multi_getcontent($conn[$i]);
    curl_close($conn[$i]);
}
print_r($res);
The overall flow is roughly that, but this simple code has a fatal weakness: for the entire duration of the requests, the do loop spins in a busy loop, which can easily drive CPU usage to 100%.
Now let's improve it. There is a function, curl_multi_select, that is almost undocumented; although C's curl library does document select, the interface and usage in PHP differ from C's.
Change the do portion of the code above to the following:
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}
Because $active does not become false until all of the URL data has finished arriving, the return value of curl_multi_exec is used here to decide whether there is still data to process: while there is, curl_multi_exec is called repeatedly; when there is none, execution enters the select phase, where it sleeps until new data arrives and wakes it up. The advantage is that no CPU is wasted.
In addition, there are a few details you may run into:
Control the time-out of each request with curl_setopt before curl_multi_add_handle:
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
Check for a time-out or other error before curl_multi_getcontent: curl_error($conn[$i]);
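Putting those two details together, here is a quick sketch (the URLs and the 2-second timeout are illustrative values, and the simple busy-wait loop is kept for brevity):

```php
<?php
// Sketch: per-request time-outs set BEFORE curl_multi_add_handle(),
// and curl_error() checked BEFORE curl_multi_getcontent().
// URLs and the 2-second timeout are illustrative.
$urls    = array('http://www.example.com/', 'http://www.example.org/');
$timeout = 2;

$mh   = curl_multi_init();
$conn = array();
foreach ($urls as $i => $url) {
    $conn[$i] = curl_init($url);
    curl_setopt($conn[$i], CURLOPT_RETURNTRANSFER, 1);
    // set the time-out on the sub-handle before adding it to the multi handle
    curl_setopt($conn[$i], CURLOPT_TIMEOUT, $timeout);
    curl_multi_add_handle($mh, $conn[$i]);
}

$active = 0;
do {
    curl_multi_exec($mh, $active);
} while ($active > 0);

$res = array();
foreach ($urls as $i => $url) {
    // check for a time-out or other error before reading the content
    $err = curl_error($conn[$i]);
    if ($err === '') {
        $res[$i] = curl_multi_getcontent($conn[$i]);
    } else {
        $res[$i] = false; // e.g. "Operation timed out after ..."
    }
    curl_multi_remove_handle($mh, $conn[$i]);
    curl_close($conn[$i]);
}
curl_multi_close($mh);
```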
Here I simply used the dirty example above (it was sufficient; CPU usage never reached 100%).
I simulated concurrency against a "kandian.com" interface that reads and writes data in memcache. For confidentiality reasons, the relevant data and results are not posted.
I ran the simulation 3 times: the first with 10 threads each making 1000 simultaneous requests, the second with 100 threads each making 1000 simultaneous requests, and the third with 1000 threads each making 100 simultaneous requests (already quite laborious; I did not dare configure more than 1000 threads).
It seems that simulating concurrency with curl multi still has some limitations.
It is also possible that thread-scheduling delays introduced large errors into the results. Comparing the data shows that the time spent initializing and setting up handles is small; the difference lies in the get phase, so that factor is easy to rule out.
Normally, curl in PHP is blocking: after creating a curl request, you must wait for it to succeed or time out before you can issue the next one. The curl_multi_* family of functions makes concurrent access possible. The PHP documentation for these functions is not very detailed, so usage is shown below:
$requests = array('http://www.baidu.com', 'http://www.google.com');
$main = curl_multi_init();
$results = array();
$errors = array();
$info = array();
$count = count($requests);
for ($i = 0; $i < $count; $i++) {
    $handles[$i] = curl_init($requests[$i]);
    var_dump($requests[$i]);
    curl_setopt($handles[$i], CURLOPT_URL, $requests[$i]);
    curl_setopt($handles[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($main, $handles[$i]);
}
$running = 0;
do {
    curl_multi_exec($main, $running);
} while ($running > 0);
for ($i = 0; $i < $count; $i++) {
    $results[] = curl_multi_getcontent($handles[$i]);
    $errors[] = curl_error($handles[$i]);
    $info[] = curl_getinfo($handles[$i]);
    curl_multi_remove_handle($main, $handles[$i]);
}
curl_multi_close($main);
var_dump($results);
var_dump($errors);
var_dump($info);
Our everyday programs inevitably need to access several interfaces at once. Normally when we use curl we access them one at a time, sequentially: if there are 3 interfaces and each takes 500 milliseconds, the three together cost 1500 milliseconds. That is a real headache and seriously hurts page speed. Is there any way to access them concurrently and improve speed? Here I'll briefly show how to use curl concurrency to speed up page access; I hope you'll offer further guidance.
1. Old-style sequential curl access and time-consuming statistics
<?php
function curl_fetch($url, $timeout = 3) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($ch);
    $errno = curl_errno($ch);
    if ($errno > 0) {
        $data = false;
    }
    curl_close($ch);
    return $data;
}

function microtime_float() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}

$url_arr = array(
    "taobao" => "http://www.taobao.com",
    "sohu"   => "http://www.sohu.com",
    "sina"   => "http://www.sina.com.cn",
);

$time_start = microtime_float();
$data = array();
foreach ($url_arr as $key => $val) {
    $data[$key] = curl_fetch($val);
}
$time_end = microtime_float();
$time = $time_end - $time_start;
echo "Time-consuming: {$time}";
Time: 0.614 seconds
2. curl concurrent access and time-consuming statistics
<?php
function curl_multi_fetch($urlarr = array()) {
    $result = $res = $ch = array();
    $nch = 0;
    $mh = curl_multi_init();
    foreach ($urlarr as $nk => $url) {
        $timeout = 2;
        $ch[$nch] = curl_init();
        curl_setopt_array($ch[$nch], array(
            CURLOPT_URL            => $url,
            CURLOPT_HEADER         => false,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => $timeout,
        ));
        curl_multi_add_handle($mh, $ch[$nch]);
        ++$nch;
    }
    /* wait for performing request */
    do {
        $mrc = curl_multi_exec($mh, $running);
    } while (CURLM_CALL_MULTI_PERFORM == $mrc);

    while ($running && $mrc == CURLM_OK) {
        // wait for network
        if (curl_multi_select($mh, 0.5) > -1) {
            // pull in new data
            do {
                $mrc = curl_multi_exec($mh, $running);
            } while (CURLM_CALL_MULTI_PERFORM == $mrc);
        }
    }
    if ($mrc != CURLM_OK) {
        error_log("CURL Data Error");
    }
    /* get data */
    $nch = 0;
    foreach ($urlarr as $moudle => $node) {
        if (($err = curl_error($ch[$nch])) === '') {
            $res[$nch] = curl_multi_getcontent($ch[$nch]);
            $result[$moudle] = $res[$nch];
        } else {
            error_log("curl error");
        }
        curl_multi_remove_handle($mh, $ch[$nch]);
        curl_close($ch[$nch]);
        ++$nch;
    }
    curl_multi_close($mh);
    return $result;
}

$url_arr = array(
    "taobao" => "http://www.taobao.com",
    "sohu"   => "http://www.sohu.com",
    "sina"   => "http://www.sina.com.cn",
);

function microtime_float() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}

$time_start = microtime_float();
$data = curl_multi_fetch($url_arr);
$time_end = microtime_float();
$time = $time_end - $time_start;
echo "Time-consuming: {$time}";
?>
Time: 0.316 seconds
Excellent: the time the page spends accessing the back-end interfaces is cut in half.
3. Curl Related Parameters
curl_close - close a cURL session
curl_copy_handle - copy a cURL handle along with all of its preferences
curl_errno - return the last error number
curl_error - return a string containing the last error for the current session
curl_exec - perform a cURL session
curl_getinfo - get information regarding a specific transfer
curl_init - initialize a cURL session
curl_multi_add_handle - add a normal cURL handle to a cURL multi handle
curl_multi_close - close a set of cURL handles
curl_multi_exec - run the sub-connections of the current cURL handle
curl_multi_getcontent - return the content of a cURL handle if CURLOPT_RETURNTRANSFER is set
curl_multi_info_read - get information about the current transfers
curl_multi_init - returns a new cURL multi handle
curl_multi_remove_handle - remove a multi handle from a set of cURL handles
curl_multi_select - wait for activity on any curl_multi connection
curl_setopt_array - set multiple options for a cURL transfer
curl_setopt - set an option for a cURL transfer
curl_version - gets cURL version information
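Of these, curl_multi_info_read is the one none of the examples above used; here is a rough sketch of how it can report how each transfer finished (the URLs and the 2-second timeout are illustrative, and the busy-wait loop is kept for brevity):

```php
<?php
// Sketch: using curl_multi_info_read() to learn how each transfer ended.
// Each message's 'result' member is a CURLE_* code (CURLE_OK on success)
// and 'handle' is the sub-handle it belongs to.
$urls = array('http://www.example.com/', 'http://www.example.org/');

$mh = curl_multi_init();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);
    curl_multi_add_handle($mh, $ch);
}

$active = 0;
$codes  = array();
do {
    curl_multi_exec($mh, $active);
    // drain finished-transfer messages as they arrive
    while ($msg = curl_multi_info_read($mh)) {
        $codes[] = $msg['result']; // e.g. CURLE_OK, CURLE_OPERATION_TIMEDOUT
        curl_multi_remove_handle($mh, $msg['handle']);
        curl_close($msg['handle']);
    }
} while ($active > 0);
curl_multi_close($mh);
```

Each completed transfer produces exactly one such message, so this is a convenient place to do per-request cleanup.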