He Qi:
We have a set of pages backed by a large amount of data. The data is not hot, but the pages still need to be crawled by spiders, and when spider crawling peaks, response times climb sharply.
A predecessor dealt with this as follows: the page is split into 3 parts, each served by its own internal interface, and the entry file uses the curl_multi_* family of functions to fetch the content of the 3 internal interfaces and assemble the page.
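The arrangement described above could be sketched roughly as follows. This is a minimal sketch and not the predecessor's actual code: the function names fetch_parts_parallel and assemble_page, the part names, and the internal URLs in the usage comment are all hypothetical, and a part whose body cannot be read is simply replaced by an empty string.

```php
<?php
/**
 * Fetch several URLs concurrently with curl_multi_* and return the bodies,
 * keyed the same way as the input array. (Hypothetical helper.)
 */
function fetch_parts_parallel(array $urls, int $timeoutSeconds = 5): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $name => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body instead of echoing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($mh, $ch);
        $handles[$name] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($active && $status === CURLM_OK);

    $parts = [];
    foreach ($handles as $name => $ch) {
        $body = curl_multi_getcontent($ch);
        $parts[$name] = is_string($body) ? $body : '';
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $parts;
}

/** Concatenate the fetched parts in a fixed order to form the page. */
function assemble_page(array $parts, array $order): string
{
    $html = '';
    foreach ($order as $name) {
        $html .= $parts[$name] ?? ''; // a missing part contributes nothing
    }
    return $html;
}

// Example usage (hypothetical internal endpoints):
// $parts = fetch_parts_parallel([
//     'part1' => 'http://internal.example/part1',
//     'part2' => 'http://internal.example/part2',
//     'part3' => 'http://internal.example/part3',
// ]);
// echo assemble_page($parts, ['part1', 'part2', 'part3']);
```

Keeping the assembly step separate from the fetching step makes it possible to test the page layout without touching the network.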
I suspected that this approach could itself hurt performance.
So I studied and analyzed it.
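As a rough way to reason about that suspicion: when the 3 interfaces are fetched in parallel, the fetch step cannot finish faster than the slowest interface, while fetching them one after another costs roughly the sum of all three. The sketch below only illustrates that arithmetic and shows one way to time a piece of code. The function names and the sample timings are made up for illustration and are not measurements from the system described here, and the estimates ignore the overhead of the extra internal HTTP hops, which is exactly the part that would need to be measured. It assumes PHP 7.3 or later for hrtime.

```php
<?php
/** Rough lower bound for fetching all parts concurrently: the slowest part. */
function estimate_parallel_ms(array $partTimesMs): float
{
    return $partTimesMs === [] ? 0.0 : (float) max($partTimesMs);
}

/** Rough cost of fetching the parts one after another: the sum. */
function estimate_sequential_ms(array $partTimesMs): float
{
    return (float) array_sum($partTimesMs);
}

/** Time any callable with the monotonic clock; returns [result, elapsed ms]. */
function time_call(callable $fn): array
{
    $start = hrtime(true); // nanoseconds
    $result = $fn();
    return [$result, (hrtime(true) - $start) / 1e6];
}

// Made-up timings for three internal interfaces, in milliseconds.
$times = ['part1' => 120.0, 'part2' => 80.0, 'part3' => 200.0];
printf(
    "parallel ~ %.0f ms, sequential ~ %.0f ms\n",
    estimate_parallel_ms($times),
    estimate_sequential_ms($times)
);
```

Wrapping the entry file's fetch step in time_call during a crawl peak would give real numbers to compare against these estimates.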
After reading the official PHP manual, I summarized the batch call process as follows:
curl_multi_init - Returns a new cURL multi handle, which acts as a container for the individual cURL handles created by curl_init
curl_multi_add_handle - Adds an individual cURL handle to the cURL multi handle
curl_multi_exec - Runs the sub-connections of the current cURL multi handle
curl_multi_getcontent - If CURLOPT_RETURNTRANSFER is set, returns the content fetched by a cURL handle as a string
curl_multi_remove_handle - Removes a handle from the cURL multi handle
curl_multi_close - Closes a set of cURL handles
Here is a code sample taken from the PHP website:
<?php
// Create a pair of cURL resources
$ch1 = curl_init();
$ch2 = curl_init();

// Set the URLs and the appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true); // required for curl_multi_getcontent() below
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true); // required for curl_multi_getcontent() below

// Create the cURL multi handle
$mh = curl_multi_init();

// Add the 2 handles
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);

$active = null;
// Execute the multi handle
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Read the data
$content1 = curl_multi_getcontent($ch1);
$content2 = curl_multi_getcontent($ch2);

// Close all handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
?>
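The sample above does not check whether each transfer actually succeeded. One way to do that is curl_multi_info_read, which belongs to the same function family although it is not in the summary list above. The following is a minimal sketch: the helper name collect_transfer_errors is hypothetical, and it assumes it is called after the exec loop has finished and before the handles are removed from the multi handle.

```php
<?php
/**
 * Drain the multi handle's message queue and return an error message for
 * every transfer that did not finish with CURLE_OK, keyed by effective URL.
 * (Hypothetical helper for illustration.)
 */
function collect_transfer_errors($mh): array
{
    $errors = [];
    while (($info = curl_multi_info_read($mh)) !== false) {
        // 'result' holds a CURLE_* code for the finished transfer.
        if ($info['msg'] === CURLMSG_DONE && $info['result'] !== CURLE_OK) {
            $url = curl_getinfo($info['handle'], CURLINFO_EFFECTIVE_URL);
            $errors[$url] = curl_strerror($info['result']);
        }
    }
    return $errors;
}

// Example usage, placed right after the exec loop:
// $errors = collect_transfer_errors($mh);
// if ($errors !== []) {
//     error_log(print_r($errors, true));
// }
```

With a check like this, a slow or failing internal interface shows up in the log instead of silently producing an incomplete page.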
I had some doubts along the way, and have roughly worked out the following points:
Analysis of PHP curl_multi_* series functions for bulk HTTP requests