Performance comparison of capturing pictures with file_get_contents and with the cURL functions

Source: Internet
Author: User

Most of the car content in the back end of our company's car site comes from Autohome, and our editorial colleagues had to go to Autohome every day and add cars by hand, which was painful. To change that, my task as the developer was to build a feature that fills in our back-end form automatically once the corresponding Autohome URL is pasted in. The basic form filling is done, but it still could not bring in the car's photo album.

I have written image-capturing code before, but most cars on Autohome have a lot of pictures. At first I planned to reuse my earlier approach: use file_get_contents to fetch the content of the URL, match the image addresses in it, then use file_get_contents again to fetch each image URL and save the images locally. The code is as follows:

<?php
header('Content-type: text/html; charset=utf-8');
set_time_limit(0); // downloading many images can take a long time

// Simple stopwatch that reports elapsed time in milliseconds.
class Runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new Runtime();
$runtime->start();

$url = 'http://car.autohome.com.cn/pic/series-s15306/289.html#pvareaid=102177';
$rs = file_get_contents($url);
// echo $rs; exit;

// Collect the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Visit every picture page and collect the image URLs it contains.
$count = array();
foreach ($avalie as $key => $ul) {
    // The image-matching pattern is missing from the source listing;
    // it has to capture each image URL in group 1.
    $pattern = '/…/';
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn' . $ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Download the images one after another.
$data = array();
foreach ($count as $k => $v) {
    $data[$k] = file_get_contents($v);
}

// Save each image under a timestamp plus a random suffix.
foreach ($data as $k => $v) {
    file_put_contents('./pic2/' . time() . '_' . rand(1, 10000) . '.jpg', $v);
}

$runtime->stop();
echo 'Page execution time: ' . $runtime->spent() . ' ms';
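The image-matching step is easier to follow in isolation. The sketch below is not the author's pattern, which is not legible in the listing; it assumes the images appear as ordinary <img src="..."> tags pointing at .jpg, .jpeg or .png files. The function name extract_image_urls and the example.com URLs are made up for illustration.

```php
<?php
// Hypothetical helper: return the unique image URLs found in an HTML fragment.
// The pattern is an assumption for illustration, not the article's original regex.
function extract_image_urls($html)
{
    // Capture the src attribute of <img> tags that point at .jpg/.jpeg/.png files.
    $pattern = '/<img[^>]+src=["\']([^"\']+\.(?:jpe?g|png))["\']/i';
    if (!preg_match_all($pattern, $html, $matches)) {
        return array(); // no match (0) or regex error (false)
    }
    // Group 1 holds the captured URLs; drop duplicates and reindex from 0.
    return array_values(array_unique($matches[1]));
}

// Placeholder markup with one duplicated image.
$html = '<div><img class="a" src="http://img.example.com/1.jpg">'
      . '<img src="http://img.example.com/2.JPG">'
      . '<img src="http://img.example.com/1.jpg"></div>';
print_r(extract_image_urls($html));
```

Removing duplicates before downloading matters here, because every extra URL costs one more HTTP request in the download loop.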

It turned out that this method is fine when there are few pictures but stalls badly when there are many. It struggled even in local testing, let alone once deployed. After searching on Baidu, I switched to downloading the pictures with cURL. Testing showed a real improvement, but it still felt a bit slow. If only PHP had multithreading...

After some more digging, I found that PHP's cURL library can simulate multithreading through the curl_multi_* family of functions. After the rewrite, the code looks like this:
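The intermediate single-handle cURL version is not shown in the article. As a rough idea of what such a download might look like, here is a minimal sketch; the function name fetch_with_curl, the 5-second connect timeout and the redirect setting are assumptions, not the author's code.

```php
<?php
// Hypothetical helper: download one URL with cURL and return its body,
// or null when the transfer fails or the server does not answer 200.
function fetch_with_curl($url, $timeoutSeconds = 30)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, false);         // body only, no response headers
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects to the final URL
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);     // assumed connect timeout in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status !== 200) {
        return null;
    }
    return $body;
}
```

A caller would loop over the collected image URLs, call fetch_with_curl on each one, and write every non-null result to disk. The transfers still run one after another, which fits the observation that this version remained somewhat slow.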

  

<?php
header('Content-type: text/html; charset=utf-8');
set_time_limit(0); // downloading many images can take a long time

// Simple stopwatch that reports elapsed time in milliseconds.
class Runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new Runtime();
$runtime->start();

$url = 'http://car.autohome.com.cn/pic/series-s15306/289.html#pvareaid=102177';
$rs = file_get_contents($url);

// Collect the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Visit every picture page and collect the image URLs it contains.
$count = array();
foreach ($avalie as $key => $ul) {
    // The image-matching pattern is missing from the source listing;
    // it has to capture each image URL in group 1.
    $pattern = '/…/';
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn' . $ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Register one cURL handle per image on a shared multi handle.
$handle = curl_multi_init();
$curl = array();
foreach ($count as $k => $v) {
    $curl[$k] = curl_init($v);
    curl_setopt($curl[$k], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl[$k], CURLOPT_HEADER, 0);
    curl_setopt($curl[$k], CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($handle, $curl[$k]);
}

// Start all transfers.
$active = null;
do {
    $mrc = curl_multi_exec($handle, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

// Keep driving the transfers until none is active.
while ($active && $mrc == CURLM_OK) {
    // This statement is critical from PHP 5.3 on: without it, curl_multi_select
    // may keep returning -1 and the loop never ends.
    while (curl_multi_exec($handle, $active) === CURLM_CALL_MULTI_PERFORM);
    if (curl_multi_select($handle) != -1) {
        do {
            $mrc = curl_multi_exec($handle, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Collect the bodies of the transfers that reported no error.
$data = array();
foreach ($curl as $k => $v) {
    if (curl_error($curl[$k]) == "") {
        $data[$k] = curl_multi_getcontent($curl[$k]);
    }
    curl_multi_remove_handle($handle, $curl[$k]);
    curl_close($curl[$k]);
}

// Save each image under a timestamp plus a random suffix.
foreach ($data as $k => $v) {
    $file = time() . '_' . rand(1000, 9999) . '.jpg';
    file_put_contents('./pic3/' . $file, $v);
}

curl_multi_close($handle);
$runtime->stop();
echo 'Page execution time: ' . $runtime->spent() . ' ms';
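The same idea can be packaged as a reusable function. The following is a sketch under assumptions rather than a drop-in replacement for the listing: the name fetch_all is hypothetical, the loop uses a one-second curl_multi_select timeout, and each transfer's outcome is read from curl_multi_info_read instead of curl_error.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently and return the bodies
// of the successful transfers, keyed like the input array.
function fetch_all(array $urls, $timeoutSeconds = 30)
{
    $multi = curl_multi_init();
    $handles = array();
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select failed; pause briefly to avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Keep only the transfers that finished with CURLE_OK.
    $bodies = array();
    while (($info = curl_multi_info_read($multi)) !== false) {
        foreach ($handles as $key => $ch) {
            if ($info['handle'] === $ch && $info['result'] === CURLE_OK) {
                $bodies[$key] = curl_multi_getcontent($ch);
            }
        }
    }

    foreach ($handles as $ch) {
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

Calling fetch_all with the array of image URLs returns only the images that downloaded cleanly, so the saving loop does not have to check for failures itself. Note that this sketch checks the transfer result only, not the HTTP status code.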

Well, multi-threaded collection really is satisfying. In a series of 5 comparison tests, the curl_multi version was faster than file_get_contents 4 times, with file_get_contents taking about three times as long. In short, I will try to use this method for collection in the future to improve efficiency.
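A comparison like this can be repeated with a small timing helper. This is only a sketch: time_ms is a hypothetical name, and the two closures are stand-in workloads that sleep instead of downloading, so the printed numbers say nothing about the real routines.

```php
<?php
// Hypothetical helper: run a callable and return the elapsed wall-clock time in milliseconds.
function time_ms(callable $task)
{
    $start = microtime(true); // float seconds since the Unix epoch
    $task();
    return round((microtime(true) - $start) * 1000, 1);
}

// Stand-in workloads; replace them with the two download routines being compared.
$sequential = function () { usleep(30000); };
$concurrent = function () { usleep(10000); };

printf("sequential: %.1f ms\n", time_ms($sequential));
printf("concurrent: %.1f ms\n", time_ms($concurrent));
```

Running each routine several times and comparing the spread is more informative than a single run, since network latency varies from request to request.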
