Performance Comparison between Using file_get_contents Functions and Using curl Functions to Collect Images

Source: Internet
Author: User
This article compares the performance of the file_get_contents family of functions with the curl family when collecting images. Most of the car data in the back end of my company's car website comes from Autohome (car.autohome.com.cn), and our editors had to copy each car over by hand every day, which was painful. As the developer, my task was to change that by writing a feature where you only paste an Autohome URL and the form in our back end is filled in automatically. The basic fields are now filled in, but the corresponding car photo album could not yet be collected.

I had written image collection code before, but most cars on Autohome have many pictures. At first I planned to reuse my earlier approach: use file_get_contents to fetch the page at the URL, match the image addresses, then use file_get_contents again to fetch each image and save it locally. The code is as follows:

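<?php
// Simple stopwatch. The opening of this class was cut off in the source;
// the property declarations and get_microtime() are reconstructed here on
// the assumption that it simply wraps microtime(true).
class runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        return microtime(true);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    // Elapsed time in milliseconds, rounded to one decimal place.
    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new runtime();
$runtime->start();

$url = 'http://…'; // URL truncated in the source
$rs = file_get_contents($url);
// echo $rs; exit;

// Find the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Visit each picture page and collect the image URLs it contains.
$count = array();
foreach ($avalie as $key => $ul) {
    $pattern = '/…/'; // pattern lost in the source; it captured the image URL in group 1
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn' . $ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Download the images one after another.
$data = array();
foreach ($count as $k => $v) {
    $data[$k] = file_get_contents($v);
}

foreach ($data as $k => $v) {
    // The upper bound of rand() was garbled in the source.
    file_put_contents('./pic2/' . time() . '_' . rand(1, …) . '.jpg', $v);
}

$runtime->stop();
echo "Page execution time: " . $runtime->spent() . " milliseconds";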
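The image-matching pattern in the listing above did not survive in the source. Purely as an illustration of that step, and not a reconstruction of the original pattern, here is a minimal sketch that pulls the src attributes out of an HTML string with DOMDocument. The function name extract_image_urls and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: collect the unique src attributes of all <img> tags.
function extract_image_urls(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $urls[] = $src;
        }
    }
    // array_values() re-indexes after array_unique() drops duplicates.
    return array_values(array_unique($urls));
}

// Made-up sample markup, not taken from the real site.
$html = '<div><img src="http://example.com/a.jpg"><img src="http://example.com/b.jpg">'
      . '<img src="http://example.com/a.jpg"></div>';
print_r(extract_image_urls($html));
```

A DOM parser is usually more tolerant of attribute order and quoting than a hand-written regular expression, at the cost of parsing the whole page.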

The results show that this approach is acceptable when there are only a few images, but slow when there are many. It was hard to get through even in local testing, let alone once deployed. After searching on Baidu, I switched to curl for downloading the images. Testing showed an improvement, but it still felt a little slow. If only PHP had multithreading...
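The single-handle curl version mentioned above is not shown in the article. A minimal sketch of what such a version might look like, assuming one curl handle is reused for all downloads, follows. The function name download_sequential and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: fetch each URL in turn, reusing one curl handle so
// that connections to the same host can be kept alive between requests.
function download_sequential(array $urls, int $timeout = 30): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, false);        // body only, no response headers
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $data = [];
    foreach ($urls as $key => $url) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $body = curl_exec($ch);
        if ($body !== false) { // skip transfers that failed
            $data[$key] = $body;
        }
    }
    return $data; // the handle is released when $ch goes out of scope
}
```

Called as download_sequential($count), it would return the image bodies keyed like the input array. The requests still run one after another, which is consistent with the limited speed-up described above.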

After some digging, I found that PHP's curl library can simulate multithreading through the curl_multi_* family of functions, which run several transfers concurrently. After the rewrite, the code looks like this:

  

<?php
// Simple stopwatch. The opening of this class was cut off in the source;
// the property declarations and get_microtime() are reconstructed here on
// the assumption that it simply wraps microtime(true).
class runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        return microtime(true);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    // Elapsed time in milliseconds, rounded to one decimal place.
    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new runtime();
$runtime->start();

$url = 'http://car.autohome.com.cn/Pic/series-s15306/289.html#pvareaid=102177';
$rs = file_get_contents($url);

// Find the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Visit each picture page and collect the image URLs it contains.
$count = array();
foreach ($avalie as $key => $ul) {
    $pattern = '/…/'; // pattern lost in the source; it captured the image URL in group 1
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn' . $ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Register one easy handle per image on a shared multi handle.
$handle = curl_multi_init();
$curl = array();
foreach ($count as $k => $v) {
    $curl[$k] = curl_init($v);
    curl_setopt($curl[$k], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl[$k], CURLOPT_HEADER, 0);
    curl_setopt($curl[$k], CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($handle, $curl[$k]);
}

// Start the transfers.
$active = null;
do {
    $mrc = curl_multi_exec($handle, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

// Keep driving the transfers until none are still running.
while ($active && $mrc == CURLM_OK) {
    // This line is critical on PHP 5.3 and later: without it,
    // curl_multi_select() may keep returning -1 and the loop never ends.
    while (curl_multi_exec($handle, $active) === CURLM_CALL_MULTI_PERFORM);

    if (curl_multi_select($handle) != -1) {
        do {
            $mrc = curl_multi_exec($handle, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Collect the downloaded bodies and release the handles.
$data = array();
foreach ($curl as $k => $v) {
    if (curl_error($curl[$k]) == "") {
        $data[$k] = curl_multi_getcontent($curl[$k]);
    }
    curl_multi_remove_handle($handle, $curl[$k]);
    curl_close($curl[$k]);
}

foreach ($data as $k => $v) {
    // The upper bound of rand() was garbled in the source.
    $file = time() . '_' . rand(1000, …) . '.jpg';
    file_put_contents('./pic3/' . $file, $v);
}

curl_multi_close($handle);
$runtime->stop();
echo "Page execution time: " . $runtime->spent() . " milliseconds";
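The same idea can be packaged as a reusable function. The following is a minimal sketch under stated assumptions, not the author's code: the name download_concurrent is hypothetical, it waits in curl_multi_select() with a one-second timeout, and it reads each transfer's result code from curl_multi_info_read() rather than from curl_error().

```php
<?php
// Hypothetical helper: download all URLs concurrently with curl_multi and
// return the bodies of the successful transfers, keyed like the input.
function download_concurrent(array $urls, int $timeout = 30): array
{
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select failed; back off briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    // curl_multi_info_read() reports the result code of each finished transfer.
    $failed = [];
    while (($info = curl_multi_info_read($multi)) !== false) {
        if ($info['result'] !== CURLE_OK) {
            $failed[] = $info['handle'];
        }
    }

    $data = [];
    foreach ($handles as $key => $ch) {
        if (!in_array($ch, $failed, true)) {
            $data[$key] = curl_multi_getcontent($ch);
        }
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $data;
}
```

Note that curl_multi runs the transfers concurrently inside a single PHP thread. The speed-up comes from overlapping network waits, not from parallel execution of PHP code.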

The concurrent version is noticeably better. Across five comparison runs, the curl_multi version was faster than file_get_contents in four, with file_get_contents taking roughly 3 to 5 times as long. In short, I will use this approach for future collection work to improve efficiency.
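A comparison like this can be repeated with a small timing harness. The sketch below is an assumption about how such runs could be measured, not the author's test script: time_runs is a hypothetical helper, and the usleep() call stands in for a real download routine.

```php
<?php
// Hypothetical helper: run $task $runs times and return each run's
// duration in milliseconds, rounded to one decimal place.
function time_runs(callable $task, int $runs = 5): array
{
    $timings = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $task();
        $timings[] = round((microtime(true) - $start) * 1000, 1);
    }
    return $timings;
}

// Stand-in workload; replace the closure with a real download routine.
$timings = time_runs(function () {
    usleep(2000);
}, 5);
printf("min %.1f ms, max %.1f ms\n", min($timings), max($timings));
```

Reporting the minimum and maximum over several runs makes it easier to see whether a difference is stable, since network timings vary considerably from run to run.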
