Bulk-collecting and downloading pictures with PHP: implementation code

Source: Internet
Author: User
Design ideas

Collecting pictures from one page at a time is too tedious, so I collect from the list page directly: get the URLs on the list page, then collect from them one by one. Matching the list-page URLs with PHP is troublesome, though. The list page contains many invalid URLs, which is a problem for a regex novice like me. After looking at the structure of the list page, I decided to use jQuery to get the URLs; jQuery's selectors are powerful.

jQuery gets the URLs, then passes them via Ajax to the corresponding PHP file, which loops over the URL parameters and saves the pictures by scraping each page in turn.

jQuery program
The code is as follows:




Here the URLs are joined into a single comma-separated string and passed as one parameter. $.getJSON is used because the request is cross-domain; for a few common $.getJSON problems, see "Several problems encountered with $.getJSON".
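On the PHP side, the comma-joined string has to be split again, and a cross-domain $.getJSON request normally expects its reply wrapped in the callback name it sent. The following is a minimal sketch of both steps, assuming the parameter names hrefs and callback used by the collection program; the helper names parseHrefs and jsonpResponse are hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: split the comma-joined "hrefs" parameter into clean URLs.
function parseHrefs(string $hrefs): array
{
    $urls = array_map('trim', explode(',', $hrefs));
    // Drop empty entries (for example from a trailing comma) and duplicates.
    return array_values(array_unique(array_filter($urls, 'strlen')));
}

// Hypothetical helper: wrap a payload in the callback name that $.getJSON supplies.
function jsonpResponse(string $callback, array $payload): string
{
    // Accept only identifier-like callback names to avoid script injection;
    // fall back to a fixed name otherwise (an assumption made for this sketch).
    if (!preg_match('/^[A-Za-z_$][\w$.]*\z/', $callback)) {
        $callback = 'callback';
    }
    return $callback . '(' . json_encode($payload) . ');';
}
```

A script could then call parseHrefs($_GET['hrefs']) to get the page URLs and echo jsonpResponse($_GET['callback'], ['count' => $j]) at the end.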

PHP collection program
The code is as follows:
<?php
// Grab 365 pics
error_reporting(E_ALL ^ E_NOTICE);
set_time_limit(0); // no PHP execution time limit
/**
 * Get the current time in seconds, including microseconds
 */
function getmicrotime() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float) $usec + (float) $sec);
}
$stime = getmicrotime();

$callback = $_GET['callback']; // JSONP callback name sent by $.getJSON
$hrefs = $_GET['hrefs'];       // comma-separated list of page URLs
$urlarray = explode(',', $hrefs);

// Get all pictures of the specified URL; returns the number of pictures saved
function getimgs($url) {
    $dirname = basename($url, ".php");
    if (!file_exists('365/' . $dirname)) {
        mkdir('365/' . $dirname, 0777, true);
    }
    clearstatcache();
    $data = file_get_contents($url);
    preg_match_all('/(href|src)=(["\']?)([^ "\'>]+\.(jpg|png|gif))\2/i', $data, $matches);
    $matches[3] = array_unique($matches[3]);
    unset($data);
    $i = 0;

    if (count($matches[3]) > 0) {
        foreach ($matches[3] as $k => $v) {
            // Simple check that this is an absolute URL, not a relative path
            if (substr($v, 0, 4) == 'http') {

                $ext = pathinfo($v, PATHINFO_EXTENSION); // picture extension

                if (!file_exists('365/' . $dirname . '/' . $k . '.' . $ext)) {
                    file_put_contents('365/' . $dirname . '/' . $k . '.' . $ext, file_get_contents($v));
                    $i++;
                }
                clearstatcache();
            }
        }
    }
    unset($matches);
    return $i; // always return a count, even when nothing matched
}

$j = 0;
foreach ($urlarray as $k => $v) {
    if ($v != '') {
        $j += getimgs($v);
    }
}
$etime = getmicrotime();
echo "Collected " . $j . " pictures in total. ";
echo "Time spent: " . ($etime - $stime) . " seconds";


On performance: the variables used in getimgs() are released with unset() after use in order to free memory.
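The image-matching step can be tried in isolation. The following is a minimal sketch, assuming the same href/src pattern as the collection program; the function name extractImageUrls and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: return the unique absolute image URLs found in an HTML string.
function extractImageUrls(string $html): array
{
    // Group 2 captures the optional quote, group 3 the URL ending in an image extension.
    preg_match_all('/(href|src)=(["\']?)([^ "\'>]+\.(jpg|png|gif))\2/i', $html, $matches);
    $urls = array_unique($matches[3]);
    // Keep only URLs that start with "http", as the collection program does.
    $urls = array_filter($urls, function ($u) {
        return substr($u, 0, 4) === 'http';
    });
    return array_values($urls);
}

$sample = '<a href="http://example.com/a.jpg"><img src=\'http://example.com/b.PNG\'></a>'
        . '<img src="/rel/c.gif"><img src="http://example.com/a.jpg">';
print_r(extractImageUrls($sample));
```

The relative path /rel/c.gif and the duplicate of a.jpg are both dropped, which mirrors what the collection loop skips.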

A few points involved

Determining whether an image URL is a standard, valid URL
if (substr($v, 0, 4) == 'http') is only a simple check that the image URL is an absolute URL. A captured picture path may be relative, and here I simply skip such pictures; you could of course convert the relative path to an absolute image URL instead. Another problem is that even a well-formed URL may not be collectable, because you do not know whether the picture still exists; the URL may be dead. For a stricter check on whether an image URL is really valid, see my earlier article "Three ways for PHP to determine whether a remote URL is valid".
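Instead of skipping relative paths, they can be resolved against the URL of the page they were found on. The following is a minimal sketch under simplifying assumptions: the helper name toAbsoluteUrl is hypothetical, and it handles only absolute, protocol-relative, root-relative, and plain relative paths (no "../" segments).

```php
<?php
// Hypothetical helper: turn a possibly relative image path into an absolute URL.
function toAbsoluteUrl(string $pageUrl, string $src): string
{
    // Already absolute: nothing to do.
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    $parts  = parse_url($pageUrl);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $origin = $scheme . '://' . $host . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Protocol-relative, e.g. //cdn.example.com/a.jpg
    if (substr($src, 0, 2) === '//') {
        return $scheme . ':' . $src;
    }
    // Root-relative, e.g. /img/a.jpg
    if (substr($src, 0, 1) === '/') {
        return $origin . $src;
    }
    // Plain relative: resolve against the directory part of the page path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

echo toAbsoluteUrl('http://example.com/list/page1.php', 'img/a.jpg'), "\n";
```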

Getting the picture format

$ext = pathinfo($v, PATHINFO_EXTENSION); // picture extension

This uses the pathinfo() function. Altogether there are seven ways to obtain a file's format; recommended article: "Seven ways to determine the image format in PHP".
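One caveat with pathinfo() on URLs is that a query string ends up in the extension. The following is a small sketch that strips it first; the helper name imageExtension is hypothetical.

```php
<?php
// Hypothetical helper: lower-case extension of an image URL, ignoring any query string.
function imageExtension(string $url): string
{
    // Take only the path component, so "?size=large" is not part of the name.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

echo imageExtension('http://example.com/pics/a.JPG?size=large'), "\n";
```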

Downloading and saving locally

file_put_contents('365/' . $dirname . '/' . $k . '.' . $ext, file_get_contents($v));
The file_put_contents() function writes a string to a file.
It is the same as calling fopen(), fwrite(), and fclose() in turn.
The file_get_contents() function reads an entire file into a string.

This works because the server allows file_get_contents() to open remote URLs. If the server disables that, you can use cURL, which is more powerful than file_get_contents(). Recommended reading: "cURL learning and application (with multithreading)". cURL can also download and store several files concurrently, which is more efficient.
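As an illustration of the cURL alternative, here is a minimal single-download sketch. It assumes the PHP curl extension is installed; the function name fetchWithCurl and the timeout values are arbitrary choices for this example, not taken from the original program.

```php
<?php
// Hypothetical helper: download $url with cURL and save it to $dest.
// Returns true on success, false if the transfer failed or returned no data.
function fetchWithCurl(string $url, string $dest, int $timeout = 30): bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_FAILONERROR    => true,  // treat HTTP status >= 400 as a failure
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $data = curl_exec($ch);
    if ($data === false || $data === '') {
        return false;
    }
    return file_put_contents($dest, $data) !== false;
}
```

In the collection loop, the call file_put_contents($path, file_get_contents($v)) could then be replaced by fetchWithCurl($v, $path), counting a picture only when it returns true.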

Clearing the file status cache

The clearstatcache() function clears the file status cache. PHP caches the return values of some file functions in order to improve performance. But sometimes, for example when you check the same file several times in one script and the file may be deleted or modified while the script runs, you need to clear the file status cache to get correct results. That is what clearstatcache() is for. For details, see the official manual.
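The effect can be seen with a file whose size changes during the script. This is a small sketch using a temporary file; the variable names are illustrative only.

```php
<?php
// Create a temporary file and read its size once, which fills the stat cache.
$file = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($file, 'abc');
$before = filesize($file);

// Modify the file; a later filesize() call may still report the cached value.
file_put_contents($file, 'defgh', FILE_APPEND);

// Drop the cached status for this one file, then ask again.
clearstatcache(true, $file);
$after = filesize($file);

echo "before: $before bytes, after: $after bytes\n";
unlink($file);
```

Passing a filename limits the clearing to that file, which is cheaper than flushing the whole cache inside a loop.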

Calculating program execution time

The code is as follows:
/**
 * Get the current time in seconds, including microseconds
 */
function getmicrotime() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float) $usec + (float) $sec);
}


You can also refer to this blog post: "Getting PHP page execution time, database read and write counts, function call counts, and so on (ThinkPHP)".
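Since PHP 5, microtime(true) returns the time as a float directly, so the split-and-add step is not needed. Here is a minimal sketch of a timing wrapper built on it; the function name timeIt is hypothetical.

```php
<?php
// Hypothetical helper: run $task and return its result plus the elapsed seconds.
function timeIt(callable $task): array
{
    $start  = microtime(true);  // current Unix time as a float
    $result = $task();
    return [$result, microtime(true) - $start];
}

// Example: time a small computation.
[$sum, $elapsed] = timeIt(function () {
    return array_sum(range(1, 1000));
});
printf("sum = %d, took %.6f seconds\n", $sum, $elapsed);
```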

Finally, the result:



It took 409 seconds to collect 214 pictures, roughly 2 seconds per picture, with a total size of about 62 MB. At that rate:

In one hour (60*60 seconds) it can download about 1,800 pictures.
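The estimate can be checked with a few lines of arithmetic. This sketch uses only the figures reported above; the variable names are illustrative.

```php
<?php
// Figures reported above.
$seconds  = 409;
$pictures = 214;

$perPicture = $seconds / $pictures;  // measured seconds per picture
$perHour    = 3600 / $perPicture;    // pictures per hour at the measured rate
$rounded    = 3600 / 2;              // pictures per hour at a rounded 2 seconds each

printf(
    "%.2f s per picture, about %d per hour (or %d at 2 s each)\n",
    $perPicture,
    floor($perHour),
    $rounded
);
```

The measured rate is slightly better than 2 seconds per picture, so the round figure of 1,800 per hour is a conservative estimate.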
