PHP tutorial: fetching remote images

Source: Internet
Author: User
While developing a login feature, I found that the avatar image URLs had no file extension, so the usual approach of reading the extension from the URL did not work and the images needed special handling. I later gathered the various cases together, wrapped them in a class, and am sharing it here.

Create a project

As a demonstration, we create a project named grabimg in the WWW root, containing a class file grabimage.php and an index.php.

Writing the class code

We define a class with the same name as the file: GrabImage

class GrabImage {}

Properties

Next, define the properties the class needs.

1. First, the address of the image to fetch: $img_url

2. Then $file_name, which stores the name of the file without its extension. The extension may need to be replaced, so the two parts are defined separately.

3. Next, the extension: $extension

4. Then $file_dir, the local directory the remote image is saved to, generally relative to the location of the PHP entry file. This path is usually not saved to the database.

5. Finally, $save_dir, which, as the name implies, is the directory path saved directly to the database. We do not store the full file save path in the database, so that the path is easy to change if the system is migrated later. Our $save_dir is usually date + file name; when it is read back for use, the required base path is prepended.
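To illustrate why the stored path is kept relative, here is a minimal sketch of how it might be turned back into a usable path when read. The helper name resolveStoredPath and the second base directory are assumptions made for this example; they are not part of the class in this tutorial.

```php
<?php
// Hypothetical helper: joins a configurable base directory with the
// relative path stored in the database (for example "201610/19/abc.jpg").
function resolveStoredPath(string $baseDir, string $storedPath): string
{
    // Trim slashes on both sides so exactly one separator joins the parts.
    return rtrim($baseDir, '/') . '/' . ltrim($storedPath, '/');
}

// The same stored value works before and after a migration;
// only the base directory changes.
echo resolveStoredPath('./uploads/image', '201610/19/abc.jpg'), PHP_EOL;
echo resolveStoredPath('/data/static/image/', '201610/19/abc.jpg'), PHP_EOL;
```

Because only the relative part is stored, moving the upload directory means changing one configuration value rather than rewriting every database row.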

Methods

With the properties defined, we can start on the actual fetching.

First, we define a public method, getInstances, that receives the input data, namely the address of the image to fetch and the local base path to save under, and stores it in the properties.

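public function getInstances($img_url, $base_dir)
{
    $this->img_url  = $img_url;
    // "d" is the day of the month; "D" would give a weekday name such as "Wed"
    $this->save_dir = date("Ym") . '/' . date("d") . '/';   // for example: 201610/19/
    // save_dir already ends with a slash, so no extra one is appended
    $this->file_dir = $base_dir . '/' . $this->save_dir;    // for example: ./uploads/image/201610/19/
}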
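The date format characters are easy to mix up, so here is a small sketch of the dated directory name for a fixed timestamp. The helper name datedDir is an assumption for this example only.

```php
<?php
// Hypothetical helper: builds the relative "Ym/d/" directory for a timestamp.
function datedDir(int $timestamp): string
{
    // "d" is the zero-padded day of the month (01-31);
    // "D" would give a weekday abbreviation such as "Wed".
    return date('Ym', $timestamp) . '/' . date('d', $timestamp) . '/';
}

$ts = mktime(12, 0, 0, 10, 19, 2016); // 19 October 2016, local time
echo datedDir($ts), PHP_EOL;          // 201610/19/
echo date('D', $ts), PHP_EOL;         // Wed
```

Passing the timestamp in, rather than calling date() without one, also makes the path easy to test.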

The save path is now assembled, but there is one thing to watch for: whether the directory exists. The date changes every day, but the directory is not created automatically. So before saving the image we check first, and if the directory does not exist we create it immediately.

We create a method, setDir, to set up the directory. We make it private for safety.

/**
 * Check whether the directory the image will be saved to exists;
 * if it does not, create it immediately.
 * @return bool
 */
private function setDir()
{
    if (!file_exists($this->file_dir)) {
        mkdir($this->file_dir, 0777, true);
    }
    // File name. This is only a demo; use your own unique-name generator in a real project.
    $this->file_name = uniqid() . rand(10000, 99999);
    return true;
}
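As a sketch of the directory check on its own, the following standalone function reports whether the directory is actually usable instead of always returning true. The name ensureDir and the use of the system temp directory are assumptions for this example.

```php
<?php
// Hypothetical helper: makes sure $dir exists, creating parents as needed.
function ensureDir(string $dir): bool
{
    if (is_dir($dir)) {
        return true;
    }
    // The third argument enables recursive creation. The second is_dir()
    // check covers the case where another request created the directory
    // between our first check and the mkdir() call.
    return @mkdir($dir, 0777, true) || is_dir($dir);
}

// Usage: create a dated directory under the system temp directory.
$dir = sys_get_temp_dir() . '/grabimg_demo_' . uniqid() . '/201610/19';
var_dump(ensureDir($dir)); // bool(true) when the temp directory is writable
```

Returning the real outcome lets the caller stop before attempting a download that could not be saved anyway.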

Next comes the core fetching code.

The first problem to solve is that the image we want to fetch may have no extension. The traditional approach of fetching the image and then cutting the extension off its name does not work here.

We have to get the image type another way: read the response headers that come with the file stream, take the MIME type from them, and derive the file extension from that.

For convenience, first define a mapping from MIME types to file extensions.

$mimes = array(
    'image/bmp'    => 'bmp',
    'image/gif'    => 'gif',
    'image/jpeg'   => 'jpg',
    'image/png'    => 'png',
    'image/x-icon' => 'ico'
);

So when the type I get is image/gif, I know it is a .gif image.
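The lookup can be sketched as a small function. The name mimeToExtension and the case normalisation are assumptions added for this example; the mapping itself is the one from the text. It uses a nullable return type, so it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper built around the mapping from the text.
function mimeToExtension(string $mime): ?string
{
    $mimes = [
        'image/bmp'    => 'bmp',
        'image/gif'    => 'gif',
        'image/jpeg'   => 'jpg',
        'image/png'    => 'png',
        'image/x-icon' => 'ico',
    ];
    // MIME types are case-insensitive, so normalise before the lookup.
    $key = strtolower(trim($mime));
    return $mimes[$key] ?? null; // null: not a type we want to keep
}

var_dump(mimeToExtension('image/gif'));  // string(3) "gif"
var_dump(mimeToExtension('IMAGE/JPEG')); // string(3) "jpg"
var_dump(mimeToExtension('text/html'));  // NULL
```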

Use the PHP function get_headers to get the response headers. We assign the result to the variable $headers and continue only if it is not false.

The value of Content-Type is the MIME type.

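// The inner parentheses matter: without them, $headers would receive
// the boolean result of the !== comparison instead of the header array.
if (($headers = get_headers($this->img_url, 1)) !== false) {
    // Get the type of the response
    $type = $headers['Content-Type'];
}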

Using the mapping table we defined above, we can easily get the extension.

$this->extension = $mimes[$type];

Of course, the $type obtained above may not exist in our mapping table. That means the file is not a type we want, so we simply skip it.
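In practice the Content-Type entry is not always a plain string such as image/png. With the associative format of get_headers, a header that appears in several responses (for example after a redirect) is returned as an array, the header name may arrive in a different case, and the value may carry parameters. A minimal sketch of normalising it follows; the function name contentTypeFromHeaders is an assumption, and the header array is passed in so that no request is made.

```php
<?php
// Hypothetical helper: extracts a bare MIME type from the array returned by
// get_headers($url, 1). No request is made here; the input is passed in.
function contentTypeFromHeaders(array $headers): ?string
{
    // Header names are case-insensitive, so compare them in lower case.
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['content-type'])) {
        return null;
    }
    $value = $headers['content-type'];
    // After redirects the same header appears once per response and PHP
    // groups the values into an array; the last one is the final response.
    if (is_array($value)) {
        $value = end($value);
    }
    // Drop parameters such as "; charset=UTF-8".
    $mime = strtolower(trim(explode(';', (string) $value)[0]));
    return $mime === '' ? null : $mime;
}

var_dump(contentTypeFromHeaders(['Content-Type' => 'image/png']));
var_dump(contentTypeFromHeaders(['content-type' => ['text/html; charset=UTF-8', 'image/jpeg']]));
```

The returned value can then be looked up in the MIME mapping; a null result is treated the same as an unknown type.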

The remaining steps are the same as for a traditional file fetch.

$file_path = $this->file_dir . $this->file_name . "." . $this->extension;
// Get the data and save it
$contents = file_get_contents($this->img_url);
if (file_put_contents($file_path, $contents)) {
    // The value returned here is the path + file name saved directly to the database,
    // for example: 201610/19/57feefd7e2a7ay5p7lspqai-ly1bf.jpg
    return $this->save_dir . $this->file_name . "." . $this->extension;
}

First build the full local path $file_path for the image, then fetch the data with file_get_contents, and finally write it to that path with file_put_contents.

Finally, we return the path that can be saved directly to the database, not the file storage path.
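The Content-Type header is only what the server claims. As an optional extra check, which is not part of the class in this tutorial, the downloaded bytes themselves can be inspected before saving. The sketch below uses PHP's getimagesizefromstring for that; the function name saveIfImage is an assumption, and a tiny embedded GIF stands in for downloaded data.

```php
<?php
// Hypothetical helper: writes $bytes to $path only if they look like an image.
// Returns the detected MIME type on success, or false.
function saveIfImage(string $bytes, string $path)
{
    // getimagesizefromstring() inspects the data itself, so it does not
    // depend on what the server claimed in Content-Type.
    $info = @getimagesizefromstring($bytes);
    if ($info === false) {
        return false;
    }
    // file_put_contents() returns the number of bytes written or false,
    // so compare strictly instead of relying on truthiness.
    if (file_put_contents($path, $bytes) === false) {
        return false;
    }
    return $info['mime'];
}

// Usage with a 1x1 GIF embedded as base64, so no network access is needed.
$gif  = base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
$path = sys_get_temp_dir() . '/grabimg_demo_' . uniqid() . '.gif';
var_dump(saveIfImage($gif, $path));
var_dump(saveIfImage('<html>not an image</html>', $path . '.bad'));
```

The MIME type returned here could be fed into the same extension mapping, which also covers servers that send a generic or wrong Content-Type.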

The full version of the fetch method is:

private function getRemoteImg()
{
    // MIME type to extension mapping
    $mimes = array(
        'image/bmp'    => 'bmp',
        'image/gif'    => 'gif',
        'image/jpeg'   => 'jpg',
        'image/png'    => 'png',
        'image/x-icon' => 'ico'
    );
    // Get the response headers
    if (($headers = get_headers($this->img_url, 1)) !== false) {
        // Get the type of the response
        $type = $headers['Content-Type'];
        // If it is one of the types we want
        if (isset($mimes[$type])) {
            $this->extension = $mimes[$type];
            $file_path = $this->file_dir . $this->file_name . "." . $this->extension;
            // Get the data and save it
            $contents = file_get_contents($this->img_url);
            if (file_put_contents($file_path, $contents)) {
                // The value returned here is the path + file name saved directly to the database,
                // for example: 201610/19/57feefd7e2a7ay5p7lspqai-ly1bf.jpg
                return $this->save_dir . $this->file_name . "." . $this->extension;
            }
        }
    }
    return false;
}

Finally, for simplicity, we want other code to be able to do all of this with a single call. So we put the fetch step directly into getInstances: once the paths are configured, it fetches right away. To do that, add the following code to the end of the configuration method getInstances.

if ($this->setDir()) {
    return $this->getRemoteImg();
} else {
    return false;
}

Test

Let's go to the index.php file we just created and try it out.

 
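<?php
// Only the final call of this listing survived. The include, the variable values
// (taken from the examples in the class comments) and the $object name are assumed.
require_once 'grabimage.php';
$img_url  = 'http://www.bidianer.com/img/icon_mugs.jpg';
$base_dir = './uploads/image';
$object = new GrabImage();
echo $object->getInstances($img_url, $base_dir);
?>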

Sure enough, the image is fetched.

Full code

<?php
/**
 * @link bidianer.com
 */
class GrabImage
{
    /**
     * @var string Address of the remote image to fetch
     * For example: http://www.bidianer.com/img/icon_mugs.jpg
     * Some remote file paths may have no extension,
     * such as: http://www.xxx.com/img/icon_mugs/q/0
     */
    private $img_url;

    /**
     * @var string Name to save the file under
     * A new name is generated for the local copy,
     * without the extension
     * For example: 57feefd7e2a7ay5p7lspqai-ly1bf
     */
    private $file_name;

    /**
     * @var string Extension of the file
     * The extension of the remote image is used directly;
     * for remote images without an extension it is taken from the file stream
     * For example: .jpg
     */
    private $extension;

    /**
     * @var string Local directory the file is saved in
     * This is the path PHP saves the file to,
     * generally relative to the entry file
     * For example: ./uploads/image/201610/19/
     * This path is not normally stored directly in the database
     */
    private $file_dir;

    /**
     * @var string Directory saved in the database
     * This is the image path saved directly to the database,
     * generally date + file name; the base path is prepended when it is used.
     * This makes the path easy to change when the system is migrated
     * For example: 201610/19/
     */
    private $save_dir;

    /**
     * @param string $img_url Address of the image to fetch
     * @param string $base_dir Local save path, such as ./uploads/image, with no trailing slash "/"
     * @return string|false
     */
    public function getInstances($img_url, $base_dir)
    {
        $this->img_url  = $img_url;
        // "d" is the day of the month; "D" would give a weekday name such as "Wed"
        $this->save_dir = date("Ym") . '/' . date("d") . '/';   // for example: 201610/19/
        // save_dir already ends with a slash, so no extra one is appended
        $this->file_dir = $base_dir . '/' . $this->save_dir;    // for example: ./uploads/image/201610/19/
        return $this->start();
    }

    /**
     * Start fetching the image
     */
    private function start()
    {
        if ($this->setDir()) {
            return $this->getRemoteImg();
        } else {
            return false;
        }
    }

    /**
     * Check whether the directory the image will be saved to exists;
     * if it does not, create it immediately.
     * @return bool
     */
    private function setDir()
    {
        if (!file_exists($this->file_dir)) {
            mkdir($this->file_dir, 0777, true);
        }
        // File name. This is only a demo; use your own unique-name generator in a real project.
        $this->file_name = uniqid() . rand(10000, 99999);
        return true;
    }

    /**
     * Core method for fetching the remote image. It handles images
     * with an extension as well as images without one.
     *
     * @return string|false
     */
    private function getRemoteImg()
    {
        // MIME type to extension mapping
        $mimes = array(
            'image/bmp'    => 'bmp',
            'image/gif'    => 'gif',
            'image/jpeg'   => 'jpg',
            'image/png'    => 'png',
            'image/x-icon' => 'ico'
        );
        // Get the response headers
        if (($headers = get_headers($this->img_url, 1)) !== false) {
            // Get the type of the response
            $type = $headers['Content-Type'];
            // If it is one of the types we want
            if (isset($mimes[$type])) {
                $this->extension = $mimes[$type];
                $file_path = $this->file_dir . $this->file_name . "." . $this->extension;
                // Get the data and save it
                $contents = file_get_contents($this->img_url);
                if (file_put_contents($file_path, $contents)) {
                    // The value returned here is the path + file name saved directly to the database,
                    // for example: 201610/19/57feefd7e2a7ay5p7lspqai-ly1bf.jpg
                    return $this->save_dir . $this->file_name . "." . $this->extension;
                }
            }
        }
        return false;
    }
}