Downloading Landscape Wallpapers with Node.js

Source: Internet
Author: User

The previous post explained how to crawl Blog Park with Node.js. This time we will download pictures from the web.

The third-party modules that need to be used are:

superagent

superagent-charset (lets you set the response encoding manually, which fixes garbled GBK Chinese text)

cheerio

express

async (concurrency control)

The complete code can be downloaded from my GitHub. The main logic is in netbian.js.

We use the Landscape Wallpaper column (http://www.netbian.com/fengjing/index.htm) of the "Shore Desktop" site (http://www.netbian.com/) as the example.

1. Analyze the URLs

The pattern is easy to spot:

Home page: column/index.htm

Pagination: column/index_<page number>.htm

Knowing this rule, you can download wallpapers in bulk.
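The rule above can be captured in a small helper. This is only a sketch: the name pageUrl is hypothetical, and it assumes that page 1 of a column is index.htm while later pages follow the index_<page number>.htm form.

```javascript
// Hypothetical helper: build the list-page URL for a column and page number.
// Assumption: page 1 is "index.htm" and page n (n >= 2) is "index_n.htm".
var DOMAIN = 'http://www.netbian.com';

function pageUrl(column, page) {
    var file = page <= 1 ? 'index.htm' : 'index_' + page + '.htm';
    return DOMAIN + '/' + column + '/' + file;
}

console.log(pageUrl('fengjing', 1)); // http://www.netbian.com/fengjing/index.htm
console.log(pageUrl('fengjing', 3)); // http://www.netbian.com/fengjing/index_3.htm
```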

2. Analyze the wallpaper thumbnails to find the larger image for each wallpaper

Using Chrome's developer tools, you can see that the thumbnail list sits in the div with class="list". The href attribute of each a tag points to the page for that single wallpaper.

Part of the code:

request
    .get(url)
    .end(function (err, sres) {

        var $ = cheerio.load(sres.text);
        var pic_url = []; // array of medium-image links
        $('.list ul', 0).find('li').each(function (index, ele) {
            var ele = $(ele);
            var href = ele.find('a').eq(0).attr('href'); // medium-image link
            if (href != undefined) {
                pic_url.push(url_model.resolve(domain, href));
            }
        });
    });
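The code resolves each href against the domain, presumably because the links in the list are relative paths. Here is a minimal sketch of that step using Node's built-in URL class instead of the url_model.resolve call in the original. The name toAbsolute and the sample hrefs are illustrative assumptions.

```javascript
// Hypothetical helper: turn the href values read from the thumbnail list into
// absolute URLs, skipping entries that have no href.
function toAbsolute(domain, hrefs) {
    return hrefs
        .filter(function (href) { return href !== undefined; })
        .map(function (href) { return new URL(href, domain).href; });
}

// Sample input, made up for illustration.
var links = toAbsolute('http://www.netbian.com', ['/desk/17662.htm', undefined]);
console.log(links); // [ 'http://www.netbian.com/desk/17662.htm' ]
```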

3. Continue the analysis with http://www.netbian.com/desk/17662.htm

Open this page: the wallpaper displayed here is still not the highest resolution.

Click the link on the "Download Wallpaper" button to open a new page.

4. Continue the analysis with http://www.netbian.com/desk/17662-1920x1080.htm

Open this page and we finally reach the wallpaper we want to download, which sits inside a table. For example, http://img.netbian.com/file/2017/0203/bb109369a1f2eb2e30e04a435f2be466.jpg is the URL of the image we ultimately download (the boss behind the scenes finally shows up (@ ̄ー ̄@)).

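The code that downloads the image:

request.get(wallpaper_down_url).end(function (err, img_res) {
    if (img_res.status == 200) {
        // save the picture content
        fs.writeFile(dir + '/' + wallpaper_down_title + path.extname(path.basename(wallpaper_down_url)), img_res.body, 'binary', function (err) {
            if (err) console.log(err);
        });
    }
});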
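The file name passed to fs.writeFile is built from the wallpaper title plus the extension of the image URL. The sketch below isolates that step. The name targetPath and the sample directory and title are assumptions for illustration, and path.posix is used only to keep the result the same on every platform.

```javascript
var path = require('path');

// Hypothetical helper: build the local file path for a downloaded wallpaper
// from its title and the extension of its image URL.
function targetPath(dir, title, imageUrl) {
    var ext = path.posix.extname(path.posix.basename(imageUrl));
    return dir + '/' + title + ext;
}

// The directory and title below are made-up sample values.
var sampleUrl = 'http://img.netbian.com/file/2017/0203/bb109369a1f2eb2e30e04a435f2be466.jpg';
console.log(targetPath('wallpapers', 'sample', sampleUrl)); // wallpapers/sample.jpg
```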

Open a browser and go to http://localhost:1314/fengjing

Select a column and a page, then click the "Start" button:

The server is requested concurrently and the pictures are downloaded.

Done ~

The directory that holds the pictures is named in the form column + page number.

The full image download code:

/**
 * Download images
 * @param  {[type]} url [picture URL]
 * @param  {[type]} dir [storage directory]
 * @param  {[type]} res [description]
 * @return {[type]}     [description]
 */
var down_pic = function (url, dir, res) {

    var domain = 'http://www.netbian.com'; // domain name

    request
        .get(url)
        .end(function (err, sres) {

            var $ = cheerio.load(sres.text);
            var pic_url = []; // array of medium-image links
            $('.list ul', 0).find('li').each(function (index, ele) {
                var ele = $(ele);
                var href = ele.find('a').eq(0).attr('href'); // medium-image link
                if (href != undefined) {
                    pic_url.push(url_model.resolve(domain, href));
                }
            });

            var count = 0; // concurrency counter
            var wallpaper = []; // wallpaper array
            var fetchPic = function (_pic_url, callback) {

                count++; // concurrency plus 1

                var delay = parseInt((Math.random() * 10000000) % 2000);
                console.log('Current concurrency: ' + count + ', fetching image URL: ' + _pic_url + ', delay: ' + delay + ' milliseconds');
                setTimeout(function () {
                    // get the large-image page link
                    request
                        .get(_pic_url)
                        .end(function (err, ares) {
                            var $$ = cheerio.load(ares.text);
                            var pic_down = url_model.resolve(domain, $$('.pic-down').find('a').attr('href')); // large-image page link

                            count--; // concurrency minus 1

                            // request the large-image page
                            request
                                .get(pic_down)
                                .charset('gbk') // decode the page as GBK
                                .end(function (err, pic_res) {

                                    var $$$ = cheerio.load(pic_res.text);
                                    var wallpaper_down_url = $$$('#endimg').find('img').attr('src'); // URL
                                    var wallpaper_down_title = $$$('#endimg').find('img').attr('alt'); // title

                                    // download the large image
                                    request
                                        .get(wallpaper_down_url)
                                        .end(function (err, img_res) {
                                            if (img_res.status == 200) {
                                                // save the picture content
                                                fs.writeFile(dir + '/' + wallpaper_down_title + path.extname(path.basename(wallpaper_down_url)), img_res.body, 'binary', function (err) {
                                                    if (err) console.log(err);
                                                });
                                            }
                                        });

                                    wallpaper.push(wallpaper_down_title + ' download completed<br />');
                                });
                            callback(null, wallpaper); // return data
                        });
                }, delay);
            };

            // download the wallpapers with a concurrency of 2
            async.mapLimit(pic_url, 2, function (_pic_url, callback) {
                fetchPic(_pic_url, callback);
            }, function (err, result) {
                console.log('success');
                res.send(result[0]); // take the element at index 0
            });
        });
};

Two points deserve particular attention:

1. The "Shore Desktop" pages are encoded in GBK. Node.js works with UTF-8 by default and its Buffer API has no GBK decoder, so the Chinese text would come out garbled. Here we bring in the superagent-charset module, which handles the GBK encoding.

See the example on GitHub:

https://github.com/magicdawn/superagent-charset
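To see why the charset step matters, the sketch below decodes the same GBK bytes two ways. It does not use superagent-charset. Instead it relies on the built-in TextDecoder, which assumes a Node.js build that ships full ICU data (without it the 'gbk' label is rejected). The sample bytes are the GBK encoding of 中文.

```javascript
// GBK bytes for the two characters "中文" (sample data for illustration).
var gbkBytes = Buffer.from([0xd6, 0xd0, 0xce, 0xc4]);

// Read as UTF-8, the bytes are invalid and turn into replacement characters.
var garbled = gbkBytes.toString('utf8');

// Decoded as GBK, the original text comes back.
// Assumption: this Node.js build includes full ICU, so 'gbk' is a known label.
var decoded = new TextDecoder('gbk').decode(gbkBytes);

console.log(garbled);
console.log(decoded);
```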

2. Node.js is asynchronous, so it can send a large number of requests at the same time, and the server may reject them as malicious. The async module is therefore used to limit concurrency, through its mapLimit method.

mapLimit(arr, limit, iterator, callback)

This method takes 4 parameters:

The 1st parameter is the array.

The 2nd parameter is the number of concurrent requests.

The 3rd parameter is the iterator, usually a function.

The 4th parameter is the callback invoked after all the concurrent work has finished.

This method passes each element of arr to iterator, running at most limit calls at the same time, and hands the collected results to the final callback.
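To make those semantics concrete, here is a simplified stand-in written in plain JavaScript. It is not the async library's implementation, only a sketch of the behavior described above: at most limit iterator calls in flight, results kept in input order, and the final callback invoked once.

```javascript
// Simplified stand-in for async.mapLimit (illustration only).
function mapLimit(arr, limit, iterator, callback) {
    var results = new Array(arr.length);
    var next = 0;       // index of the next item to start
    var running = 0;    // iterator calls currently in flight
    var finished = 0;   // iterator calls that have completed
    var failed = false; // set once an error has been reported

    if (arr.length === 0) {
        return callback(null, results);
    }

    function start(index) {
        running++;
        iterator(arr[index], function (err, value) {
            running--;
            if (failed) return;
            if (err) {
                failed = true;
                return callback(err);
            }
            results[index] = value; // keep results in input order
            finished++;
            if (finished === arr.length) {
                return callback(null, results);
            }
            fill();
        });
    }

    // Start items until the limit is reached or nothing is left.
    function fill() {
        while (!failed && running < limit && next < arr.length) {
            start(next++);
        }
    }

    fill();
}

// Usage: double each number with at most 2 tasks in flight.
var active = 0;
var peak = 0;
mapLimit([1, 2, 3, 4, 5], 2, function (n, done) {
    active++;
    peak = Math.max(peak, active);
    setTimeout(function () {
        active--;
        done(null, n * 2);
    }, 10);
}, function (err, result) {
    console.log(result, 'peak concurrency:', peak);
});
```

Storing each result by index is what keeps the output order equal to the input order even when tasks finish out of order.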

Closing remarks

This completes the download of the pictures.

The complete code is on GitHub; stars are welcome (☆▽☆).

My writing and knowledge are limited. If anything here is wrong, corrections from fellow bloggers are welcome.
