Over the past couple of days I came across a site whose crawler is written in Node.js. So here I take the small US TV drama crawler from my previous blog post and reimplement it in Node.js, to get a feel for how powerful Node.js is.
If you haven't worked with JavaScript before, it's best to first go through the JavaScript and jQuery courses at http://www.codecademy.com/ to get acquainted with the basic syntax quickly. If you already know another language, this does not take long.
Once you have a basic understanding, you will notice two main features of JavaScript:
- Object-oriented programming is done with a prototype-based approach.
- Functional programming. If you are interested in functional programming, I recommend Racket (formerly PLT Scheme).
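To make these two features concrete, here is a minimal sketch. The `Episode` constructor, the `hasResolution` helper, and the sample values are made up for illustration and are not part of the crawler.

```javascript
// Prototype-based object orientation: methods live on the prototype object
// and are shared by every instance created with `new`.
function Episode(title, resolution) { // hypothetical constructor, illustration only
    this.title = title;
    this.resolution = resolution;
}
Episode.prototype.label = function () {
    return this.title + " [" + this.resolution + "]";
};

// Made-up sample data.
var episodes = [
    new Episode("S01E01", "1024x768"),
    new Episode("S01E02", "1280x720"),
    new Episode("S01E03", "1024x768")
];

// Functional style: functions are values, so they can be returned from
// other functions and passed to higher-order functions such as filter and map.
function hasResolution(resolution) {
    // Returns a new function (a closure) that remembers `resolution`.
    return function (episode) {
        return episode.resolution === resolution;
    };
}

var labels = episodes
    .filter(hasResolution("1024x768"))
    .map(function (episode) { return episode.label(); });

console.log(labels);
```

Both instances share the single `label` function stored on `Episode.prototype`, and `filter`/`map` take functions as arguments, which is the same callback style the crawler relies on.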
Node.js is a JavaScript runtime platform built on Google's V8 engine that makes it easy to write fast, scalable web applications. Node.js uses an event-driven, non-blocking I/O model, which makes it lightweight and efficient, and well suited to data-intensive real-time applications that run across distributed devices. With Node.js as the runtime, JavaScript no longer has to be executed in the browser, which greatly expands what it can do. For example, the following small crawler collects the HD download links for "House of Cards":
    // First install the two libraries in the current project folder
    // with the following command-line commands:
    //   npm install request
    //   npm install cheerio
    var request = require("request"); // request is used to fetch the data
    var cheerio = require("cheerio"); // cheerio parses HTML using jQuery syntax

    var url = "http://www.yyets.com/resource/28793";

    request(url, function (error, response, body) {
        if (!error && response.statusCode === 200) {
            var $ = cheerio.load(body);
            // Visit every element whose type attribute is "ed2k"
            $('[type="ed2k"]').each(function () {
                var link = $(this).attr('href');
                // Keep only links that exist and contain "1024x768"
                if (typeof link != 'undefined' && link.indexOf("1024x768") > -1) {
                    console.log(link);
                }
            });
        }
    });

    // Name the file download.js (or whatever you like).
    // Open a command-line window to run it (PowerShell recommended on Windows):
    //   node download.js > link.txt
    // With the output redirected, the download links are stored in the text file link.txt.
    // Tip: hold down the Shift key and right-click on blank space in the current
    //      folder; the menu will offer an option to open a command-line window there.
    // Tip: the Sublime editor is recommended. Install the JS format and terminal plugins.
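The filtering condition inside the `each` callback can be tried on its own, without any network access. The following is only a sketch: the `pickLinks` helper and the ed2k strings are made up for illustration and do not come from the original script or from the site.

```javascript
// Hypothetical helper: keep only defined hrefs that contain the marker text,
// mirroring the crawler's check
// `typeof link != 'undefined' && link.indexOf("1024x768") > -1`.
function pickLinks(hrefs, marker) {
    return hrefs.filter(function (link) {
        return typeof link !== "undefined" && link.indexOf(marker) > -1;
    });
}

// Made-up sample values; `undefined` stands in for an element
// that has no href attribute.
var sample = [
    "ed2k://|file|Example.S01E01.1024x768.mkv|1|ABC|/",
    undefined,
    "ed2k://|file|Example.S01E01.720p.mkv|1|DEF|/"
];

var picked = pickLinks(sample, "1024x768");
picked.forEach(function (link) {
    console.log(link);
});
```

Separating the condition into a plain function like this makes it easy to change the marker (for example to another resolution string) and check the result before running the crawler against the real page.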
"Address: http://blog.csdn.net/thisinnocence/article/details/40404219"
Node.js crawler: bulk downloading HR-HDTV US TV dramas from YYeTs