Goal
Create a Lesson3 project in which to write code.
When http://localhost:3000/ is opened in the browser, the app outputs the titles and links of all posts on the CNode (https://cnodejs.org/) community home page, in JSON form.
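As a rough sketch of the response shape, the endpoint would return an array of objects, each carrying a title and an href. The two entries below are hypothetical placeholders, not real CNode posts.

```javascript
// Hypothetical placeholder data: real titles and links come from the CNode home page.
var items = [
  { title: 'Example post A', href: '/topic/example-a' },
  { title: 'Example post B', href: '/topic/example-b' }
];

// JSON.stringify produces the text a browser would display for this array.
var body = JSON.stringify(items, null, 2);
console.log(body);
```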
Knowledge Points:
- Learn to crawl web pages using superagent
- Learn to analyze web pages using cheerio
Library Introduction:
superagent (http://visionmedia.github.io/superagent/) is an HTTP client library that can issue GET or POST requests.
cheerio (https://github.com/cheeriojs/cheerio) can be thought of as a Node.js version of jQuery. It extracts data from a web page with CSS selectors and is used the same way as jQuery.
Commands for creating the project:
- Create a new folder and change into it
npm init
- Install dependencies
npm install --save PACKAGE_NAME
- Write application logic
Install the two libraries and express:
npm install --save express
npm install --save superagent
npm install --save cheerio
var superagent = require('superagent');
var express = require('express');
var cheerio = require('cheerio');

var app = express();

app.get('/', function (req, res, next) {
  // Fetch the CNode home page.
  superagent.get('https://cnodejs.org/')
    .end(function (err, sres) {
      if (err) {
        return next(err);
      }
      // sres.text stores the HTML content of the web page. Passing it to
      // cheerio.load gives a variable that implements the jQuery interface,
      // which we habitually name `$`. The rest is plain jQuery.
      var $ = cheerio.load(sres.text);
      var items = [];
      $('#topic_list .topic_title').each(function (idx, element) {
        var $element = $(element);
        items.push({
          title: $element.attr('title'),
          href: $element.attr('href')
        });
      });
      // New: read each post's author from the avatar link ('/user/NAME').
      $('#topic_list .user_avatar').each(function (idx, element) {
        var $element = $(element);
        items[idx]['author'] = $element.attr('href').split('/')[2];
      });
      res.send(items);
    });
});

app.listen(3000, function () {
  console.log('app is running at port 3000');
});
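The author lookup above splits the avatar link on '/' and takes the third piece, which assumes every link has the form /user/NAME. A slightly more defensive variant might look like the sketch below. The helper name `authorFromHref` is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: pull the user name out of an avatar link such as '/user/NAME'.
function authorFromHref(href) {
  if (typeof href !== 'string') {
    return null; // the element had no href attribute
  }
  var parts = href.split('/'); // '/user/NAME' -> ['', 'user', 'NAME']
  return parts.length > 2 && parts[2] ? parts[2] : null;
}

console.log(authorFromHref('/user/someone'));
console.log(authorFromHref(undefined));
```

Returning null instead of throwing lets the route still send the titles and links when an avatar link is missing or has an unexpected shape.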
Node.js: using superagent and cheerio to build a simple crawler