Node.js: using superagent and cheerio to build a simple crawler

Source: Internet
Author: User

Goal

Create a Lesson3 project and write the code in it.

When http://localhost:3000/ is opened in a browser, the app outputs the titles and links of all posts on the CNode (https://cnodejs.org/) community home page, as JSON.

Knowledge points:

    1. Learn to fetch web pages with superagent
    2. Learn to parse web pages with cheerio

Library introduction:

superagent (http://visionmedia.github.io/superagent/) is an HTTP client library that can issue GET or POST requests.

cheerio (https://github.com/cheeriojs/cheerio) can be thought of as a Node.js version of jQuery. It extracts data from a web page using CSS selectors, and is used the same way as jQuery.

Steps to create the project:

    1. Create a new folder, go inside it, and run npm init
    2. Install dependencies with npm install --save PACKAGE_NAME
    3. Write the application logic

Install the two libraries and express:

    npm install --save express
    npm install --save superagent
    npm install --save cheerio

    var superagent = require('superagent');
    var express = require('express');
    var cheerio = require('cheerio');

    var app = express();

    app.get('/', function (req, res, next) {
      // Fetch the CNode home page.
      superagent.get('https://cnodejs.org/').end(function (err, sres) {
        if (err) {
          return next(err);
        }
        // sres.text stores the HTML content of the page. Passing it to
        // cheerio.load gives a variable that implements the jQuery interface,
        // which we habitually name `$`. The rest is ordinary jQuery.
        var $ = cheerio.load(sres.text);
        var items = [];
        $('#topic_list .topic_title').each(function (idx, element) {
          var $element = $(element);
          items.push({
            title: $element.attr('title'),
            href: $element.attr('href')
          });
        });
        // New: add the author of each post, taken from the avatar link.
        $('#topic_list .user_avatar').each(function (idx, element) {
          var $element = $(element);
          // Skip avatars that have no matching title entry.
          if (items[idx]) {
            items[idx]['author'] = $element.attr('href').split('/')[2];
          }
        });
        res.send(items);
      });
    });

    // Port 3000 matches the URL given in the goal above.
    app.listen(3000, function () {
      console.log('app is running at port 3000');
    });
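The author step above relies on splitting the avatar link's href on '/'. The sketch below isolates that step so it can be checked without any network access. The helper names authorFromHref and absoluteUrl are hypothetical, and the '/user/<name>' and relative '/topic/...' link shapes are assumptions inferred from the split('/')[2] call rather than guarantees about the CNode markup.

```javascript
// Hypothetical helpers for illustration; they are not part of the original code.
const BASE_URL = 'https://cnodejs.org/';

// Extract the user name from an href assumed to look like '/user/<name>'.
// '/user/alice'.split('/') yields ['', 'user', 'alice'], so index 2 is the name.
function authorFromHref(href) {
  if (typeof href !== 'string') {
    return null; // attr('href') returns undefined when the attribute is missing
  }
  const parts = href.split('/');
  return parts.length > 2 && parts[2] ? parts[2] : null;
}

// Resolve a possibly relative href against the site root using the
// built-in URL class.
function absoluteUrl(href) {
  return new URL(href, BASE_URL).href;
}

console.log(authorFromHref('/user/alice'));
console.log(absoluteUrl('/topic/1'));
```

Returning null instead of throwing keeps one malformed link from failing the whole request, which is one possible design choice rather than a requirement.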
