I recently ran into a requirement involving an English sentence analysis page: you enter an English sentence, click the Start Analysis button, and the page displays the parsed results.
We needed to pull that parsed data out of the page (including the sentence's syntactic structure, explanations of the related vocabulary, and so on). I had already learned some Node.js, so this seemed like a good chance to write a small crawler with it.
First, install Node.js on your computer. For installation instructions, search Google or look for a relevant tutorial.
Next you need some basic knowledge of Node. I start by loading the http module and setting the value of url, which is the address of the page you want to crawl.
Then I fetch the page data with http.get.
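A minimal sketch of this step might look like the following. The function name fetchPage and the example.com address in the comment are placeholders for illustration, not the real analysis page, and the sketch assumes the page is served over plain HTTP (an https address would need the https module instead).

```javascript
const http = require('http');

// Fetch a page over plain HTTP and hand the complete body to the callback.
// fetchPage is a hypothetical helper name used only for this sketch.
function fetchPage(url, callback) {
  http.get(url, (res) => {
    let html = '';
    res.setEncoding('utf8');
    // The body arrives in chunks, so collect them until the response ends.
    res.on('data', (chunk) => {
      html += chunk;
    });
    res.on('end', () => callback(null, html));
  }).on('error', (err) => callback(err));
}

// Usage, with a placeholder address standing in for the page to crawl:
// fetchPage('http://example.com/', (err, html) => {
//   if (err) {
//     console.error('Request failed:', err.message);
//     return;
//   }
//   console.log(html);
// });
```

Passing the result to a callback keeps the fetching separate from whatever is done with the HTML afterwards, such as printing it or parsing it.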
I saved the code as crawler_english.js and ran it on the command line with node crawler_english.js. As expected, it printed out the whole page.
With the data in hand, the next step is to parse it.
The cheerio module is said to be a good choice for parsing DOM structure, so I installed it with npm install cheerio
and then loaded it with var cheerio = require('cheerio');
What I want to extract is the content under five headings: sentence component analysis, detailed sentence grammar structure, sentence-related vocabulary explanations, sentence grammar error check, and sentence-related learning points. To do that, I first look up the ID of each of these sections in the page and then parse them out. I won't go through the parsing process in detail.
Using Node.js to build a small crawler for an English sentence analysis page