Learning Node.js: Implementing HTTP Data Forwarding

Source: Internet
Author: User

In earlier projects I used JSON files as mock data. Later I discovered Mock.js and switched to it, but however the data was simulated, it was still static. So I wanted to use Node.js to implement a data-forwarding function that pulls data from the server to my local environment, and writing such a function for that project turned out to be easy. Recently, while reading blogs, I wanted to collect the content of the first few pages of several blogs onto a single page, so I lightly modified and encapsulated that earlier data-pulling function.

With it I built a simple data-pulling demo.

  

Below I outline how the demo is implemented, as a learning record.

First comes the data forwarding module, which I encapsulated in transdata.js:

"Use Strict";varHTTP = require ("http");varStream = require ("stream");varurl = require ("url");varZlib = require ("Zlib");varNoOp =function () {};//two kinds of requestsvarTransdata ={post:function(opt) {Opt.method= "POST";    Main (opt); }, get:function(opt) {if(Arguments.length >= 2 && (typeofArguments[0] = = "string") && (typeofARGUMENTS[1] = = "function") ) {opt={url:arguments[0], success:arguments[1]            }; if(Arguments[2] && (typeofARGUMENTS[2] = = "function") ) {Opt.error= Arguments[2]; }} Opt.method= "Get";    Main (opt); }};

The top of this code is a thin wrapper that exposes two methods, get and post. Both end up calling the same main function. opt is the options object passed in; it carries the URL, the request object, the response object, and so on.

The main function is as follows:
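To make the two calling styles of get concrete, here is a minimal sketch of the argument handling on its own. normalizeGetArgs is a hypothetical helper written for this illustration (it is not part of transdata.js), and the example.com URL is a placeholder.

```javascript
"use strict";

// Hypothetical helper, not part of transdata.js: it mirrors how get()
// turns the shorthand call get(url, success, error) into an options object.
function normalizeGetArgs(opt) {
    if (arguments.length >= 2 &&
        typeof arguments[0] === "string" &&
        typeof arguments[1] === "function") {
        opt = { url: arguments[0], success: arguments[1] };
        if (typeof arguments[2] === "function") {
            opt.error = arguments[2];
        }
    }
    opt.method = "GET";
    return opt;
}

var onSuccess = function (result) {};

// Shorthand form and options-object form end up with the same shape.
var shorthand = normalizeGetArgs("http://example.com/", onSuccess);
var full = normalizeGetArgs({ url: "http://example.com/", success: onSuccess });

console.log(shorthand.url === full.url);
console.log(shorthand.method, full.method);
```

Either way, the function that does the real work only ever sees a single options object.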

// Main logic for forwarding a request
function main(opt) {
    var options, creq;

    // res can be either a response object or a writable stream;
    // success and error are the callbacks run after the request succeeds or fails
    opt.res = ((opt.res instanceof http.ServerResponse) || (opt.res instanceof stream.Writable)) ? opt.res : null;
    opt.success = (typeof opt.success == "function") ? opt.success : noop;
    opt.error = (typeof opt.error == "function") ? opt.error : noop;

    try {
        opt.url = (typeof opt.url == "string") ? url.parse(opt.url) : null;
    } catch (e) {
        opt.url = null;
    }

    if (!opt.url) {
        opt.error(new Error("URL is illegal"));
        return;
    }

    options = {
        hostname: opt.url.hostname,
        port: opt.url.port,
        path: opt.url.pathname,
        method: opt.method,
        headers: {
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-CN,zh;q=0.8,en;q=0.6,ja;q=0.4,zh-TW;q=0.2',
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.37 Safari/537.36'
        }
    };

    // If req is a readable stream, pipe it into the outgoing request;
    // otherwise write it directly as a string
    if (opt.method == 'POST') {
        if (opt.req instanceof stream.Readable) {
            if (opt.req instanceof http.IncomingMessage) {
                options.headers["Content-Type"] = opt.req.headers["content-type"];
                options.headers["Content-Length"] = opt.req.headers["content-length"];
            }

            process.nextTick(function () {
                opt.req.pipe(creq);
            });
        } else {
            var str = ((typeof opt.req) === "string") ? opt.req : "";

            process.nextTick(function () {
                creq.end(str);
            });
        }
    } else {
        process.nextTick(function () {
            creq.end();
        });
    }

    creq = http.request(options, function (res) {
        reqCallback(opt.res, res, opt.success);
    }).on('error', function (e) {
        opt.error(e);
    });
}

First, the opt parameter is validated. res can be a response object or a writable stream. success and error are the callbacks run after the request succeeds or fails. The URL string is parsed into an object for later use. Because the request is issued from the server side, an options object describing the request has to be built.

Once options is ready, the code checks whether the request being forwarded is a POST or a GET. For a POST, if the req parameter is an incoming request or any other readable stream, it is piped directly into the outgoing request to transfer the data. If req is a string, it is written directly. When the outgoing request receives a response, the reqCallback function is called to process the data.

The reqCallback function is as follows:
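As a minimal sketch of the URL-to-options step described above, the snippet below parses a URL with Node's legacy url.parse and builds a request options object. buildOptions and the example.com address are assumptions made for this illustration. Unlike the module, the sketch also rejects input without a hostname, and it uses path rather than pathname so that the query string is kept.

```javascript
"use strict";

var url = require("url");

// Hypothetical helper for illustration only: turn a URL string into the
// options object that http.request() expects.
function buildOptions(rawUrl, method) {
    var parsed = (typeof rawUrl === "string") ? url.parse(rawUrl) : null;

    // Assumption of this sketch: treat input without a hostname as invalid.
    if (!parsed || !parsed.hostname) {
        return null;
    }

    return {
        hostname: parsed.hostname,
        port: parsed.port,       // a string such as "9000", or null if absent
        path: parsed.path,       // pathname plus query string
        method: method
    };
}

var opts = buildOptions("http://example.com:9000/getdata?id=1", "GET");
var parsedPathname = url.parse("http://example.com:9000/getdata?id=1").pathname;

console.log(opts);
console.log(parsedPathname); // pathname alone drops "?id=1"
console.log(buildOptions(42, "GET"));
```

If the target URLs carry query parameters, the difference between pathname and path decides whether those parameters reach the remote server.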

// Callback after a successful request
function reqCallback(ores, res, callback) {
    if (ores) {
        ores.on('finish', function () {
            callback();
        });

        if (ores instanceof http.ServerResponse) {
            var options = {};

            // Copy response header information
            if (res.headers) {
                for (var k in res.headers) {
                    options[k] = res.headers[k];
                }
            }

            ores.writeHead(200, options);
        }

        res.pipe(ores);
    } else {
        var size = 0;
        var chunks = [];

        res.on('data', function (chunk) {
            size += chunk.length;
            chunks.push(chunk);
        }).on('end', function () {
            var buffer = Buffer.concat(chunks, size);

            // If the data is compressed with gzip or deflate, decompress it with zlib
            if (res.headers && res.headers['content-encoding'] && res.headers['content-encoding'].match(/(\bdeflate\b)|(\bgzip\b)/)) {
                zlib.unzip(buffer, function (err, buffer) {
                    if (!err) {
                        callback(buffer.toString());
                    } else {
                        console.log(err);
                        callback("");
                    }
                });
            } else {
                callback(buffer.toString());
            }
        });
    }
}

The data processing is fairly simple. If a response object (or writable stream) was supplied, the remote response is piped straight into it. Otherwise the data is collected into a buffer; if it was compressed with gzip or deflate, it is decompressed with zlib, and the resulting string is passed to the callback.

Calling transdata is simpler still. A GET can be issued directly:
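The decompression branch can be tried in isolation. The sketch below uses the synchronous zlib calls so that it runs without a network request; decodeBody and the sample string are assumptions made for this illustration, while the module itself uses the asynchronous zlib.unzip.

```javascript
"use strict";

var zlib = require("zlib");

// Hypothetical helper for illustration: decode a response body according
// to its content-encoding header, as the buffering branch does.
function decodeBody(buffer, headers) {
    var encoding = headers && headers["content-encoding"];

    // zlib.unzipSync auto-detects gzip and deflate (zlib-wrapped) data.
    if (encoding && /\b(deflate|gzip)\b/.test(encoding)) {
        return zlib.unzipSync(buffer).toString();
    }

    return buffer.toString();
}

var plain = "hello transdata";
var gzipped = zlib.gzipSync(Buffer.from(plain));
var deflated = zlib.deflateSync(Buffer.from(plain));

console.log(decodeBody(gzipped, { "content-encoding": "gzip" }));
console.log(decodeBody(deflated, { "content-encoding": "deflate" }));
console.log(decodeBody(Buffer.from(plain), {}));
```

All three calls should print the same sample string, since the uncompressed body is simply converted to a string.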

transdata.get(url, function (result) {});

In my project the data is forwarded with a POST request, which is just as simple and straightforward:

var transdata = require("transdata");
var http = require("http");

http.createServer(function (req, res) {
    transdata.post({
        req: req,
        url: 'http://XXX/XX:9000/getdata',
        res: res,
        success: function () {
            console.log("success");
        },
        error: function (e) {
            console.log("error");
        }
    });
});

That covers transdata. Back to the demo mentioned above: with transdata in place, fetching the data is very easy. The code is as follows:

var creeper = function (req, res, urlObj) {
    var header = fs.readFileSync(baseDir + "header.ejs").toString();
    var contents = fs.readFileSync(baseDir + "contents.ejs").toString();
    var foot = fs.readFileSync(baseDir + "foot.ejs").toString();

    res.writeHead(200, {'Content-Type': 'text/html;charset=utf-8'});
    res.write(ejs.render(header, {data: ids}));

    console.log("Start collecting data ...");

    var count = 0;
    for (var i = 0; i < ids.length; i++) {
        (function (index) {
            var id = ids[index];
            var nowSource = source[id];

            transdata.get(nowSource.url, function (result) {
                count++;
                console.log("> " + id + " get √");

                var $ = cheerio.load(result);
                var $colum = $(nowSource.colum);
                result = [];

                $colum.each(function () {
                    result.push(nowSource.handle($(this)));
                });

                // Keep at most nowSource.max items when a numeric limit is configured
                if (typeof nowSource.max == "number") {
                    result = result.slice(0, nowSource.max);
                }

                if (result.length) {
                    var data = {};
                    data[id] = result;
                    result.index = index;

                    var html = ejs.render(contents, {data: data});
                    // Strip line breaks and escape single quotes so the markup
                    // can be embedded in a JavaScript string literal
                    html = html.replace(/(\r|\n)\s*/g, "").replace(/'/g, "\\'");

                    res.write("<script>loadHtml(" + index + ", 'dom_" + index + "', '" + html + "')</script>");
                }

                if (count == ids.length) {
                    console.log("Data acquisition complete.");
                    res.end(foot);
                }
            });
        }(i));
    }
};
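The step that embeds rendered HTML in a script tag is easy to get wrong, so here is a minimal sketch of it on its own. toScriptChunk is a hypothetical helper written for this illustration, and loadHtml stands for whatever client-side function the page defines to inject the fragment. The sketch only strips line breaks and escapes single quotes, as the code above does; it makes no attempt to handle backslashes or a literal closing script tag inside the markup.

```javascript
"use strict";

// Hypothetical helper for illustration: flatten an HTML fragment and wrap
// it in a script tag that hands it to a client-side loadHtml() function.
function toScriptChunk(index, html) {
    var flat = html
        .replace(/(\r|\n)\s*/g, "")   // drop line breaks and following indentation
        .replace(/'/g, "\\'");        // escape single quotes for the JS string literal

    return "<script>loadHtml(" + index + ", 'dom_" + index + "', '" + flat + "')</script>";
}

var chunk = toScriptChunk(0, "<p>it's</p>\n  <p>ok</p>");
console.log(chunk);
```

Because each chunk is written as soon as its source responds, the browser can fill in sections of the page in whatever order the remote sites answer.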

The data that comes back is HTML, and the tool used to process it is cheerio. Its selectors work the same way as jQuery's, so you use cheerio to traverse the markup and extract the data you need. That part is relatively simple, so I won't go into it here.

The full source code of the project is on GitHub:

https://github.com/whxaxes/node-test/tree/master/server/creeper

The GitHub address of transdata.js:

https://github.com/whxaxes/transdata

Take a look if you are interested.
