Node.js Stream Manual


1. Introduction

This article describes the basic techniques for writing programs with Node.js streams.

"We should have some ways of connecting programs like garden hose--screw in another segment when it becomes necessary to massage data in another way. This is the way of IO also." Doug McIlroy, October 11, 1964

Streams date back to the earliest days of Unix and have proved over the decades to be a dependable way to build large systems. In Unix, streams are implemented in the shell with the | pipe character. In Node, the built-in stream module is used by many core modules and third-party modules. As in Unix, the main operation on Node streams is .pipe(), which comes with a backpressure mechanism that keeps reading and writing in balance.

Streams give developers a uniform, reusable interface, and the abstract stream interface controls the read/write balance between streams.

2. Why Stream?

I/O in Node is asynchronous, so reading from and writing to the disk and the network involves callback functions. Below is a simple file download server:

var http = require('http');
var fs = require('fs');

var server = http.createServer(function (req, res) {
    fs.readFile(__dirname + '/data.txt', function (err, data) {
        res.end(data);
    });
});
server.listen(8000);

This code works, but the server has to buffer the entire file in memory before sending any of it. If data.txt is large and there are many concurrent requests, a lot of memory is wasted. The user experience is also poor, because the user receives nothing until the whole file has been read into memory. Fortunately, both (req, res) arguments are streams, so we can use fs.createReadStream() instead of fs.readFile():

var http = require('http');
var fs = require('fs');

var server = http.createServer(function (req, res) {
    var stream = fs.createReadStream(__dirname + '/data.txt');
    stream.pipe(res);
});
server.listen(8000);

The .pipe() method listens for the 'data' and 'end' events from fs.createReadStream(). The data.txt file no longer has to be buffered in full: each chunk can be sent to the client as soon as it is read from disk. Another advantage of .pipe() is that it handles the read/write imbalance caused by clients on slow or high-latency connections. If you also want to compress the file before sending it, you can use a third-party module:

var http = require('http');
var fs = require('fs');
var oppressor = require('oppressor');

var server = http.createServer(function (req, res) {
    var stream = fs.createReadStream(__dirname + '/data.txt');
    stream.pipe(oppressor(req)).pipe(res);
});
server.listen(8000);

Now the file is compressed for browsers that support gzip or deflate. The oppressor module takes care of all the content-encoding handling.

Streams make programs easier to develop.
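If adding a third-party dependency is not an option, a similar effect can be sketched with Node's built-in zlib module. This is only an illustration and not how oppressor itself is implemented: the helper names pickEncoder and sendCompressed are hypothetical, and the Accept-Encoding check is deliberately simplistic (it ignores quality values).

```javascript
const zlib = require('zlib');
const { pipeline, PassThrough } = require('stream');

// Hypothetical helper: choose a compression stream from an Accept-Encoding value.
function pickEncoder(acceptEncoding) {
  const accepted = String(acceptEncoding || '');
  if (/\bgzip\b/.test(accepted)) return { name: 'gzip', stream: zlib.createGzip() };
  if (/\bdeflate\b/.test(accepted)) return { name: 'deflate', stream: zlib.createDeflate() };
  return { name: 'identity', stream: new PassThrough() };
}

// Hypothetical helper: pipe src through the chosen encoder into dst.
// pipeline() reports success or the first error to done.
function sendCompressed(src, dst, acceptEncoding, done) {
  const encoder = pickEncoder(acceptEncoding);
  // Only HTTP responses have setHeader(); plain streams are left alone.
  if (typeof dst.setHeader === 'function' && encoder.name !== 'identity') {
    dst.setHeader('Content-Encoding', encoder.name);
  }
  pipeline(src, encoder.stream, dst, done);
  return encoder.name;
}
```

Inside the request handler this could be called as sendCompressed(fs.createReadStream(__dirname + '/data.txt'), res, req.headers['accept-encoding'], callback), where callback receives any error raised along the pipeline.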

3. Basic Concepts

There are five basic kinds of streams: readable, writable, transform, duplex, and "classic".

3-1. pipe

All types of streams use .pipe() to pair an input with an output. .pipe() takes a readable stream src and sends its data to a writable stream dst, as follows:

src.pipe(dst)

.pipe( dst ) returns the dst stream, so you can chain several .pipe() calls, as follows:

a.pipe( b ).pipe( c ).pipe( d )

This does the same thing as the following code:

a.pipe( b );
b.pipe( c );
c.pipe( d );
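
Chaining works because .pipe() returns its argument. The following minimal sketch checks this, using PassThrough streams from the built-in stream module as stand-ins for a, b, and c (the variable names last and received are illustrative only):

```javascript
const { PassThrough } = require('stream');

const a = new PassThrough();
const b = new PassThrough();
const c = new PassThrough();

// .pipe(dst) returns dst, so the second call here is really b.pipe(c).
const last = a.pipe(b).pipe(c);
console.log(last === c); // prints true

// Data written to a travels through b and comes out of c.
const received = [];
c.on('data', function (chunk) { received.push(chunk.toString()); });
c.on('end', function () { console.log(received.join('')); });

a.write('beep ');
a.end('boop');
```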
3-2. readable streams

Calling .pipe() on a readable stream writes its data to a writable, transform, or duplex stream.

readableStream.pipe( dst )

1> Create a readable stream

Here we create a readable stream!

var Readable = require('stream').Readable;

var rs = new Readable;
rs.push('beep ');
rs.push('boop\n');
rs.push(null);

rs.pipe(process.stdout);

$ node read0.js
beep boop

rs.push(null) tells the consumer that the stream has finished sending data.

Note that we did not call rs.pipe(process.stdout) until after all the data had been pushed, yet everything we pushed was still output in full. This is because a readable stream buffers pushed data until a consumer reads it. In many cases, however, it is better not to buffer the data at all and to push it only when the consumer asks for it. Let's rewrite the example by defining a ._read() function:

var Readable = require('stream').Readable;
var rs = Readable();

var c = 97;
rs._read = function () {
    rs.push(String.fromCharCode(c++));
    if (c > 'z'.charCodeAt(0)) rs.push(null);
};

rs.pipe(process.stdout);

$ node read1.js
abcdefghijklmnopqrstuvwxyz

The code above defines a _read() method so that data is pushed into the readable stream only when a consumer requests it. _read() also receives a size parameter indicating how many bytes the consumer wants to read, but the readable stream may ignore this parameter if it wishes.

Note that we can also use util.inherits() to subclass a readable stream. To demonstrate that _read() is called only when the consumer requests data, we add a delay before pushing data into the readable stream, as shown below:

var Readable = require('stream').Readable;
var rs = Readable();

var c = 97 - 1;

rs._read = function () {
    if (c >= 'z'.charCodeAt(0)) return rs.push(null);

    setTimeout(function () {
        rs.push(String.fromCharCode(++c));
    }, 100);
};

rs.pipe(process.stdout);

process.on('exit', function () {
    console.error('\n_read() called ' + (c - 97) + ' times');
});
process.stdout.on('error', process.exit);

Running the program with the following command shows that _read() is called only five times:

$ node read2.js | head -c5
abcde
_read() called 5 times

The setTimeout delay is needed because the operating system takes some time to send the signal that tells the program the pipe has been closed. The process.stdout.on('error', fn) handler is needed because the operating system sends SIGPIPE to our process when the head command closes the pipe, which causes process.stdout to emit an EPIPE error.

To create a readable stream that can push values of any type, set the objectMode option to true when creating the stream, for example: Readable({ objectMode: true }).
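
The util.inherits() approach and the objectMode option mentioned above can be combined. The sketch below is one possible shape, not a prescribed pattern: the Counter class is hypothetical and simply emits a few plain objects on demand.

```javascript
const { Readable } = require('stream');
const util = require('util');

// Hypothetical example class: emits the objects { n: 1 } ... { n: limit }.
function Counter(limit) {
  Readable.call(this, { objectMode: true });
  this.limit = limit;
  this.n = 0;
}
util.inherits(Counter, Readable);

// Called by the stream machinery only when a consumer wants more data.
Counter.prototype._read = function () {
  if (this.n >= this.limit) return this.push(null);
  this.n += 1;
  this.push({ n: this.n });
};

const counter = new Counter(3);
counter.on('data', function (obj) {
  console.log(obj.n);
});
```

Because the stream is in object mode, each 'data' event delivers one pushed object rather than a Buffer.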

2> Read data from a readable stream

In most cases it is enough to use .pipe() to redirect the data of a readable stream to another stream, but occasionally it is more useful to read from a readable stream directly, as follows:

process.stdin.on('readable', function () {
    var buf = process.stdin.read();
    console.dir(buf);
});

$ (echo abc; sleep 1; echo def; sleep 1; echo ghi) | node consume0.js
<Buffer 61 62 63 0a>
<Buffer 64 65 66 0a>
<Buffer 67 68 69 0a>
null

When data becomes available, a readable stream emits a 'readable' event, and you can call .read() to fetch data from the buffer. When there is no more data to read, .read() returns null, and you wait for the next 'readable' event before calling .read() again. Below is an example that uses .read(n) to read 3 bytes from standard input at a time:

process.stdin.on('readable', function () {
    var buf = process.stdin.read(3);
    console.dir(buf);
});

Running this code shows that the output is incomplete!

$ (echo abc; sleep 1; echo def; sleep 1; echo ghi) | node consume1.js
<Buffer 61 62 63>
<Buffer 0a 64 65>
<Buffer 66 0a 67>

This is because extra data is left in the stream's internal buffer, and we need to tell the stream that we want to read more. Calling .read(0) does this.

process.stdin.on('readable', function () {
    var buf = process.stdin.read(3);
    console.dir(buf);
    process.stdin.read(0);
});

The running result is as follows:

$ (echo abc; sleep 1; echo def; sleep 1; echo ghi) | node consume2.js
<Buffer 61 62 63>
<Buffer 0a 64 65>
<Buffer 66 0a 67>
<Buffer 68 69 0a>
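
Another way to avoid leaving data stranded in the buffer is to keep calling .read(n) inside the 'readable' handler until it returns null. The sketch below assumes a reasonably recent Node version; the helper name readInChunks is hypothetical, and an in-memory stream stands in for the piped standard input used in the examples above.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: read `stream` in pieces of n bytes and hand
// the collected pieces to `done` once the stream ends.
function readInChunks(stream, n, done) {
  const pieces = [];
  stream.on('readable', function () {
    let buf;
    // Drain everything that is currently buffered, n bytes at a time.
    while ((buf = stream.read(n)) !== null) {
      pieces.push(buf.toString());
    }
  });
  stream.on('end', function () { done(pieces); });
}

// In-memory stand-in for standard input.
const source = new Readable({ read: function () {} });
source.push('abc\ndef\nghi\n');
source.push(null);

readInChunks(source, 3, function (pieces) {
  console.dir(pieces);
});
```

Once the stream has ended, .read(n) returns whatever is left even if it is shorter than n, so no trailing bytes are lost.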

We can use .unshift() to put data back at the head of the read queue, so that the same read logic runs again when .read() returns more data than we wanted. The following code prints standard input line by line:

var offset = 0;

process.stdin.on('readable', function () {
    var buf = process.stdin.read();
    if (!buf) return;
    for (; offset < buf.length; offset++) {
        if (buf[offset] === 0x0a) {
            console.dir(buf.slice(0, offset).toString());
            buf = buf.slice(offset + 1);
            offset = 0;
            process.stdin.unshift(buf);
            return;
        }
    }
    process.stdin.unshift(buf);
});

$ tail -n +50000 /usr/share/dict/american-english | head -n10 | node lines.js
'hearties'
'heartiest'
'heartily'
'heartiness'
'heartiness\'s'
'heartland'
'heartland\'s'
'heartlands'
'heartless'
'heartlessly'

Of course, there are many modules that already implement this, such as split.
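
For simple cases, Node's built-in readline module can also split a stream into lines without a third-party package. The following is a rough sketch rather than a drop-in replacement for split; the helper name eachLine is hypothetical, and an in-memory stream is used in place of standard input.

```javascript
const readline = require('readline');
const { Readable } = require('stream');

// Hypothetical helper: call onLine for each line of `input`,
// then onClose once the input has ended.
function eachLine(input, onLine, onClose) {
  const rl = readline.createInterface({ input: input, crlfDelay: Infinity });
  rl.on('line', onLine);
  rl.on('close', onClose);
}

// In-memory stand-in for standard input.
const input = new Readable({ read: function () {} });
input.push('hearties\nheartiest\nheartily\n');
input.push(null);

eachLine(input, function (line) {
  console.dir(line);
}, function () {});
```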

3-3. writable streams

A writable stream can only be used as the destination argument of .pipe(), as in the following code:

src.pipe( writableStream );

1> Create a writable stream

Define a ._write(chunk, enc, next) method, and the writable stream can then accept data from a readable stream:

var Writable = require('stream').Writable;
var ws = Writable();
ws._write = function (chunk, enc, next) {
    console.dir(chunk);
    next();
};

process.stdin.pipe(ws);

$ (echo beep; sleep 1; echo boop) | node write0.js
<Buffer 62 65 65 70 0a>
<Buffer 62 6f 6f 70 0a>

The first parameter, chunk, is the data written by the producer. The second parameter, enc, is the encoding of the data. The third parameter, next(err), is a callback that tells the producer it can write more data. If the readable stream writes a string, the string is converted to a Buffer unless the writable stream was created with Writable({ decodeStrings: false }). If the readable stream writes objects, the writable stream needs to be created like this:

Writable({ objectMode: true })
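
As a small illustration of object mode on the writable side, the sketch below stores every written object in an array. The names sink and rows are made up for this example.

```javascript
const { Writable } = require('stream');

const rows = [];
const sink = new Writable({
  objectMode: true,
  // In object mode, chunk is whatever value the producer wrote.
  write: function (chunk, enc, next) {
    rows.push(chunk);
    next();
  }
});

sink.on('finish', function () {
  console.log(rows.length + ' objects received');
});

sink.write({ beep: 'boop' });
sink.end({ n: 2 });
```
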
2> Write data to a writable stream

Call the writable stream's .write(data) method to write data to it.

process.stdout.write('beep boop\n');

Call the .end() method to tell the writable stream that you have finished writing.

var fs = require('fs');
var ws = fs.createWriteStream('message.txt');

ws.write('beep ');

setTimeout(function () {
    ws.end('boop\n');
}, 1000);

$ node writing1.js
$ cat message.txt
beep boop

To set the buffer size of a writable stream, set opts.highWaterMark when creating the stream. Then, when the buffered data exceeds opts.highWaterMark, .write(data) returns false. When the buffer becomes writable again, the writable stream emits a 'drain' event.
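
The interplay between the return value of .write() and the 'drain' event can be sketched as follows. The helper writeMany and the deliberately slow destination are hypothetical and only illustrate the pattern: stop writing when .write() returns false, and continue when 'drain' fires.

```javascript
const { Writable } = require('stream');

// Hypothetical helper: write `total` chunks to ws, pausing whenever
// .write() returns false and resuming on 'drain'.
function writeMany(ws, total, done) {
  let i = 0;
  function next() {
    while (i < total) {
      const ok = ws.write('chunk ' + i + '\n');
      i += 1;
      if (!ok) {
        // Buffer is full: wait for 'drain' before writing more.
        ws.once('drain', next);
        return;
      }
    }
    ws.end(done);
  }
  next();
}

// A deliberately slow destination with a tiny buffer (16 bytes).
let written = 0;
const slow = new Writable({
  highWaterMark: 16,
  write: function (chunk, enc, cb) {
    written += 1;
    setImmediate(cb);
  }
});

writeMany(slow, 10, function () {
  console.log('wrote ' + written + ' chunks');
});
```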

3-4. classic streams

Classic streams are the old interface that first appeared in node 0.4, but it is still worth understanding how they work. When a "data" event listener is registered on a stream, the stream switches to the old mode and behaves according to the old API.

1> Classic readable streams

A classic readable stream is just an event emitter. It emits a "data" event whenever it has data for its consumers, and an "end" event when it has no more data. The .pipe() method checks stream.readable to decide whether a classic stream can be read from. Here is an example that prints the letters A through J with a classic readable stream:

var Stream = require('stream');
var stream = new Stream;
stream.readable = true;

var c = 64;
var iv = setInterval(function () {
    if (++c >= 75) {
        clearInterval(iv);
        stream.emit('end');
    }
    else stream.emit('data', String.fromCharCode(c));
}, 100);

stream.pipe(process.stdout);

$ node classic0.js
ABCDEFGHIJ

To read from a classic readable stream, register callbacks for the "data" and "end" events, as in the following code:

process.stdin.on('data', function (buf) {
    console.log(buf);
});
process.stdin.on('end', function () {
    console.log('__END__');
});

$ (echo beep; sleep 1; echo boop) | node classic1.js
<Buffer 62 65 65 70 0a>
<Buffer 62 6f 6f 70 0a>
__END__

Note that reading data this way gives up the benefits of the new interface. For example, when writing to a stream with very high latency, you have to balance reads against writes yourself; otherwise a large amount of data is buffered in memory and wasted. In that situation it is strongly recommended to use the stream's .pipe() method, so that you neither have to listen for the "data" and "end" events yourself nor worry about read/write imbalance. You can also use through instead of listening for "data" and "end" directly, as shown in the following code:

var through = require('through');
process.stdin.pipe(through(write, end));

function write (buf) {
    console.log(buf);
}
function end () {
    console.log('__END__');
}

$ (echo beep; sleep 1; echo boop) | node through.js
<Buffer 62 65 65 70 0a>
<Buffer 62 6f 6f 70 0a>
__END__

Alternatively, you can use concat-stream to buffer the entire content of a stream:

var concat = require('concat-stream');
process.stdin.pipe(concat(function (body) {
    console.log(JSON.parse(body));
}));

$ echo '{"beep":"boop"}' | node concat.js
{ beep: 'boop' }
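
The same idea can be sketched without a dependency by collecting the chunks and joining them at the end. This is an illustration of the buffering technique, not of how concat-stream is implemented; the helper name collect is hypothetical, and an in-memory stream replaces standard input.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: gather every chunk and pass one Buffer to done.
function collect(stream, done) {
  const chunks = [];
  stream.on('data', function (chunk) { chunks.push(Buffer.from(chunk)); });
  stream.on('error', done);
  stream.on('end', function () { done(null, Buffer.concat(chunks)); });
}

// In-memory stand-in for standard input, delivered in two pieces.
const src = new Readable({ read: function () {} });
src.push('{"beep":');
src.push('"boop"}');
src.push(null);

collect(src, function (err, body) {
  if (err) throw err;
  console.log(JSON.parse(body.toString()));
});
```

Like concat-stream, this holds the whole body in memory, so it is only appropriate when the content is known to be small.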

Of course, if you have to listen for the "data" and "end" events yourself, you can call .pause() to stop a classic readable stream from emitting "data" events while the stream you are writing to is not writable. Once that stream is writable again, call .resume() to tell the readable stream to continue emitting "data" events and reading data.
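
The .pause() and .resume() pattern described above might look like the following sketch. It uses the built-in PassThrough and Writable streams as stand-ins, since they also support .pause(), .resume(), and 'drain'; the helper name forward is hypothetical.

```javascript
const { PassThrough, Writable } = require('stream');

// Hypothetical helper: forward src to dst by hand, respecting backpressure.
function forward(src, dst) {
  src.on('data', function (chunk) {
    if (!dst.write(chunk)) {
      // dst's buffer is full: stop "data" events until it drains.
      src.pause();
      dst.once('drain', function () { src.resume(); });
    }
  });
  src.on('end', function () { dst.end(); });
}

// A slow destination with a tiny buffer so that backpressure kicks in.
const out = [];
const slowDst = new Writable({
  highWaterMark: 4,
  write: function (chunk, enc, cb) {
    out.push(chunk.toString());
    setImmediate(cb);
  }
});
slowDst.on('finish', function () { process.stdout.write(out.join('')); });

const source = new PassThrough();
forward(source, slowDst);
source.write('beep ');
source.write('boop');
source.end('\n');
```

This is essentially what .pipe() does internally, which is why .pipe() is usually the simpler choice.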

2> Classic writable streams

Classic writable streams are very simple. They have only three methods: .write(buf), .end(buf), and .destroy(). The buf argument of .end(buf) is optional; if it is given, the call is equivalent to stream.write(buf); stream.end(). When the stream's buffer is full and it cannot accept more data, .write(buf) returns false. When the stream becomes writable again, it emits a "drain" event.

4. transform

A transform stream takes the data written to it, filters or transforms that data, and makes the result available for reading.
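
A minimal sketch of a transform stream, using the Transform class from the built-in stream module. The upperCase factory is a hypothetical example, not something from a library.

```javascript
const { Transform } = require('stream');

// Hypothetical example: upper-case every chunk that passes through.
function upperCase() {
  return new Transform({
    transform: function (chunk, enc, next) {
      // The second argument to next() becomes the output for this chunk.
      next(null, chunk.toString().toUpperCase());
    }
  });
}

const shout = upperCase();
shout.pipe(process.stdout);
shout.write('beep ');
shout.end('boop\n');
```

Because a transform stream is both writable and readable, it can sit in the middle of a chain such as src.pipe(upperCase()).pipe(dst).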

5. duplex

A duplex stream is a two-way stream that is both readable and writable. In the following, a is a duplex stream:

a.pipe(b).pipe(a)
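
To make the two sides of a duplex stream concrete, here is a small sketch built on the Duplex class from the built-in stream module. The createDuplex helper is hypothetical: its readable side emits a fixed list of strings, and its writable side independently records whatever is written to it.

```javascript
const { Duplex } = require('stream');

// Hypothetical helper: the readable side emits the strings in `outgoing`,
// while the writable side records whatever is written into `received`.
function createDuplex(outgoing, received) {
  const pending = outgoing.slice();
  return new Duplex({
    read: function () {
      // Emit the next queued string, or null to end the readable side.
      this.push(pending.length > 0 ? pending.shift() : null);
    },
    write: function (chunk, enc, next) {
      received.push(chunk.toString());
      next();
    }
  });
}

const received = [];
const d = createDuplex(['ping\n'], received);
d.on('finish', function () {
  console.log(received);
});
d.pipe(process.stdout); // readable side
d.end('pong');          // writable side
```

Unlike a transform stream, the data read from a duplex stream need not be derived from the data written to it.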
