Node Buffer/Stream Memory Policy Analysis

In node, Buffer is a heavily used class. This article analyzes its memory policy at the following levels:
- The user level: calls to new Buffer, whether from node's lib/*.js or your own JS files
- Socket read/write
- File read/write
User Buffer
In the lib/buffer.js module there is a module-private variable pool, which points to the current 8K slab:
```js
Buffer.poolSize = 8 * 1024;
var pool;

function allocPool() {
  pool = new SlowBuffer(Buffer.poolSize);
  pool.used = 0;
}
```
SlowBuffer is exported from src/node_buffer.cc. When the user calls new Buffer, if the requested space is larger than 8K, node calls SlowBuffer directly; if it is smaller than 8K, the new buffer is carved out of the current slab:
- The newly created buffer's parent member points to this slab,
- and its offset member records the buffer's offset within the slab:
```js
if (!pool || pool.length - pool.used < this.length) allocPool();
this.parent = pool;
this.offset = pool.used;
pool.used += this.length;
```
For example, when you need 2K of space with new Buffer(2*1024), node checks the slab's remaining space; if there is enough, it hands you that free space and advances the slab's accounting: pool.used += 2*1024. Suppose we call new Buffer(2*1024) twice in a row.
When we then ask for another 5K, the current pool has only 4K left, so node allocates a fresh 8K slab and points pool at it; note that the original slab's remaining 4K is wasted:
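A minimal sketch of this slab accounting, using the pool/used names from the lib/buffer.js code above (the numbers assume a fresh 8K slab):

```js
// pool.used starts at 0 on a fresh 8K slab
var a = new Buffer(2 * 1024); // a.offset = 0,    pool.used -> 2048
var b = new Buffer(2 * 1024); // b.offset = 2048, pool.used -> 4096
var c = new Buffer(5 * 1024); // only 4096 bytes left in this slab:
                              // allocPool() creates a new 8K slab and
                              // c is carved from it; the old slab's
                              // remaining 4096 free bytes are wasted
```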
At this point the original slab is referenced by the two 2K buffers, so only when both buffer references have been set to null will V8 consider the slab collectable.
Note that if a slab is still referenced by even a 1-byte buffer, the whole 8K slab cannot be recycled, even after every other reference has been nulled:
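A small sketch of this pitfall (the variable names are illustrative):

```js
var tiny = new Buffer(1);         // 1 byte carved from the current 8K slab
var big  = new Buffer(4 * 1024);  // 4K from the same slab

big = null;
// Even with `big` gone, tiny.parent still points at the slab, so V8
// cannot reclaim the underlying 8K SlowBuffer while `tiny` is alive.
console.log(tiny.parent.length);  // 8192
```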
Socket read/write

First let's look at the stream read. In stream_wrap the strategy is similar to the user-level new Buffer, except the slab size is 1MB. Here we need to think about the buffer size used for the socket read operation. Imagine that our data is 30K long but our read buffer is only 2K: we would need at least 15 socket read operations, triggering 15 "data" events, each of which has to carry the event and its data from the libuv level up to the user's JS level, which is extremely inefficient. So we want a larger buffer. In libuv's unix/stream.c, when the read watcher bound to the socket fires, the uv__read function is called, and it hard-codes the buffer size to 64*1024:
```c
buf = stream->alloc_cb((uv_handle_t*)stream, 64 * 1024);
```
alloc_cb is defined in stream_wrap.cc:
```cpp
uv_buf_t StreamWrap::OnAlloc(uv_handle_t* handle, size_t suggested_size)
```
But in practice a socket read rarely returns a full 64K. If nread is only 2K, then to avoid waste we can roll back slab_used:
```cpp
if (handle_that_last_alloced == handle) {
  slab_used -= (buf.len - nread);
}
```
Note that we can do this only because the buffer is allocated when the read event is detected on the socket: the sequence alloc_cb → socket read → read callback is strictly sequential, with nothing interleaved! (I don't understand why node still needs the extra check if (handle_that_last_alloced == handle); if you know, please tell me.)
So for socket reads, buffer management is controlled in stream_wrap: libuv's stream.c performs the read, the callback it invokes is also defined in stream_wrap, and the resulting buffer is then handed layer by layer up to user JS, i.e. our "data" event. There is no extra memory copy along the way, which is quite efficient. But there is a catch: if you keep a persistent reference to a floating buffer from a stream read, you prevent the 1M slab it belongs to from ever being released!
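A sketch of the pitfall and the fix (the server and port are illustrative):

```js
var net = require('net');
var kept = [];

net.createServer(function(socket) {
  socket.on('data', function(chunk) {
    // BAD: `chunk` is a slice of the 1MB stream_wrap slab; holding on
    // to it keeps the whole slab alive.
    // kept.push(chunk);

    // Better: copy out the bytes you need, then let `chunk` go.
    var copy = new Buffer(chunk.length);
    chunk.copy(copy);
    kept.push(copy);
  });
}).listen(8124);
```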
Now let's look at Socket.prototype.write. When you pass in a string, node automatically creates a buffer from it; if you pass in a buffer yourself, this step is skipped (note that this goes through the user-level new Buffer):
```js
// Change strings to buffers. SLOW
if (typeof data == 'string') {
  data = new Buffer(data, encoding);
}
```
The pointer backing this buffer is then passed down layer by layer until it reaches the corresponding write function in libuv's stream.c; there is no extra copy anywhere in the process. One thing deserves special attention: when you pass a buffer in directly, you must not modify it until the socket.write callback has returned, because the layers below are, or soon will be, operating on it!
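A minimal sketch of the safe pattern (assumes a server is listening on the illustrative port 8124):

```js
var net = require('net');

var socket = net.connect(8124, function() {
  var buf = new Buffer('hello world');
  socket.write(buf, function() {
    // The write has completed; only now is it safe to reuse `buf`.
    buf.fill(0);
  });
  // Do NOT modify `buf` here: libuv may still be reading its memory.
});
```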
File read/write
Writing regular files is similar to the socket case and holds no surprises, so let's focus on file reads.
As for how much the buffer size matters for I/O performance, beyond what we saw above, I recall Stevens also published benchmark results on this in APUE; we won't repeat them here.
When creating an fs.ReadStream, we can pass in some options:
```js
{ flags: 'r', encoding: null, fd: null, mode: 0666, bufferSize: 64 * 1024 }
```
The default bufferSize is 64K, but lib/fs.js also has a pool-size control variable:
```js
var kPoolSize = 40 * 1024;
```
When node finally gets around to actually calling fs.read:
```js
var thisPool = pool;
var toRead = Math.min(pool.length - pool.used, this.bufferSize);
var start = pool.used;
```
node takes the smaller of the user-supplied bufferSize and the remaining space in the current pool, so the default of 64*1024 never actually takes effect.
Fine, 40K is acceptable. But if the files you read are mostly small, say at the 1K or 2K level, then we reserve a 40K buffer up front, and when the read comes back only 1K or 2K of it was actually used. This time node cannot do what the socket read did and subtract the unused 39K or 38K from pool.used, because the actual fs.read runs on a separate thread: buf alloc → fs read → read cb is not sequential, so we can no longer roll pool.used back the way the socket path rolls back slab_used! In this case the memory waste is quite serious!
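A sketch of what that looks like in practice (the file name is illustrative; parent/length behavior as observed on node 0.6-era buffers):

```js
var fs = require('fs');

// Read a ~1K file with the defaults.
var s = fs.createReadStream('tiny-1k-file.txt');
s.on('data', function(buf) {
  // `buf` is a ~1K slice of the 40K fs pool; the rest of the space
  // reserved for this read stays stranded in pool.used, and keeping
  // `buf` around pins the whole 40K parent slab.
  console.log(buf.length, buf.parent.length); // e.g. 1024 40960
});
```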
So when you want to cache a large number of small files, as in a static file server, my advice is to allocate one big chunk buffer of your own, copy the floating buffers coming out of fs.ReadStream into it, and slice the corresponding small buffers off that chunk. That way we hold no references to the ReadStream's floating buffers, and they can be reclaimed by V8. Of course, if you have memory to burn, forget I said anything...
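A minimal sketch of that chunk-buffer cache (the names cacheFile, chunk, chunkUsed are mine, and overflow handling is omitted for brevity):

```js
var fs = require('fs');

var chunk = new Buffer(1024 * 1024); // our own big slab
var chunkUsed = 0;
var cache = {};

function cacheFile(path, cb) {
  var start = chunkUsed;
  var stream = fs.createReadStream(path);
  stream.on('data', function(floating) {
    // Copy out of fs's 40K pool so the floating buffer can be GC'd.
    floating.copy(chunk, chunkUsed);
    chunkUsed += floating.length;
  });
  stream.on('end', function() {
    // The cached slice references only our chunk, not the fs pool.
    cache[path] = chunk.slice(start, chunkUsed);
    cb(cache[path]);
  });
}
```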
Memory pool
Finally, let's look at the bottom layer, node_buffer:
```cpp
void Buffer::Replace(char *data, size_t length, free_callback callback, void *hint)
```
The memory operation of this function is simple:
```cpp
// ...
delete [] data_;
// ...
data_ = new char[length_];
// ...
```
In fact, as the analysis above shows, a busy network server is likely to new/delete 8K or 1M memory blocks very frequently, and a static file service may also churn through 40K blocks. So I tried adding an 8K memory-block pool to node; under load the hit rate was close to 100%, but unfortunately the overall performance gain fell short of expectations. I won't embarrass myself with the details here; interested readers can try their own hack, and if you get results, let me know (http://weibo.com/windyrobin)...
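A toy sketch of the free-list idea (the real experiment was done in C++ inside node's buffer layer; this JS version only illustrates the concept, and the names are mine):

```js
var FREE = [];          // free list of recycled 8K blocks
var BLOCK = 8 * 1024;

function acquire() {
  return FREE.length ? FREE.pop() : new Buffer(BLOCK);
}

function release(buf) {
  if (buf.length === BLOCK) FREE.push(buf); // recycle only 8K blocks
}
```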
Summary

From the above analysis, we know:
- Do not casually keep persistent references to the floating buffers produced by socket reads or fs.ReadStream
- When you call stream.write and pass in a buffer directly, do not modify it until the operation's callback returns
- When calling fs.createReadStream, if you can estimate the file size, pass in a bufferSize close to it
- A persistent reference to a buffer, even a one-byte one, keeps its underlying slab (possibly 8K/1M...) from being released
Appendix: the analysis above is based on the node 0.6 series. I have filed several issues with the Node project about these points, and the developers are improving on the problems exposed above.
Source: https://cnodejs.org/topic/4f16442ccae1f4aa27001067