Performance optimizations
Original: https://strongloop.com/strongblog/performance-node-js-v-0-12-whats-new/
January 2014, in Community, Node.js v0.12, StrongNode, by Ben Noordhuis
The long development cycle of Node.js v0.12 (nine months and counting, the longest so far) has given the core team and contributors ample opportunity to land a number of performance optimizations. This post covers the most important ones.
Writable streams support corked mode:
Writable streams now support a "corked" mode, analogous to the TCP_CORK and TCP_NOPUSH socket options.
While corked, data written to the stream is queued until the stream is uncorked. This lets Node.js coalesce small writes, reducing system calls and TCP round trips.
The HTTP module has been upgraded to use corked mode transparently when sending chunked requests or responses. If you inspect the process with strace, you will notice writev() system calls where there used to be many small write() calls.
TLS performance improvements:
The TLS module received many changes in v0.12.
In v0.10, the TLS module sat on top of the net module, transparently decrypting the network stream. This layering appealed to engineers' sensibilities but came at a cost: strictly unnecessary memory copies and extra trips in and out of the V8 VM. It was therefore a prime target for optimization.
That is why in v0.12 the TLS module sits directly on top of libuv. It now receives the network data directly and decrypts it without going through an intermediate layer.
Informal benchmarking with the NULL cipher suggests that TLS is now about 10% faster than before and consumes less memory as well. (I should note that the memory reduction is probably due in part to improved memory management, another v0.12 optimization.)
(Also, in case you were wondering: the NULL cipher does not actually encrypt the data; it is used to measure the overhead of the framework and the protocol itself.)
Most changes to the TLS module should be transparent to end users. The most visible one is that TLS connections now inherit from tls.TLSSocket instead of tls.CryptoStream.
Crypto performance improvements:
Several cipher algorithms should now be faster than before, sometimes considerably so. Some background:
Node.js's crypto system is built on the OpenSSL library. Some of the algorithm implementations in OpenSSL are written in C, and some also come with hand-written assembly for particular platforms and architectures.
v0.10 already made extensive use of the assembly implementations. On top of that, v0.12 uses AES-NI when the CPU supports it, which most x86 CPUs from the last three or four years do.
On Linux, your system supports AES-NI if the command grep ^flags /proc/cpuinfo | grep -w aes finds any matches. Note that virtualization software such as VMware or VirtualBox may hide certain CPU capabilities, including AES-NI.
The surprising consequence of enabling AES-NI is that an industrial-strength cipher suite such as AES128-GCM-SHA256 is now faster than a non-encrypting one such as NULL-MD5!
Less garbage collection pressure:
A side effect of the multi-context refactoring is that it reduced the number of persistent handles in the Node.js core.
A persistent handle is a strong reference to an object on the V8 heap; the object will not be reclaimed by the garbage collector until the reference is deleted. (The GC treats it as an artificial GC root.)
Node.js uses persistent handles to cache frequently used values, such as strings and object prototypes. However, persistent handles require special post-processing passes by the GC, at a cost that grows linearly with the number of handles.
As part of the multi-context cleanup, quite a few persistent handles were removed or switched to a lighter-weight mechanism called "eternal handles".
The net result is that your program spends less time in the GC and more time doing actual work. V8::internal::GlobalHandles::PostGarbageCollectionProcessing() should now figure less prominently in --prof output.
Better cluster performance:
See the previous article.
Faster timers, faster setImmediate(), faster process.nextTick():
setTimeout() and related functions now use a time source that is both faster and immune to clock drift. This optimization applies on all platforms, but on Linux we go one step further and read the current time from the vDSO, greatly reducing the number of gettimeofday() and clock_gettime() system calls.
setImmediate() and process.nextTick() were also tuned: dispatch now takes a fast path in the common case. These functions were already pretty fast, but now they are faster still.
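As a small illustration of how these three primitives relate (a sketch, not from the article), their callbacks run in different phases of the event loop:

```javascript
// Observe scheduling order: process.nextTick() runs before the event loop
// continues, while setTimeout(..., 0) and setImmediate() run in later phases.
const order = [];

setTimeout(() => order.push('setTimeout'), 0);
setImmediate(() => order.push('setImmediate'));
process.nextTick(() => order.push('nextTick'));

setTimeout(() => {
  // nextTick always fires first; the relative order of setTimeout(0) and
  // setImmediate() scheduled from the main script is not guaranteed.
  console.log(order[0]); // prints "nextTick"
}, 20);
```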