Getting the Most Performance out of Node.js Programs


A Node.js process runs on only a single physical core. Because of this, you need to pay special attention when developing scalable servers.

Because there is a set of stable APIs, plus native extensions for managing processes, there are many different ways to design a Node.js application that executes work in parallel. In this blog post, we compare these possible architectures.

This article also introduces the compute-cluster module: a small Node.js library that makes it convenient to manage processes.

Problems encountered

In the Mozilla Persona project we needed to handle a large number of requests with different characteristics, so we turned to Node.js.

To keep the user experience smooth, the 'interactive' requests we designed require only lightweight computation, but they must respond quickly so the UI never stalls. In contrast, 'batch' operations take roughly half a second to process, and may be delayed even longer for other reasons.


Looking for a good design, we found many solutions that could meet our current needs.
Considering scalability and cost, we settled on the following key requirements:

  • Efficiency: make effective use of all idle cores
  • Responsiveness: our application should respond quickly, in real time
  • Grace: when request volume grows beyond what we can handle, we handle what we can, and clearly report errors for what we cannot
  • Simplicity: the solution must be simple and convenient to adopt

With these criteria, we can filter the candidate solutions clearly and purposefully.
 

Solution 1: Process directly on the main thread.

When the main thread processes data directly, the result is poor:

You cannot take advantage of a multi-core CPU at all, and every interactive request (or response) must wait until the current computation finishes.

The only advantage of this solution is that it is simple enough.
 

function myRequestHandler(request, response) {
  // Let's bring everything to a grinding halt for half a second.
  var results = doComputationWorkSync(request.somesuch);
}

If your Node.js program needs to handle multiple requests at the same time, this approach leaves you stuck: every request has to wait for all the others.

Solution 2: Use asynchronous processing.

If asynchronous methods are used behind the scenes, will performance improve dramatically?

Not necessarily. It depends on whether the work actually runs in the background.

For example, if the computation is implemented in JavaScript or in native code that still executes on the main thread, performance is no better than with synchronous processing, so wrapping it in an asynchronous API gains nothing.

Consider the following code:
 

function doComputationWork(input, callback) {
  // Because the internal implementation of this asynchronous
  // function is itself synchronously run on the main thread,
  // you still starve the entire process.
  var output = doComputationWorkSync(input);
  process.nextTick(function() {
    callback(null, output);
  });
}

function myRequestHandler(request, response) {
  // Even though this *looks* better, we're still bringing everything
  // to a grinding halt.
  doComputationWork(request.somesuch, function(err, results) {
    // ... do something with results ...
  });
}
The key point is that using Node.js asynchronous APIs does not, by itself, make your application use multiple processes.

Solution 3: Use a threaded library for asynchronous processing.

With a proper implementation, a library written in native code can break through this restriction when called from Node.js and run work on multiple threads.

There are many such examples; the bcrypt library written by Nick Campbell is one of the best.

Test this library on a 4-core machine and you will see something magical: four times the usual throughput, with essentially all resources in use! But test it on a 24-core machine and the result barely changes: four cores sit at roughly 100% utilization while the rest stay essentially idle.

The problem is that this library uses Node.js's internal thread pool, which was never designed for this kind of computation, and that pool has a hard cap: by default, at most four threads run at once.

Beyond this hard cap, there are deeper problems:

  • Flooding Node.js's internal thread pool with heavy computation starves its file and network operations, making the whole program seem slow to respond.
  • It is hard to handle the waiting queue gracefully: if the queue already holds a five-minute backlog of work, do you really want to add more to it?

In this situation, a library with its own built-in threading cannot effectively exploit multiple cores; it reduces the program's responsiveness, and performance gets worse and worse as the load increases.
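A quick way to observe this cap (a sketch, assuming the node.bcrypt.js package is installed as "bcrypt"):

const bcrypt = require('bcrypt');

// Fire off many expensive hashes "in parallel". On a 24-core machine
// roughly four cores stay busy, because the asynchronous work is
// funneled through libuv's internal thread pool (4 threads by default).
for (var i = 0; i < 24; i++) {
  bcrypt.hash('some password', 12, function (err, hash) {
    if (err) throw err;
    console.log('hash done');
  });
}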


Solution 4: Use the Node.js cluster module

Node.js 0.6.x and later versions ship a cluster module that lets you create a group of processes sharing the same socket, spreading the load among them.

What happens if you combine one of the approaches above with the cluster module?

The combined design still inherits the drawbacks of synchronous processing or of the built-in thread pool: slow responses and no grace under load.

Sometimes, simply adding a new running instance does not solve the problem.
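For reference, the basic pattern looks roughly like this (a minimal sketch using the standard cluster API; the request handler is a placeholder):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per core; all workers share the same listening socket.
  os.cpus().forEach(function () { cluster.fork(); });
} else {
  http.createServer(function (req, res) {
    // Any synchronous computation here still blocks this whole worker,
    // so one slow "batch" job still stalls any interactive requests
    // that land on the same process.
    res.end('handled by pid ' + process.pid + '\n');
  }).listen(8000);
}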
 

Solution 5: Introduce the compute-cluster module

In Persona, our solution is to maintain a set of computation processes, each cluster dedicated to a single (but different) kind of work.

In the process, we wrote the compute-cluster library.

This library automatically spawns and manages sub-processes on demand, giving your code a programmatic way to run work across a cluster of local sub-processes.

Example:
 

const computecluster = require('compute-cluster');

// allocate a compute cluster
var cc = new computecluster({ module: './worker.js' });

// run work in parallel
cc.enqueue({ input: "foo" }, function (error, result) {
  console.log("foo done", result);
});
cc.enqueue({ input: "bar" }, function (error, result) {
  console.log("bar done", result);
});

The file worker.js responds to message events and processes incoming requests:
 

process.on('message', function(m) {
  // do lots of work here, and we don't care that we're blocking the
  // main thread because this process is intended to do one thing at a time.
  var output = doComputationWorkSync(m.input);
  process.send(output);
});

Because compute-cluster exposes an asynchronous API, it can slot in behind an existing asynchronous interface without changing the calling code, so a minimal amount of new code buys real multi-core parallelism.
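For instance, the doComputationWork function from Solution 2 could be reimplemented on top of compute-cluster without touching any of its callers (a sketch; worker.js is the file shown above):

const computecluster = require('compute-cluster');
var cc = new computecluster({ module: './worker.js' });

// Same signature as before, so myRequestHandler and every other caller
// keeps working unchanged -- but the heavy computation now runs in a
// separate process instead of blocking the main thread.
function doComputationWork(input, callback) {
  cc.enqueue({ input: input }, callback);
}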

Let's look at how this solution measures up against our four requirements.

Multi-core parallelism: the sub-processes use all of the cores.

Responsiveness: the managing process does nothing but spawn sub-processes and pass messages, so it stays idle most of the time and can keep handling interactive requests.

Even when the machine is under heavy load, we can use the operating system scheduler to raise the priority of the managing process.

Simplicity: the asynchronous API hides the implementation details, so the module can be integrated into an existing project easily, often without changing the calling code at all.

One requirement remains: grace. We need a way to keep the system's efficiency from collapsing when load suddenly surges.

The goal, of course, is to keep the system running efficiently and serving as many requests as possible even under a spike in pressure.


To make that possible, compute-cluster does more than manage sub-processes and pass messages.

It tracks how many sub-processes are currently running and the average time each job takes to complete.

With these records, it can predict roughly how long a request will wait before a sub-process picks it up.

Combined with a user-supplied parameter, max_request_time, it can then fail outright the requests that would probably time out, without doing any of the work.
 

This feature lets you set limits based directly on the user experience. For example, "a user login should never take more than 10 seconds" translates roughly to a max_request_time of 7 seconds (leaving time for network transmission).
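In calling code, that might look roughly like this (a sketch; max_request_time is the compute-cluster parameter named above, assumed here to be given in seconds, and the error handling is illustrative):

const computecluster = require('compute-cluster');

// Budget from the login example: 10 seconds end to end, minus ~3 seconds
// for network transmission, leaves 7 seconds of server-side time.
var cc = new computecluster({
  module: './worker.js',
  max_request_time: 7
});

cc.enqueue({ input: "login-check" }, function (error, result) {
  if (error) {
    // The request was refused up front because the predicted wait
    // exceeded max_request_time; report the overload instead of
    // letting the user hang.
    console.error("overloaded, request rejected:", error);
    return;
  }
  console.log("done", result);
});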

After stress testing the Persona service, we were very satisfied with the results.

Under extremely high pressure, we could still serve authenticated users, while blocking some unauthenticated users and showing them an appropriate error message.
