The original is from: http://www.infoq.com/cn/articles/new-idea-of-nodejs-asynchronous-processing-tasks?utm_source=infoq&utm_medium=popular_links_homepage
Node.js excels at data-intensive real-time interaction scenarios. However, data-intensive real-time applications are not purely I/O-intensive. They also run into CPU-intensive tasks, such as encrypting data (node.bcrypt.js), compressing and decompressing data (node-tar), or doing personalized image processing based on the user's identity. In these scenarios the main thread is busy with complex CPU computation, and the tasks in the I/O request queue are blocked.
The event loop of the Node.js main thread executes all tasks/events in the order of the event queue, so other callbacks, listeners, timeouts, and nextTick() functions cannot run until the current task/event has completed. Because a blocked event loop has no chance to handle them, the best case is that the program slows down, and the worst case is that it stalls as if it were dead.
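The blocking described above can be observed with a few lines of plain JavaScript. This is a minimal sketch, not code from the original article: `busyWait` and `order` are hypothetical names, and the 50 ms spin merely stands in for a real CPU-intensive task.

```javascript
// Records what has run, in order.
const order = [];

// Scheduled to run "as soon as possible", but it needs a free event loop.
setTimeout(() => order.push('timer callback'), 0);

// Hypothetical stand-in for a CPU-intensive task: spin on the main thread.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

busyWait(50);
order.push('busy loop finished');

// Still inside synchronous code: the timer has been due for about 50 ms,
// yet its callback has not run because the event loop never got control.
console.log(order);

// Once the synchronous code returns, the event loop runs the pending timers.
setTimeout(() => console.log(order), 0);
```

The first log shows only the busy-loop entry. The timer callback appears in the second log, after the synchronous code has returned control to the event loop.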
One feasible solution is to start a new process and hand the CPU-intensive task to the child process over IPC. When the child process finishes, it notifies the main process with an IPC message and returns the result.
Compared with creating a thread, starting a new process consumes more system resources, and communication between processes is not very efficient. Suppose that, instead of starting a new process, we start a new thread and hand the CPU time-consuming task to that worker thread. The main thread returns immediately and goes on handling other I/O requests. When the worker thread has finished the computation, it notifies the main thread and returns the result. Node's main thread can then stay highly responsive while serving both I/O-intensive and CPU-intensive workloads.
Therefore, a better solution than starting a new process is the following:
- Do not start a new process. Hand the CPU time-consuming operation to a worker thread inside the same process.
- The logic of a CPU time-consuming operation can be implemented in either C++ or JS.
- JS uses this mechanism in much the same way as it uses an I/O library: conveniently and efficiently.
- A standalone V8 VM runs in the new thread and executes concurrently with the VM of the main thread, and this thread must be managed by us.
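For readers who want to try the idea without a patched Node build, the sketch below uses the `worker_threads` module that ships with current Node.js releases. It is not the mechanism this article implements, only an illustration of the same goals: a second V8 instance in the same process, with the main thread returning immediately. `runInWorker` and `workerSource` are hypothetical names chosen for this example.

```javascript
const { Worker } = require('worker_threads');

// Runs a snippet of JS source in a worker thread and resolves with the
// first message the worker posts back.
function runInWorker(source, input) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData: input });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// CPU-bound work, executed inside the worker's own V8 instance.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum);
`;

// The call returns at once; the result arrives later through the promise.
const pending = runInWorker(workerSource, { n: 1000000 });
pending.then((sum) => console.log('sum from worker:', sum));
console.log('main thread is free while the worker computes');
```

Unlike the design goals listed above, this sketch creates one worker per call instead of keeping a single long-lived thread.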
To achieve the four goals above, we added a background thread (backgroundthread) to Node, which is explained in detail later in the article. On the implementation side, a built-in C++ module named pt_c is added to Node. This module wraps a CPU time-consuming operation in a task, hands it to the background thread, and returns immediately. The actual logic runs on the other thread. When it completes, the result is set and the main thread is notified. The process is very similar to an asynchronous I/O request.
Node already provides a mechanism for handing CPU time-consuming operations to other threads and then, once execution is complete, setting the result and notifying the main thread to run the callback function. Here is a piece of code that demonstrates this process:
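
    /* loop, FIB_UNTIL, fib and after_fib are defined elsewhere in the example. */
    int main() {
        loop = uv_default_loop();

        int data[FIB_UNTIL];
        uv_work_t req[FIB_UNTIL];
        int i;
        for (i = 0; i < FIB_UNTIL; i++) {
            data[i] = i;
            req[i].data = (void *) &data[i];
            /* fib runs on another thread; after_fib runs back on the loop thread. */
            uv_queue_work(loop, &req[i], fib, after_fib);
        }

        return uv_run(loop, UV_RUN_DEFAULT);
    }
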
The function uv_queue_work is declared as follows:

    UV_EXTERN int uv_queue_work(uv_loop_t* loop,
                                uv_work_t* req,
                                uv_work_cb work_cb,
                                uv_after_work_cb after_work_cb);

The parameter work_cb is a function pointer that is executed on a different thread, while after_work_cb is the callback that is executed on the main thread. On the Windows platform, uv_queue_work eventually calls the API function QueueUserWorkItem to dispatch the task. The thread that finally executes the task is managed by the operating system and may be a different one each time. This does not satisfy the fourth goal above.
Because we want to support running JS code in the thread, the thread has to host a V8 VM. We therefore need to pin the thread down: certain tasks are handed only to this thread, and once it has been created it does not exit even when it has no task to run. This requires us to maintain a thread object ourselves and to provide an interface that lets the user easily create a task object and submit it to that thread's task queue.
A thread object, the background thread, is created when the built-in module pt_c is bound. This thread runs a task loop: when there are tasks it processes them, and when there are none it waits on a semaphore. Multithreading raises the problem of synchronization between threads. Here, synchronization is needed only when this thread's incoming queue is read or written. Node's main thread creates a task, appends it to the thread's incoming queue, signals the semaphore, and returns immediately. In its next loop iteration, the background thread moves all tasks from the incoming queue into the working queue and then executes the tasks in the working queue one by one. The main thread never accesses the working queue, so that queue needs no lock. This reduces contention.
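The handoff between the two queues can be modeled in a few lines. The sketch below is a single-threaded JavaScript model, not the C++ implementation from the article: `BackgroundQueue`, `post`, and `runOnce` are hypothetical names, and the comments mark where a real multithreaded version would take the lock and signal the semaphore.

```javascript
// Single-threaded model of the incoming/working queue handoff.
class BackgroundQueue {
  constructor() {
    this.incoming = []; // shared: filled by the producer, emptied by the consumer
    this.working = [];  // touched only by the consumer, so it needs no lock
  }

  // Producer side (the main thread in the article): enqueue and return at once.
  post(task) {
    // A multithreaded version would hold the lock around this push
    // and then signal the semaphore to wake the background thread.
    this.incoming.push(task);
  }

  // Consumer side (the background thread): one iteration of the task loop.
  runOnce() {
    // Moving the tasks over is the only step that would need the lock.
    this.working = this.incoming;
    this.incoming = [];

    const results = [];
    for (const task of this.working) {
      results.push(task()); // executed without holding any lock
    }
    this.working = [];
    return results;
  }
}

const queue = new BackgroundQueue();
queue.post(() => 1 + 1);
queue.post(() => {
  queue.post(() => 'posted during round 1'); // picked up in the next round
  return 'second';
});

const round1 = queue.runOnce();
const round2 = queue.runOnce();
console.log(round1, round2);
```

A task posted while a round is executing lands in the incoming queue and is picked up in the following round, which is why the consumer can iterate over the working queue without any locking.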
Before entering the task loop, this thread creates a separate V8 VM that is dedicated to executing backgroundjs code. The V8 engine of the main thread and the one in this thread can run in parallel. The thread's life cycle is the same as that of the Node process.
Initialization code for the pt_c module:

    void Init(Handle<Object> target,
              Handle<Value> unused,
              Handle<Context> context,
              void* priv) {
      // Create the working thread, which focuses on CPU-intensive tasks
      if (!CWorkingThread::GetInstance().Start()) {
        return;
      }
      Environment* env = Environment::GetCurrent(context);
      // Load a DLL that includes all the CPU-intensive functions
      NODE_SET_METHOD(target, "registermodule", RegisterModule);
      NODE_SET_METHOD(target, "posttask", PostTask);
      // Post a task that runs a CPU-intensive function defined in backgroundjs
      NODE_SET_METHOD(target, "jstask", JsTask);
    }

All CPU time-consuming logic can be placed in backgroundjs. The main thread creates a task that specifies the function to run and its arguments, and hands the task to the worker thread. While executing the task, the worker thread calls the function in backgroundjs. backgroundjs is simply a .js file to which the CPU time-consuming functions are added.
background.js code example:

    var globalfunction = function (v) {
      var obj;
      try {
        obj = JSON.parse(v);
      } catch (e) {
        return e;
      }
      var a = obj.param1;
      var b = obj.param2;
      var i;
      // simulate CPU intensive process ...
      for (i = 0; i < 95550000; ++i) {
        i += 100;
        i -= 100;
      }
      return (a + b).toString();
    }

Run node and enter the following in the console:

    var bind = process.binding('pt_c');
    var obj = {param1: 123, param2: 456};
    bind.jstask('globalfunction', JSON.stringify(obj), function (err, data) {
      if (err) {
        console.log("err");
      } else {
        console.log(data);
      }
    });

The method called here is bind.jstask. Its usage is explained later.
Here are the test results:
The experiment above was carried out as follows:
- First, bind the built-in module pt_c. Binding invokes the module's initialization function, in which a new thread is created.
- The CPU time-consuming function in backgroundjs is called several times in a row: three consecutive calls in the experiment above.
When a function in backgroundjs completes, the main thread is notified, and in a new round of the event loop the callback function is called and prints the result. This experiment demonstrates the asynchronous execution of CPU time-consuming operations.
The method jstask takes three parameters. The first two are strings: the name of a global function in background.js and the argument passed to that function. The last parameter is a callback function, which the main thread runs asynchronously.
Why are strings used for the arguments?
To accommodate many different parameter types, the C++ side would have to provide a separate function implementation for each of them, which is very restrictive. Instead, C++ looks up the function in backgroundjs by name and passes the argument through to JS. Handling JSON strings is very easy in JS, so using strings keeps the C++ logic simple while JS can easily generate and parse the parameters. For the same reason, the return value of a function in backgroundjs is also a JSON string.
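The benefit of the string convention can be seen in a small dispatcher. This is a sketch in plain JavaScript under stated assumptions, not the C++ code of pt_c: `backgroundFunctions`, `add`, and `dispatch` are hypothetical names, and the error format is invented for the example.

```javascript
// Hypothetical registry standing in for the global functions of background.js.
const backgroundFunctions = {
  add(json) {
    const { param1, param2 } = JSON.parse(json);
    return JSON.stringify({ sum: param1 + param2 });
  },
};

// One generic entry point: function name in, JSON string in, JSON string out.
// The dispatcher never needs to know the shape of the arguments.
function dispatch(functionName, jsonArgs) {
  const known = Object.prototype.hasOwnProperty.call(backgroundFunctions, functionName);
  if (!known) {
    return JSON.stringify({ error: 'unknown function: ' + functionName });
  }
  try {
    return backgroundFunctions[functionName](jsonArgs);
  } catch (e) {
    // Malformed JSON or a failing function is reported as a JSON string too.
    return JSON.stringify({ error: e.message });
  }
}

const reply = dispatch('add', JSON.stringify({ param1: 123, param2: 456 }));
console.log(JSON.parse(reply).sum);
```

Because every function shares the same string-in, string-out signature, adding a new function to the registry requires no change to the dispatcher.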
Support for C++
For scenarios with demanding performance requirements, pt_c allows a .dll file that contains the CPU time-consuming operations to be loaded into the Node process. After loading pt_c, JS specifies the file name to complete the load.
Code example:

    var bind = process.binding('pt_c');
    bind.registermodule('Node_pt_c.dll', 'dllinit', 'Json to Init');
    bind.posttask('Func_example', 'Json_param', function (err, data) {
      if (err) {
        console.log("err");
      } else {
        console.log(data);
      }
    });

Loading a C++ module takes one more step than using backgroundjs: the call to bind.registermodule. This function loads the DLL and initializes it. Once a module has been loaded successfully, no further modules can be loaded, so all CPU time-consuming functions should be implemented in this one DLL file.
Summary
This article presents the new concept of backgroundjs, which extends the capabilities of Node.js and addresses Node's weakness in handling CPU-intensive tasks. With this solution, developers using Node only need to focus on the functions in backgroundjs. It is more efficient, more general, and more consistent than starting multiple processes or writing a new add-on module. Our code is open source and can be downloaded from https://github.com/classfellow/node/tree/Ansy-CPU-intensive-work--in-one-process.
A stable Node build that supports backgroundjs can be downloaded from http://www.witch91.com/nodejs.rar.