A clever way to hot-update Web application code in Node.js

This article describes how to hot-update Node.js code. The core of the approach is how module objects are handled: manually watch for file changes, clear the module cache, and then re-mount the module. The reasoning is clear and careful, and although the code is slightly redundant, we recommend it.

Background

Anyone who has developed Web applications with Node.js has probably been bothered by the fact that new code only takes effect after a restart. Developers used to PHP find this even harder to adjust to; as the joke goes, PHP really is still the best programming language in the world. Restarting by hand is not just an annoying, repetitive chore: once the application grows a little, the startup time can no longer be ignored.

Of course, no programmer, whatever the language, will put up with this for long. The most direct and general solution is to watch for file changes and restart the process. Many mature tools already do this, such as node-supervisor (no longer maintained), the popular PM2, and the lightweight node-dev.

This article offers another approach: with only minor changes you can get true zero-restart hot code updates, which removes the annoying update problem when developing Web applications with Node.js.

General idea

When it comes to hot code updates, the best-known implementation is Erlang's. Erlang is designed for highly concurrent, distributed programming, and its main application areas include securities trading and game servers. These scenarios more or less require that a service can be operated and maintained while it is running, and hot code updating is an important part of that. So let us first take a brief look at how Erlang does it.

Since I have never used Erlang myself, what follows is only a rough outline. For an in-depth and accurate understanding of how Erlang implements hot code updates, it is best to consult the official documentation.

Erlang code loading is managed by a module named code_server. Apart from some code needed at startup, most code is loaded by code_server.
When code_server detects that a module's code has been updated, it reloads the module. New requests are then executed with the new module, while requests that are already in progress continue to run with the old module.
After the new module is loaded, the old module is tagged old and the new module is tagged current. At the next hot update, Erlang scans for processes still executing the old module and kills them, and then updates the module following the same logic.
Not all code in Erlang may be hot-updated. Basic modules such as kernel, stdlib, and compiler cannot be updated by default.

Node.js has a module that plays a role similar to code_server: the require system. So it should be possible to try Erlang's approach in Node.js. From Erlang's approach we can summarize the key problems that hot code updating in Node.js must solve:

How to update module code
How to use the new module to process requests
How to release resources of old modules

Then we will resolve these problems one by one.

How to update module code

To solve the module update problem, we need to read the implementation of the Node.js module manager (the source file is module.js). A quick read shows that the core code is Module._load; a slightly simplified version is pasted below.

// Check the cache for the requested file.
// 1. If a module already exists in the cache: return its exports object.
// 2. If the module is native: call `NativeModule.require()` with the
//    filename and return the result.
// 3. Otherwise, create a new module for the file and save it to the cache.
//    Then have it load the file contents before returning its exports
//    object.
Module._load = function(request, parent, isMain) {
  var filename = Module._resolveFilename(request, parent);

  var cachedModule = Module._cache[filename];
  if (cachedModule) {
    return cachedModule.exports;
  }

  var module = new Module(filename, parent);
  Module._cache[filename] = module;
  module.load(filename);

  return module.exports;
};

require.cache = Module._cache;

We can see that the core is Module._cache: as long as the cache entry for a module is cleared, the module manager will load the latest code the next time the module is required.

Write a small program to verify this:

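// main.js
function cleanCache(module) {
  var path = require.resolve(module);
  // Delete the cache entry so that the next require() reloads the file.
  // (Assigning null instead makes require() throw on current Node.js versions.)
  delete require.cache[path];
}

setInterval(function () {
  cleanCache('./code.js');
  var code = require('./code.js');
  console.log(code);
}, 5000);

// code.js
module.exports = 'hello world';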

Run main.js, then modify code.js; the console output changes to reflect the latest code.

With the code update problem solved at the module manager level, let us look at how to make the new module actually run in a Web application.
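The same check can be scripted without editing a file by hand. The following is a minimal sketch, assuming a throwaway module written to a temporary directory; requireFresh is a hypothetical helper name, not part of Node.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: require a module while bypassing any cached copy.
function requireFresh(modulePath) {
  const resolved = require.resolve(modulePath);
  delete require.cache[resolved];
  return require(resolved);
}

// Demo using a throwaway module in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-'));
const codePath = path.join(dir, 'code.js');

fs.writeFileSync(codePath, "module.exports = 'hello world';");
const first = requireFresh(codePath);

fs.writeFileSync(codePath, "module.exports = 'hello again';");
const cached = require(codePath);       // still served from the cache
const second = requireFresh(codePath);  // re-read from disk

console.log(first, '|', cached, '|', second);
```

A plain require() after the file changes still returns the cached value; only after the cache entry is deleted does the new content show up.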

How to use the new module to process requests

To match common usage, we use Express as the example. Similar ideas apply to most Web frameworks.

First of all, if our service looks like the Express demo, with all code in a single module, that module cannot be hot-loaded.

var express = require('express');
var app = express();

app.get('/', function(req, res){
  res.send('hello world');
});

app.listen(3000);

To make hot loading possible, we need some base code that is not hot-updated and that controls the update process, just as Erlang has base libraries that may not be updated. Besides, re-executing operations such as app.listen would be little different from restarting the Node.js process. So we need some clever code that isolates the frequently updated business code from the rarely updated base code.

// app.js: base code
var express = require('express');
var app = express();
var router = require('./router.js');

app.use(router);

app.listen(3000);

// router.js: business code
var express = require('express');
var router = express.Router();

// Middleware loaded here can be hot-updated too
router.use(express.static('public'));

router.get('/', function (req, res) {
  res.send('hello world');
});

module.exports = router;

Unfortunately, although this successfully separates out the core code, router.js still cannot be hot-updated. First, there is no trigger mechanism, so the service does not know when to update the module. Second, app.use keeps a reference to the old router.js module, so even after the module is updated, requests are still handled by the old module rather than the new one.
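The second point can be illustrated without Express at all. In the sketch below, use is a made-up stand-in for app.use that simply stores the function it is given; it is not Express code, only a model of why a stored reference goes stale while a wrapper that looks the variable up on each call does not.

```javascript
// Stand-in for app.use: it stores whatever function object it is given.
const stack = [];
function use(handler) {
  stack.push(handler);
}

let router = function () { return 'old router'; };

use(router);                            // stores the old function object itself
use(function () { return router(); });  // looks up `router` on every call

// Simulate a hot update that rebinds the variable.
router = function () { return 'new router'; };

const direct = stack[0]();   // still the function registered earlier
const wrapped = stack[1]();  // sees the rebound variable
console.log(direct, '|', wrapped);
```

Rebinding the variable has no effect on the function object that was already handed over, which is why the registration itself has to go through a wrapper.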

We need to adjust app.js further: start a file watcher as the trigger mechanism, and use a closure to work around the reference held by app.use.

// app.js
var express = require('express');
var fs = require('fs');
var app = express();

var router = require('./router.js');

app.use(function (req, res, next) {
  // The closure looks up `router` on every request, so app.use never
  // holds on to a stale router object
  router(req, res, next);
});

app.listen(3000);

// Watch the file for changes and reload the code
fs.watch(require.resolve('./router.js'), function () {
  cleanCache(require.resolve('./router.js'));
  try {
    router = require('./router.js');
  } catch (ex) {
    // The assignment did not happen, so the old router keeps serving
    console.error('module update failed');
  }
});

function cleanCache(modulePath) {
  // Delete the entry rather than assigning null, which breaks require()
  // on current Node.js versions
  delete require.cache[modulePath];
}

Now try modifying router.js: the hot update has taken shape, and new requests are served by the latest router.js code. Besides changing what router.js returns, you can also change the routes themselves, and they update as expected.
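The closure and the try/catch can be packaged into a small reusable helper. This is only a sketch under stated assumptions: createHotHandler is a hypothetical name, and the module is assumed to export a single function (as an Express router does). It also shows that a broken update leaves the previous version in service.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: returns a handler that always delegates to the most
// recently loaded version of the module, plus a reload() function.
function createHotHandler(modulePath) {
  const resolved = require.resolve(modulePath);
  let current = require(resolved);

  function handler(...args) {
    return current(...args); // `current` is looked up on every call
  }

  function reload() {
    delete require.cache[resolved];
    try {
      current = require(resolved);
      return true;
    } catch (err) {
      // Keep serving the previous version if the new code fails to load.
      console.error('module update failed:', err.message);
      return false;
    }
  }

  return { handler, reload };
}

// Demo with a throwaway module in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-'));
const file = path.join(dir, 'router.js');

fs.writeFileSync(file, "module.exports = () => 'v1';");
const hot = createHotHandler(file);
const before = hot.handler();

fs.writeFileSync(file, "module.exports = () => 'v2';");
const reloaded = hot.reload();
const after = hot.handler();

fs.writeFileSync(file, "module.exports = (;"); // deliberately broken code
const reloadedBroken = hot.reload();
const afterBroken = hot.handler();

console.log(before, after, reloadedBroken, afterBroken);
```

In an Express application, one would presumably pass hot.handler to app.use and call hot.reload from the fs.watch callback.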

Of course, a complete hot update solution needs further refinement to fit your own setup. First, regarding middleware: declare with app.use the middleware that does not need hot updating or should not be re-executed on every update, and declare with router.use the middleware you want to be able to change flexibly. Second, the file watcher should not watch only the route file but every file that needs hot updating. Besides file watching, you can also have an editor extension send a signal to the Node.js process on save, or visit a specific URL, to trigger the update.
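When more than one file needs hot updating, one option is to clear every cache entry under a directory, whatever the trigger is (file watcher, signal, or URL). The sketch below assumes the hot-updatable code lives in its own directory; clearCacheUnder is a hypothetical helper, and the file names are made up for the demo. It only removes cache entries.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: drop every cached module whose file lives under `dir`.
// Returns the list of cache keys that were removed.
function clearCacheUnder(dir) {
  const prefix = fs.realpathSync(dir) + path.sep;
  const removed = [];
  for (const key of Object.keys(require.cache)) {
    if (key.startsWith(prefix)) {
      delete require.cache[key];
      removed.push(key);
    }
  }
  return removed;
}

// Demo: app/ holds hot-updatable code, base.js does not.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-'));
const appDir = path.join(root, 'app');
fs.mkdirSync(appDir);
fs.writeFileSync(path.join(appDir, 'a.js'), "module.exports = 'a1';");
fs.writeFileSync(path.join(appDir, 'b.js'),
  "module.exports = require('./a.js') + 'b1';");
fs.writeFileSync(path.join(root, 'base.js'), "module.exports = 'base';");

require(path.join(appDir, 'b.js'));
require(path.join(root, 'base.js'));

// Change a dependency, then clear everything under app/ and reload.
fs.writeFileSync(path.join(appDir, 'a.js'), "module.exports = 'a2';");
const removed = clearCacheUnder(appDir);
const value = require(path.join(appDir, 'b.js'));
const baseStillCached =
  require.resolve(path.join(root, 'base.js')) in require.cache;

console.log(removed.length, value, baseStillCached);
```

Clearing by directory means a change in a dependency such as a.js is picked up even when only b.js is required again, while modules outside the directory stay cached.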

How to release resources of old modules

To understand how the resources of old modules are released, we first need to understand the memory reclamation mechanism of Node.js. This article does not describe it in detail; there are plenty of articles and books on the subject, and interested readers can explore them on their own. In brief: when an object is no longer referenced by any other object, it is marked as reclaimable, and its memory is freed at the next GC run.

So the question becomes: after the old module's code is updated, how do we make sure that nothing still holds a reference to the old module? Let us take the code from the section "How to update module code" as an example and see what happens if the old module's resources are not reclaimed. To make the result more visible, we modify code.js.

// code.js
var array = [];

for (var i = 0; i < 10000; i++) {
  array.push('mem_leak_when_require_cache_clean_test_item_' + i);
}

module.exports = array;

// main.js
function cleanCache(module) {
  var path = require.resolve(module);
  // Only the cache entry is removed here; other references remain
  delete require.cache[path];
}

setInterval(function () {
  var code = require('./code.js');
  cleanCache('./code.js');
}, 10);

Good. We have used a clumsy but effective way to inflate the memory footprint of the code.js module. Start main.js again and you will see memory usage climb sharply; before long Node.js reports "process out of memory". Yet if we look through main.js and code.js, we find nothing that holds a reference to the old module.

With a profiling tool such as node-heapdump we can quickly locate the problem. In module.js we find that Node.js automatically adds a reference for every module:

function Module(id, parent) {
  this.id = id;
  this.exports = {};
  this.parent = parent;
  if (parent && parent.children) {
    parent.children.push(this);
  }

  this.filename = null;
  this.loaded = false;
  this.children = [];
}

Therefore, we can adjust the cleanCache function to remove this reference when the module is updated.

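// main.js
function cleanCache(modulePath) {
  var module = require.cache[modulePath];
  if (!module) {
    return; // nothing cached for this path
  }
  // Remove the reference held in module.parent.children
  if (module.parent) {
    var index = module.parent.children.indexOf(module);
    // Guard against -1, which would make splice remove the last child
    if (index !== -1) {
      module.parent.children.splice(index, 1);
    }
  }
  delete require.cache[modulePath];
}

setInterval(function () {
  var code = require('./code.js');
  cleanCache(require.resolve('./code.js'));
}, 10);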

Execute again. This time, the memory will only grow slightly, indicating that the resources occupied by the old modules have been correctly released.

With the new cleanCache function, ordinary usage is fine, but we cannot rest easy yet. In Node.js, besides the references added by the require system, listening for events through EventEmitter is also very common, and EventEmitter is a prime suspect for modules referencing each other. So are resources released correctly when EventEmitter is involved? The answer is yes.

// code.js
var EventEmitter = require('events').EventEmitter;
// EventEmitter must be instantiated with `new`
var moduleA = new EventEmitter();
moduleA.on('whatever', function () {
});

After the code.js module is updated and all references to it are removed, moduleA is released automatically, together with its internal event listeners, as long as it is not referenced by some other module that has not been released.

There is only one pathological use of EventEmitter that this scheme cannot handle: code.js registers a listener on an event of a global object every time it is executed. Listeners then keep piling up on the global object, and Node.js soon warns that too many listeners have been detected and that a memory leak is suspected.
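The pile-up is easy to reproduce by counting listeners. The sketch below is an illustration only: global.bus is a made-up global emitter, and the dispose export is a hypothetical convention (not something the article or Node.js prescribes) showing one possible precaution, namely letting a module undo its own side effects before it is reloaded.

```javascript
const EventEmitter = require('events');
const fs = require('fs');
const os = require('os');
const path = require('path');

// A long-lived emitter shared by the whole process (made-up name).
global.bus = new EventEmitter();

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-'));
const file = path.join(dir, 'code.js');
fs.writeFileSync(file, [
  'function onWhatever() {}',
  "global.bus.on('whatever', onWhatever);",
  '// Hypothetical convention: export a dispose() that undoes side effects.',
  'exports.dispose = function () {',
  "  global.bus.removeListener('whatever', onWhatever);",
  '};'
].join('\n'));
const resolved = require.resolve(file);

function reload(callDispose) {
  const old = require.cache[resolved];
  if (callDispose && old && typeof old.exports.dispose === 'function') {
    old.exports.dispose();
  }
  delete require.cache[resolved];
  return require(resolved);
}

// Reload five times without cleanup: one listener per load remains.
require(resolved);
for (let i = 0; i < 5; i++) reload(false);
const withoutDispose = global.bus.listenerCount('whatever');

// Start over, this time calling dispose() before each reload.
global.bus.removeAllListeners('whatever');
delete require.cache[resolved];
require(resolved);
for (let i = 0; i < 5; i++) reload(true);
const withDispose = global.bus.listenerCount('whatever');

console.log(withoutDispose, withDispose);
```

Without cleanup the listener count grows with every reload; with the dispose call only the listener of the current version remains.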

At this point we can see that, once the references that the require system in Node.js adds automatically are dealt with, reclaiming the resources of old modules is not a big problem. We cannot scan for leftover old modules at the next hot update as Erlang does, so we lack that kind of fine-grained control, but reasonable precautions can still solve the problem of releasing old module resources.

In Web applications there is one more reference problem: a module that has not been released, or a core module, holds a reference to a module that needs hot updating, as with app.use. The old module's resources then cannot be released, and new requests are not handled by the new module. The solution is to control the entry points through which global variables or references are exposed, and to update those entry points manually when a hot update runs. The router wrapper in the section "How to use the new module to process requests" is an example: because that entry point is under our control, whatever other modules router.js references are released along with it.

Another source of resource release problems is operations such as setInterval, which keep objects alive so that they cannot be reclaimed. We rarely use this kind of technique in Web applications, however, so the scheme does not address it.

Conclusion

So far we have solved the problems of hot-updating Node.js code in Web applications. However, because Node.js itself lacks an effective mechanism for scanning retained objects, we cannot fully rule out old module resources that stay unreleased because of things like setInterval. Because of these limitations, in the YOG2 framework that we provide, this technique is used mainly during development and debugging, where hot updates speed up development. In production, code updates still rely on restarts or PM2's hot reload feature to keep online services stable.

Hot updating is closely tied to the framework and the business architecture, so this article does not offer a general-purpose solution. For reference, here is a brief description of how we use the technique in YOG2. Because YOG2 itself supports splitting front-end and back-end subsystems into Apps, our update strategy is to update code at App granularity. In addition, fs.watch may have compatibility problems, and alternatives such as fs.watchFile are relatively expensive. We therefore rely on YOG2's test-machine deployment feature: when new code is uploaded and deployed, the framework is notified to update the App's code. While updating the module cache at App granularity, it also updates the route cache and the template cache, which completes the update of all code.

If you use a framework such as Express or Koa, you only need to adapt the main router as described in this article to apply this technique.
