Node.js: a clever way to hot-update web application code


Background

Anyone who has built a web application with Node.js has probably been annoyed that modified code only takes effect after the Node.js process is restarted. Developers used to PHP find this especially hard to get used to, and loudly conclude that PHP really is the best programming language in the world. Restarting the process by hand is tedious, repetitive work, and once the application grows a little, the startup time becomes hard to ignore as well.

Of course, programmers, whatever language they use, will not let this kind of thing torment them for long. The most direct and widespread solution is to watch for file changes and restart the process. There are several mature tools built on this idea, such as node-supervisor, which has since been abandoned, the currently popular PM2, and the more lightweight node-dev.

This article offers another approach: with only a small change to your code, you can hot-update code with truly zero restarts and get rid of the annoying code update problem in Node.js web development.

General Ideas

When it comes to hot code updates, the best-known example is Erlang. The language is built for highly concurrent and distributed programming, and it is mainly used in fields such as securities trading and game servers. These scenarios more or less require that a service can be operated on while it is running, and hot code update is an important part of that, so let's take a brief look at how Erlang does it.

Since I have not used Erlang myself, what follows is second-hand. For an in-depth and accurate understanding of Erlang's hot code update implementation, consult the official documentation.

Erlang's code loading is managed by a module named code_server. Apart from some code that is needed at boot time, most code is loaded by code_server.
When code_server finds that a module's code has been updated, it reloads the module. Subsequent requests are executed with the new module, while requests that are still running continue to use the old one.
After the new module is loaded, the old module is tagged old and the new module is tagged current. On the next hot update, Erlang scans for whatever is still running the old module and kills it, then continues updating the module with the same logic.
Not all code in Erlang can be hot-updated. Modules such as kernel, stdlib, and compiler are not allowed to be updated by default.
Node.js has something similar to code_server, namely the require system, so we should be able to try Erlang's approach on Node.js. From Erlang's approach we can roughly sum up the key problems to solve for hot code update in Node.js.
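The current/old bookkeeping described above can be pictured with a toy model. The sketch below is plain JavaScript, not Erlang's actual implementation; createRegistry and its methods are hypothetical names introduced only for illustration, and unlike Erlang it merely drops its reference to the oldest version instead of killing anything.

```javascript
// Toy model of a two-slot (current / old) module registry.
function createRegistry() {
  var current = null;
  var old = null;
  return {
    // Loading a new version demotes the current one to "old".
    // Whatever was in the "old" slot before is no longer referenced here.
    load: function (impl) {
      old = current;
      current = impl;
    },
    getCurrent: function () { return current; },
    getOld: function () { return old; }
  };
}

var registry = createRegistry();
registry.load(function () { return 'v1'; });

// A request that started under v1 keeps its own reference to v1.
var inFlight = registry.getCurrent();

registry.load(function () { return 'v2'; });
var duringUpdate = [inFlight(), registry.getCurrent()()];

// A third load pushes v2 into the "old" slot; the registry forgets v1.
registry.load(function () { return 'v3'; });
var oldAfterThird = registry.getOld()();

console.log(duringUpdate, oldAfterThird);
```

The in-flight call still answers with v1 while new lookups get v2, which is the behavior the Node.js approach tries to imitate.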

How to update module code
How to use a new module to process requests
How to release resources from old modules

Next, we will work through these problems one by one.

How to update module code

To solve the module code update problem, we need to read the implementation of the Node.js module manager, module.js. A quick read shows that the core of the code is Module._load. A slightly trimmed version is shown here.

// Check the cache for the requested file.
// 1. If a module already exists in the cache: return its exports object.
// 2. If the module is native: call `NativeModule.require()` with the
//    filename and return the result.
// 3. Otherwise, create a new module for the file and save it to the cache.
//    Then have it load the file contents before returning its exports
//    object.
Module._load = function (request, parent, isMain) {
  var filename = Module._resolveFilename(request, parent);

  var cachedModule = Module._cache[filename];
  if (cachedModule) {
    return cachedModule.exports;
  }

  var module = new Module(filename, parent);
  Module._cache[filename] = module;
  module.load(filename);

  return module.exports;
};

require.cache = Module._cache;

As you can see, the core is Module._cache: once a module's cache entry is cleared, the module manager loads the latest code the next time the module is required.

Let's write a small program to verify this.

main.js
function cleanCache(module) {
  var path = require.resolve(module);
  // Drop the cache entry so that the next require() reloads the file
  delete require.cache[path];
}

setInterval(function () {
  cleanCache('./code.js');
  var code = require('./code.js');
  console.log(code);
}, 5000);

code.js
module.exports = 'hello world';

Run main.js and then change the contents of code.js. The console shows that the code is successfully updated to the latest version.

So the problem of getting the module manager to update code is solved. Next, let's look at how to make the new module actually run in a web application.
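The same check can be done in one run, without editing a file by hand. This is a minimal sketch, assuming a CommonJS script and a writable temporary directory; the temporary file and the variable names are made up for the demonstration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Drop one module from the require cache.
function cleanCache(modulePath) {
  delete require.cache[require.resolve(modulePath)];
}

// Create a throwaway module in a temporary directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-reload-'));
var file = path.join(dir, 'code.js');

fs.writeFileSync(file, "module.exports = 'hello world';");
var first = require(file);

// Change the file on disk. Without clearing the cache, require()
// keeps returning the exports it loaded the first time.
fs.writeFileSync(file, "module.exports = 'hello again';");
var stale = require(file);

// After clearing the cache entry, require() reads the file again.
cleanCache(file);
var fresh = require(file);

console.log(first, '|', stale, '|', fresh);

// Remove the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```

The second require still returns the first value, and only the require after cleanCache picks up the new file contents.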

How to use a new module to process requests

To stay close to everyday usage, we use Express as the example. The same idea applies to most web applications.

First, if our service looks like the Express demo, with all the code in a single module, we cannot hot-load that module.

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  res.send('hello world');
});

app.listen(3000);

To support hot loading, as in Erlang, we need some base code that cannot itself be hot-updated to control the update process. Operations like app.listen, if run again, would be little different from restarting the Node.js process. So we need to isolate the frequently updated business code from the rarely updated base code.

app.js: base code
var express = require('express');
var app = express();
var router = require('./router.js');

app.use(router);

app.listen(3000);

router.js: business code
var express = require('express');
var router = express.Router();

// Middleware loaded here can be hot-updated as well
router.use(express.static('public'));

router.get('/', function (req, res) {
  res.send('hello world');
});

module.exports = router;

Unfortunately, although this successfully separates out the core code, router.js still cannot be hot-updated. First, there is no trigger for updates, so the service does not know when to update the module. Second, app.use keeps a reference to the old router.js module, so even if the module is updated, requests are still handled by the old module rather than the new one.

So we keep improving. We need to adjust app.js slightly: start file watching as the trigger, and use a closure to get around the reference that app.use holds.

app.js
var express = require('express');
var fs = require('fs');
var app = express();

var router = require('./router.js');

app.use(function (req, res, next) {
  // Look up the latest router object through the closure on every request,
  // so that app.use does not hold on to the old router object
  router(req, res, next);
});

app.listen(3000);

// Watch for file changes and reload the code
fs.watch(require.resolve('./router.js'), function () {
  cleanCache(require.resolve('./router.js'));
  try {
    router = require('./router.js');
  } catch (ex) {
    console.error('module update failed');
  }
});

function cleanCache(modulePath) {
  delete require.cache[modulePath];
}

Try modifying router.js and you will see that hot update is starting to take shape: new requests use the latest router.js code. Besides changing what router.js returns, you can also try changing the routes themselves, and they are updated as expected.

Of course, a complete hot update solution needs more adjustments to fit your own setup. First, for middleware, you can declare middleware that does not need hot updating, or that should not be rebuilt on every update, with app.use, and declare middleware that you want to change flexibly with router.use. Second, file watching cannot cover only the route file; it has to cover every file that needs hot updating. Besides file watching, you can also use an editor extension to send a signal to the Node.js process on save, or request a specific URL to trigger the update.
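Why the wrapper function matters can be seen without Express. In the sketch below, use and handle are hypothetical stand-ins for app.use and request dispatch, not Express APIs; the point is only the difference between storing a function and storing a closure that looks the function up on every call.

```javascript
// A stand-in for app.use(): it stores whatever function it is given.
var stack = [];
function use(handler) {
  stack.push(handler);
}

// A stand-in for request dispatch: call every stored handler.
function handle(req) {
  return stack.map(function (handler) {
    return handler(req);
  });
}

// "router" is the variable that a hot update reassigns.
var router = function routerV1() { return 'v1'; };

// Registered directly: the stack stores the routerV1 function itself.
use(router);

// Registered through a closure: "router" is read again on every call.
use(function (req) {
  return router(req);
});

var before = handle({});

// Simulated hot update: point the variable at a new function.
router = function routerV2() { return 'v2'; };

var after = handle({});

console.log(before, after);
```

After the simulated update, the directly registered handler still answers with the old version, while the closure answers with the new one.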
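When more than one file needs hot updating, one option is to clear every cached module under a directory in a single step, whatever the trigger is. The helper below is a sketch under that assumption; cleanCacheUnder is a hypothetical name, and the temporary directory exists only to make the example runnable.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Drop every cached module whose file lives under rootDir and
// return the list of file names that were dropped.
function cleanCacheUnder(rootDir) {
  // Cache keys are resolved file names, so resolve the directory too.
  var prefix = fs.realpathSync(rootDir) + path.sep;
  var removed = [];
  Object.keys(require.cache).forEach(function (filename) {
    if (filename.indexOf(prefix) === 0) {
      delete require.cache[filename];
      removed.push(filename);
    }
  });
  return removed;
}

// Demo with two throwaway modules in a temporary directory.
var appDir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-app-'));
fs.writeFileSync(path.join(appDir, 'a.js'), 'module.exports = "a";');
fs.writeFileSync(path.join(appDir, 'b.js'), 'module.exports = "b";');
require(path.join(appDir, 'a.js'));
require(path.join(appDir, 'b.js'));

var dropped = cleanCacheUnder(appDir);
console.log(dropped.length);

// Remove the temporary directory.
fs.rmSync(appDir, { recursive: true, force: true });
```

The same function could be called from a file watcher callback, a signal handler, or a route handler for a special URL. It only clears cache entries; any variable that still points at an old module, such as router in app.js, has to be reassigned separately.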

How to release resources from old modules

Explaining how the old module's resources are released really requires understanding the Node.js memory reclamation mechanism. This article does not go into detail; there are plenty of articles and books on the subject for interested readers. The short version is that when an object is no longer referenced by any other live object, it is marked as reclaimable and its memory is freed at the next GC.

So the question becomes how to make sure that nothing keeps a reference to the old module after its code has been updated. Let's reuse the code from the section "How to update module code" to see what happens when the old module's resources are not reclaimed. To make the effect more visible, we modify code.js.

code.js
var array = [];

for (var i = 0; i < 10000; i++) {
  array.push('mem_leak_when_require_cache_clean_test_item_' + i);
}

module.exports = array;

app.js
function cleanCache(module) {
  var path = require.resolve(module);
  delete require.cache[path];
}

setInterval(function () {
  var code = require('./code.js');
  cleanCache('./code.js');
}, 10);

OK, we have used a clumsy but effective way to inflate the memory footprint of the code.js module. Start app.js again and memory usage surges; before long Node.js reports that the process is out of memory. Yet looking at the app.js and code.js code, nothing in it appears to keep a reference to the old module.

With a profiling tool such as node-heapdump we can quickly locate the problem. In module.js we find that Node.js automatically adds a reference to every module.

function Module(id, parent) {
  this.id = id;
  this.exports = {};
  this.parent = parent;
  if (parent && parent.children) {
    parent.children.push(this);
  }

  this.filename = null;
  this.loaded = false;
  this.children = [];
}

Accordingly, we can adjust the cleanCache function to remove that reference when the module is updated.

app.js
function cleanCache(modulePath) {
  var module = require.cache[modulePath];
  // Nothing to clean if the module has not been loaded yet
  if (!module) {
    return;
  }
  // Remove the reference held in module.parent.children
  if (module.parent) {
    module.parent.children.splice(module.parent.children.indexOf(module), 1);
  }
  delete require.cache[modulePath];
}

setInterval(function () {
  var code = require('./code.js');
  cleanCache(require.resolve('./code.js'));
}, 10);

Run it again. This time it is much better: memory grows only slightly, which shows that the resources held by the old module are being released correctly.

With the new cleanCache function, regular usage is fine, but we cannot relax just yet. In Node.js, besides the references added by the require system, event listening through EventEmitter is also widely used, and EventEmitter is a prime suspect for creating references between modules. So can resources be released correctly when EventEmitter is involved? The answer is yes.

code.js
var EventEmitter = require('events').EventEmitter;
var moduleA = new EventEmitter();

moduleA.on('whatever', function () {
});

When the code.js module is updated and all references to it are removed, moduleA, including the event listeners registered on it, is released automatically as long as it is not referenced by another module that has not been released.

There is only one ill-formed EventEmitter usage that this approach cannot handle: code.js subscribing to an event on a global object every time it is executed. That keeps attaching listeners to the global object, and Node.js will soon warn that too many listeners have been bound and that a memory leak is suspected.

At this point we can see that, as long as we deal with the references that the require system adds for us automatically, reclaiming the old module's resources is not a big problem. We cannot get the fine-grained control Erlang has, where the next hot update scans for remaining old modules, but with sensible avoidance we can still get the old module's resources released.

In a web application there is one more reference problem: modules that are not released, or core modules, hold references to hot-updated modules, as app.use does. The old module's resources then cannot be released, and new requests are not handled by the new module. The solution is to control the exposed entry points, such as global variables or other references, and update those entry points manually when a hot update runs. The router wrapper in "How to use a new module to process requests" is an example. By controlling this single entry point, whatever other modules router.js references are released together with it.

Another source of release problems is operations like setInterval, which keep an object alive so that it cannot be released. We rarely use such techniques in web applications, so we do not address them in this approach.
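The pile-up on a global object is easy to reproduce. In this sketch, globalBus stands in for the long-lived global object and loadCodeModule for the body of code.js; both names, and the dispose convention used to clean up, are assumptions made for illustration rather than part of the approach in this article.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for a global object that survives hot updates.
var globalBus = new EventEmitter();

// Stand-in for the body of code.js: every (re)load subscribes to the
// global object. It returns a dispose function that unsubscribes.
function loadCodeModule() {
  function onWhatever() {}
  globalBus.on('whatever', onWhatever);
  return {
    dispose: function () {
      globalBus.removeListener('whatever', onWhatever);
    }
  };
}

// Three reloads without cleanup: the listeners pile up.
loadCodeModule();
loadCodeModule();
loadCodeModule();
var leakedListeners = globalBus.listenerCount('whatever');

// Start over with a clean emitter state.
globalBus.removeAllListeners('whatever');

// Three reloads that dispose the previous instance first.
var instance = null;
for (var i = 0; i < 3; i++) {
  if (instance) {
    instance.dispose();
  }
  instance = loadCodeModule();
}
var cleanListeners = globalBus.listenerCount('whatever');

console.log(leakedListeners, cleanListeners); // prints "3 1"
```

Each listener closes over the module scope that registered it, so every listener left on the global object also keeps one old copy of the module alive.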
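If a hot-updated module does need a timer, one possible convention is to have the module expose a cleanup function and call it before swapping in the new version. The sketch below is an illustration of that idea only; loadTickerModule, hotSwap and dispose are hypothetical names, not Node.js features.

```javascript
// Stand-in for a hot-updatable module that starts a timer when loaded.
function loadTickerModule(version) {
  var running = true;
  var timer = setInterval(function () {}, 1000);
  return {
    version: version,
    isRunning: function () { return running; },
    // Stop the timer so that nothing keeps this instance alive.
    dispose: function () {
      clearInterval(timer);
      running = false;
    }
  };
}

var current = null;

// Replace the current instance, stopping the old one's timer first.
function hotSwap(next) {
  if (current && typeof current.dispose === 'function') {
    current.dispose();
  }
  current = next;
}

hotSwap(loadTickerModule(1));
var oldInstance = current;
hotSwap(loadTickerModule(2));

var oldRunning = oldInstance.isRunning();
var newRunning = current.isRunning();
console.log(oldRunning, newRunning); // prints "false true"

// Stop the last timer so that this demo can exit.
current.dispose();
```

Without the dispose call, the interval callback would keep the old instance reachable for as long as the timer runs.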

End

At this point we have solved the three major problems of hot code update for Node.js web applications. However, because Node.js lacks a mechanism for scanning for objects that are still being retained, we cannot fully rule out old-module resources failing to be released because of things like setInterval. Because of this limitation, the hot update feature we provide in the YOG2 framework is mainly intended for development and debugging, where it speeds up development. Code updates in production still rely on a restart or on PM2's hot reload feature to keep the online service stable.

Because hot update is closely tied to the framework and the business architecture, this article does not give a general-purpose solution. For reference, here is a brief description of how we use the technique in the YOG2 framework. Since YOG2 already splits front-end and back-end code into apps, our strategy is to update code at app granularity. Also, because operations like fs.watch have compatibility problems, and alternatives such as fs.watchFile consume more resources, we tie into YOG2's test-machine deployment and notify the framework to update an app's code when new code is uploaded. When the module cache is updated at app granularity, the route cache and the template cache are updated as well, which completes the code update.

If you are using a framework such as Express or Koa, you can apply this technique by following the method in this article and adapting your main route to your own business needs.
