Why use Node.js

This is a set of learning notes by a mobile engineer dabbling in front-end and back-end development; there may be mistakes or gaps in understanding.

What is Node.js

Traditionally, JavaScript runs in the browser, because a browser's core is actually split into two parts: the rendering engine and the JavaScript engine. The former renders HTML + CSS, while the latter runs JavaScript. The JavaScript engine used by Chrome is V8, and it is very fast.

Node.js is a runtime that runs on the server and uses the V8 engine underneath. We know that Apache + PHP and Java Servlets can be used to develop dynamic web pages; Node.js plays a similar role, except that you develop in JavaScript.

With the definition out of the way, here is a simple example. Create a new app.js file and enter the following:

    var http = require('http');

    http.createServer(function (request, response) {
        response.writeHead(200, { 'Content-Type': 'text/plain' }); // HTTP response header
        response.end('Hello World\n'); // return the data "Hello World"
    }).listen(8888); // listen on port 8888

    // print the following message to the terminal
    console.log('Server running at http://127.0.0.1:8888/');

With that, a simple HTTP server is done. Run node app.js, then visit http://127.0.0.1:8888/ and you will see the output.

Why use Node.js

When facing a new technology, it is always worth asking a few whys. Since PHP, Python, and Java can all be used for back-end development, why bother learning Node.js? At the very least we should know in which scenarios Node.js is the better choice.

In general, Node.js is suitable for the following scenarios:

    1. Real-time applications, such as online multiplayer collaboration tools and web chat applications.
    2. High-concurrency applications that are mainly I/O bound, such as providing APIs to clients and reading from databases.
    3. Streaming applications, for example where clients frequently upload files.
    4. Separation of the front end and the back end.

In fact, the first two can be summed up as one: the clients rely heavily on long-lived connections, so the number of concurrent connections is high, but most of them are idle.

Node.js also has its limitations. It is not suitable for CPU-intensive tasks such as AI computation or video and image processing.

Of course, the conclusions above should not be taken on faith or memorized by rote; we need some understanding of how Node.js works in order to make the right judgment.

Basic concepts

Before diving into Node.js, getting a few basic concepts straight will help you understand it more deeply.

Concurrency

Unlike client developers, server developers care a great deal about one number: concurrency, that is, the maximum number of clients a single server can support. The C10K problem of the early years was about how to support 10,000 concurrent connections on one server. Of course, as hardware and software performance has improved, C10K is no longer a problem, and we have moved on to the C10M problem: how a single server can handle millions of concurrent connections.

In the C10K era the dominant server was Apache, which works by spawning a child process for each incoming network request and running the PHP script in that child process; once the script finishes, the result is sent back to the client.

This ensures that processes do not interfere with one another: even if one process runs into trouble, the rest of the server is unaffected. The drawback is equally obvious: a process is a relatively heavy concept with its own heap and stack, so it occupies a lot of memory, and a server can run at most a few thousand processes.

Apache later adopted FastCGI, but that is essentially a process pool: it reduces the cost of creating processes without effectively raising the number of concurrent connections.

Java Servlets use a thread pool, with each Servlet running on a thread. Threads are lighter than processes, but only relatively so: tests show each thread's private stack is about 1 MB, which is still not efficient enough. In addition, multithreaded programming brings all sorts of problems, as programmers know only too well.

If we give up threads, there are two other options: coroutines and non-blocking I/O. Coroutines are lighter than threads; many coroutines can run inside the same thread and are scheduled by the programmer, a technique widely used in the Go language. Non-blocking I/O is what Node.js uses to handle high-concurrency scenarios.

Non-blocking I/O

The I/O discussed here falls into two types, network I/O and file I/O, which behave very similarly. An I/O operation has two steps: first the contents of the file (or network) are copied into a buffer that lives in memory owned exclusively by the operating system; then the contents of that buffer are copied into the user program's memory.

With blocking I/O, both steps block: the caller waits from the moment it issues the read request until the buffer is ready and the data has been copied into the user process.

Non-blocking I/O essentially polls the kernel to ask whether the buffer is ready; if not, the caller carries on with other work. Once the buffer is ready, its contents are copied into the user process, and that copy step does block.

I/O multiplexing means using a single thread to handle multiple network I/O operations; the select and epoll we often hear about are the functions used to poll all the sockets. For example, Apache uses the former, while Nginx and Node.js use the latter, the difference being that the latter is more efficient. Because I/O multiplexing is still one thread polling, it also counts as a non-blocking I/O approach.

Asynchronous I/O is the ideal I/O model, but unfortunately true asynchronous I/O barely exists: AIO on Linux delivers data via signals and callbacks but has its flaws, and the existing libeio, as well as IOCP on Windows, essentially simulates asynchronous I/O with a thread pool plus blocking I/O.

Node.js Threading Model

Many articles state that Node.js is single-threaded, but that claim is imprecise, even irresponsible, because it leaves at least the following questions to think about:

    1. How does Node.js handle concurrent requests in a single thread?
    2. How does Node.js perform asynchronous file I/O in a single thread?
    3. How does Node.js make use of the processing power of a server's multiple CPU cores?

Network I/O

Node.js can indeed handle a large number of concurrent requests in a single thread, but doing so requires some programming technique. Recall the code at the beginning of the article: after running app.js, the console prints its message immediately, and we see "Hello World" when we visit the page.

This is because Node.js is event-driven: a callback runs only when its network request event occurs, and when multiple requests arrive, they are queued up and handled in turn.

This may seem natural, but if you do not realize that Node.js runs on a single thread and that callbacks execute synchronously, and you develop in the traditional style, you can cause serious problems. As a simple example, the "Hello World" string here might be produced by some other module. If generating "Hello World" is time-consuming, it blocks the callback of the current network request and leaves the next network request unanswered.

The remedy is simple: use an asynchronous callback mechanism. We can pass the response parameter used to produce the output on to other modules, generate the output asynchronously, and perform the actual output in a callback. The benefit is that the callback of http.createServer does not block, so no request goes unanswered.

For example, let's change the entry point of the server; in fact, if you wanted to implement routing yourself, the idea would be roughly the same:

    var http = require('http');
    var string = require('./string'); // a third-party module

    http.createServer(function (request, response) {
        string.output(response); // call the third-party module to produce the output
    }).listen(8888);

The third-party module (string.js):

    function sleep(milliSeconds) { // simulate a stall
        var startTime = new Date().getTime();
        while (new Date().getTime() < startTime + milliSeconds);
    }

    function outputString(response) {
        sleep(10000); // block for 10 s
        response.end('Hello World\n'); // produce the output only after the time-consuming operation
    }

    exports.output = outputString;

In summary, when programming with Node.js, any time-consuming operation must be performed asynchronously to avoid blocking the current function. After all, you are serving all of your clients with code that always runs on a single thread, sequentially.
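
To make this concrete, here is a minimal sketch (not from the original article) of how the third-party module above could be made non-blocking: instead of busy-waiting, it schedules the output with a timer, so the Event Loop stays free to serve other requests while the simulated 10-second operation is pending.

    // string.js -- a non-blocking variant of the module above (illustrative sketch)
    function outputString(response) {
        // Simulate a 10 s operation without blocking the single thread:
        // the callback is scheduled by the event loop instead of busy-waiting.
        setTimeout(function () {
            response.end('Hello World\n');
        }, 10000);
    }

    exports.output = outputString;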

If beginners still find this hard to grasp, I recommend reading a dedicated Node.js book or the section on the Event Loop below.

File I/O

As I have stressed in previous articles, asynchrony exists to optimize the experience and avoid stalls. To actually save processing time and exploit multi-core CPUs, we still have to rely on multithreaded parallel processing.

In fact, Node.js maintains a thread pool under the hood. As mentioned in the Basic Concepts section, there is no truly asynchronous file I/O; it is usually simulated with a thread pool, which by default has four threads for file I/O.

Note that we cannot manipulate the underlying thread pool directly, nor do we really need to care about its existence. Its job is merely to complete I/O operations, not to carry out CPU-intensive work such as image or video processing or large-scale computation.
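
As a small illustrative sketch (not from the original article), this is what asynchronous file I/O looks like from the JavaScript side; the read itself is carried out on one of the pool's worker threads, and the pool size can be adjusted with the UV_THREADPOOL_SIZE environment variable:

    var fs = require('fs');

    // The actual read happens on a libuv worker thread; the main JavaScript
    // thread only registers the callback and keeps processing other events.
    fs.readFile(__filename, 'utf8', function (err, data) {
        if (err) throw err;
        console.log('read finished: ' + data.length + ' characters');
    });

    console.log('this prints first, while the file is still being read');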

If there are only a few CPU-intensive tasks to handle, we can start several Node.js processes and use IPC for interprocess communication, or call out to an external C++/Java program. If most of your tasks are CPU-intensive, that only means choosing Node.js was a bad decision.
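
For the few-CPU-intensive-tasks case, here is a hedged sketch of the IPC approach (the file names and message shape are made up for illustration): the parent forks a worker with child_process and exchanges messages with it, so the blocking computation never runs on the server's main process.

    // parent.js -- offload a heavy task to a separate Node.js process
    var childProcess = require('child_process');

    var worker = childProcess.fork('./worker.js'); // worker.js is hypothetical

    worker.on('message', function (msg) {
        console.log('result from worker:', msg.result);
        worker.kill();
    });

    worker.send({ n: 42 }); // ask the worker to do the heavy computation

The worker receives the message, does the CPU-bound work in its own process, and sends the result back:

    // worker.js -- blocking here does not block the parent process
    process.on('message', function (msg) {
        var result = heavyComputation(msg.n); // placeholder for a CPU-intensive task
        process.send({ result: result });
    });

    function heavyComputation(n) {
        var sum = 0;
        for (var i = 0; i < 1e8; i++) sum += i % n;
        return sum;
    }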

Squeezing the CPU Dry

So far we know that Node.js handles network I/O with a single thread via I/O multiplexing, and simulates asynchronous file I/O with a thread pool containing a small number of threads. Doesn't single-threaded Node.js look rather feeble next to a 32-core CPU?

The answer is no: we can start multiple Node.js processes. Unlike in the previous section, these processes do not need to communicate with one another; each listens on its own port, and Nginx performs load balancing at the outermost layer.

Nginx load balancing is easy to set up; all you need to do is edit the configuration file:

    http {
        upstream sampleapp {
            # optional directives, such as least_conn or ip_hash
            server 127.0.0.1:3000;
            server 127.0.0.1:3001;
            # ... more ports to listen on
        }
        ...
        server {
            listen 80;
            ...
            location / {
                proxy_pass http://sampleapp; # listen on port 80, then forward
            }
        }
    }

The default load-balancing rule hands network requests to the ports in turn. We can use the least_conn directive to forward requests to the Node.js process with the fewest connections, or ip_hash to ensure that requests from the same IP are always handled by the same Node.js process.

Multiple Node.js processes can take full advantage of a multi-core CPU's processing power and scale out very well.
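
As a minimal sketch (not part of the original article), the app.js from the beginning could take its port from the command line, so that several instances started with node app.js 3000 and node app.js 3001 match the upstream entries in the Nginx configuration above:

    var http = require('http');
    var port = parseInt(process.argv[2], 10) || 3000; // e.g. `node app.js 3000`

    http.createServer(function (request, response) {
        response.writeHead(200, { 'Content-Type': 'text/plain' });
        response.end('Handled by process ' + process.pid + ' on port ' + port + '\n');
    }).listen(port);

    console.log('Worker ' + process.pid + ' listening on port ' + port);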

Event Loop

Node.js has an Event Loop, which will feel familiar to anyone with iOS development experience: yes, it resembles the RunLoop to some extent.

A complete Event Loop is divided into several stages (phases): poll, check, close callbacks, timers, I/O callbacks, and idle, in that order.

Because Node.js is event-driven, each event's callback is registered with a particular phase of the Event Loop. For example, the callback of fs.readFile is added to the I/O callbacks phase, the callback of setImmediate is added after the poll phase of the next loop ends, and the callback of process.nextTick() is added at the end of the current phase, before the next phase begins.

It is important to know that callbacks of different asynchronous methods execute in different phases; otherwise logic errors will creep in because of the order in which they are called.
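
As a small illustrative demo (not from the original article), the following shows how callbacks registered through different APIs land in different phases; process.nextTick always fires before the loop moves on, while the relative order of a 0 ms timer and setImmediate is only guaranteed inside an I/O callback:

    var fs = require('fs');

    process.nextTick(function () {
        console.log('nextTick: runs before the event loop continues');
    });

    setTimeout(function () {
        console.log('setTimeout 0: timers phase');
    }, 0);

    setImmediate(function () {
        console.log('setImmediate: check phase, right after poll');
    });

    fs.readFile(__filename, function () {
        // Inside an I/O callback, setImmediate is guaranteed to fire before a
        // 0 ms setTimeout, because the check phase directly follows poll.
        setImmediate(function () { console.log('setImmediate inside I/O callback'); });
        setTimeout(function () { console.log('setTimeout inside I/O callback'); }, 0);
    });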

The Event Loop keeps cycling, and within each phase all callbacks registered for that phase run synchronously. This is why I said in the Network I/O section not to call blocking methods inside callbacks and to always handle time-consuming operations asynchronously: a callback that takes too long keeps the Event Loop stuck in one phase, so new network requests cannot be answered in time.

Since the purpose of this article is a preliminary, broad understanding of Node.js, the individual phases of the Event Loop are not covered in detail; see the official documentation for the specifics.

As you can see, the Event Loop is still fairly low-level. To make the event-driven style convenient to use, Node.js wraps it in the EventEmitter class:

    var EventEmitter = require('events');
    var util = require('util');

    function MyThing() {
        EventEmitter.call(this);

        setImmediate(function (self) {
            self.emit('thing1');
        }, this);

        process.nextTick(function (self) {
            self.emit('thing2');
        }, this);
    }
    util.inherits(MyThing, EventEmitter);

    var mt = new MyThing();

    mt.on('thing1', function onThing1() {
        console.log('Thing1 emitted');
    });

    mt.on('thing2', function onThing2() {
        console.log('Thing2 emitted');
    });

Judging from the output, although self.emit('thing2') is registered later, it executes first, exactly as the Event Loop's scheduling rules predict: process.nextTick fires before setImmediate.

Many Node.js modules inherit from EventEmitter, for example the fs.ReadStream mentioned in the next section: it creates a readable file stream and emits the corresponding events when the file is opened, when data is read, and when reading completes.

Data Streams

The benefit of using streams is obvious, and real life offers a fitting analogy. Suppose a teacher assigns summer homework: if a student does a little every day (a steady stream of work), the task is finished without much trouble. If it all piles up until the last day, the mountain of exercises feels overwhelming.

The same is true of server development. Suppose a user uploads a 1 GB file, or we read a 1 GB local file: without streams, we would have to allocate a 1 GB buffer and then process everything in one go once the buffer is full.

With streams, we can define a very small buffer, say 1 MB. Each time the buffer fills up, a callback is executed to handle that small chunk of data, so nothing piles up.

In fact, request and the fs module's file reads are both readable data streams:

    var fs = require('fs');
    var readableStream = fs.createReadStream('file.txt');
    var data = '';

    readableStream.setEncoding('utf8');

    // each time the buffer is full, handle a small chunk of data
    readableStream.on('data', function (chunk) {
        data += chunk;
    });

    // the file stream has been read completely
    readableStream.on('end', function () {
        console.log(data);
    });

With pipes, you can write the contents of one stream into another:

    var fs = require('fs');
    var readableStream = fs.createReadStream('file1.txt');
    var writableStream = fs.createWriteStream('file2.txt');

    readableStream.pipe(writableStream);

Streams can also be chained; for example, we can read a compressed file, decompress it on the fly, and write the decompressed content to another file:

    var fs = require('fs');
    var zlib = require('zlib');

    fs.createReadStream('input.txt.gz')
      .pipe(zlib.createGunzip())
      .pipe(fs.createWriteStream('output.txt'));

Node.js provides very concise stream operations; the above is only a brief introduction to their usage.

Summary

For highly concurrent, long-lived connections, the event-driven model is much lighter than threads, and multiple Node.js processes scale easily behind a load balancer, so Node.js is well suited to serving I/O-intensive applications. The flip side of this approach is that it is not good at CPU-intensive tasks.

Node.js typically models data as streams and provides a clean abstraction for them.

Node.js lets you build a back-end server in the front-end language (JavaScript), which offers a good approach to separating the front end and the back end. I will analyze this in the next article.

Resources
    1. Concurrent Tasks on Node.js
    2. Add Load Balancing to Node.js with Nginx
    3. Understanding the Node.js Event Loop
    4. The Node.js Event Loop
    5. The Basics of Node.js Streams
