Alibaba technical article: Talking about Node.js and PHP process management

Source: Internet
Author: User
PHP holds roughly half of the server-side programming landscape, much as Wang Feng does in the music scene. As Node.js has gradually stepped onto the server-side stage, the debate over the strengths and weaknesses of PHP and Node.js has never stopped.

A near-monopoly market share is enough to prove PHP's excellence. In addition, the innovations in HHVM and PHP 7 have brought PHP a leap in performance. However, when we talk about performance differences at the language level, we often overlook how much the Web model itself weighs on performance.

From CGI to FastCGI

Early Web services were built on the traditional CGI protocol. Each request sent to the server goes through three steps: start a process, handle the request, and end the process. As a result, the overhead in system resources (such as memory and CPU) becomes huge as traffic grows, which can degrade server performance or even interrupt the service.
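The three CGI steps can be sketched in a few lines. This is only an illustration, assuming a Node.js process stands in for the CGI interpreter; the name `handleCgiStyle` and the echo script are hypothetical, and only `child_process.spawnSync` is a real Node.js API.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical stand-in for a CGI program: reads the request body from
// stdin, writes a response to stdout, then exits.
const CGI_SCRIPT =
  'let body = "";' +
  'process.stdin.on("data", (chunk) => { body += chunk; });' +
  'process.stdin.on("end", () => { process.stdout.write("echo:" + body); });';

function handleCgiStyle(requestBody) {
  // Step 1: start a new interpreter process for this one request.
  // Step 2: the process handles the request (stdin in, stdout out).
  // Step 3: the process exits; spawnSync returns only after it has ended.
  const result = spawnSync(process.execPath, ['-e', CGI_SCRIPT], {
    input: requestBody,
    encoding: 'utf8',
  });
  return { pid: result.pid, status: result.status, output: result.stdout };
}
```

Every call pays the full cost of starting an interpreter, which is the overhead the paragraph above describes.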

Figure 1: Simple CGI process diagram

In the CGI protocol, repeatedly loading the parser is the main cause of poor performance. If the parser process stays resident in memory, it can keep running once started, without a new process being forked for every request. This is what the later FastCGI protocol provides.

If FastCGI did only this, it would be essentially the same as the single-process, single-thread model of Node.js: the Node.js process keeps running after it starts, all requests are received and handled by that one process, and when a request triggers an unexpected error the process may exit.

In fact, FastCGI is not that simple. To keep the service stable, it was designed around multi-process scheduling:

Figure 2: Nginx + FastCGI execution process

This process can also be described as three steps:

  • First, the FastCGI process manager is initialized and starts multiple CGI interpreter sub-processes;
  • Then, when a request reaches the Web server, the process manager selects a sub-process, connects to it, and sends it the environment variables and standard input; once processing is complete, the standard output and error information are returned to the Web server;
  • Finally, the sub-process closes the connection and goes back to waiting for the next request.
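The three steps above can be sketched as a small in-process simulation. No real processes or FastCGI records are involved here; the class `WorkerPool`, its `dispatch` method, and the response format are hypothetical names chosen for illustration.

```javascript
// In-process simulation of a FastCGI-style process manager (no real processes).
class WorkerPool {
  constructor(size) {
    // Step 1: start a fixed set of workers up front.
    this.idle = [];
    for (let id = 1; id <= size; id++) {
      this.idle.push({ id, handled: 0 });
    }
  }

  dispatch(env, stdin) {
    // Step 2: pick an idle worker and hand it the environment and input.
    const worker = this.idle.shift();
    if (!worker) {
      return { status: 'busy' };
    }
    const stdout = `${env.REQUEST_METHOD} ${env.REQUEST_URI}: ${stdin.length} bytes`;
    worker.handled += 1;
    // Step 3: the worker closes the connection and waits for the next request.
    this.idle.push(worker);
    return { status: 'ok', workerId: worker.id, stdout };
  }
}
```

Because the workers already exist when a request arrives, no interpreter has to be started per request, which is the difference from plain CGI.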

From child_process to cluster

Let's look back at how Node.js manages processes.

The single-process, single-thread model of native Node.js is a frequent target of criticism. This mechanism also means that a Node.js process naturally runs on only one CPU core and cannot make effective use of multi-core resources. Once the process crashes, the whole Web service may go down with it.

Figure 3: simple Node.js request model

Like CGI, a single process always suffers from low reliability and poor stability, and once it serves a real production environment such weaknesses can be fatal. If the code itself is robust enough, errors can be avoided to some extent, but that also raises the bar for testing. In reality, we cannot guarantee that code is 100% free of defects: some things are easy to cover with test cases, while others can only be checked by eye.

Fortunately, Node.js provides the child_process module, which lets you create sub-processes at will with a simple fork. If one sub-process is assigned to each CPU core, multi-core resources can be put to full use. At the same time, because the ChildProcess objects returned by this module inherit from the base class EventEmitter, event-driven communication between processes is very efficient.
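As a rough sketch of what this looks like in practice, the example below forks one worker and exchanges a message with it through events. The worker source, the temp-file approach, and the name `askWorker` are assumptions made so that the example fits in a single file; `fork`, `send`, `disconnect`, and the `'message'` event are real Node.js APIs.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical worker source, written to a temp file so the example is
// self-contained; a real project would keep it in its own module.
const WORKER_SOURCE = `
process.on('message', (msg) => {
  process.send({ result: msg.value * 2 });
});
`;

function askWorker(value) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'worker-'));
  const file = path.join(dir, 'worker.js');
  fs.writeFileSync(file, WORKER_SOURCE);

  return new Promise((resolve, reject) => {
    // execArgv is cleared so the child does not inherit the parent's flags.
    const child = fork(file, [], { execArgv: [] });
    child.once('error', reject);
    // Remove the temp directory once the worker process has exited.
    child.once('exit', () => fs.rmSync(dir, { recursive: true, force: true }));
    child.once('message', (reply) => {
      // Closing the IPC channel lets the idle worker exit on its own.
      child.disconnect();
      resolve(reply.result);
    });
    child.send({ value });
  });
}
```

Calling `askWorker(21)` returns a promise for the worker's reply. A master-worker setup would typically fork one such worker per core and keep them alive rather than disconnecting after one message.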

Figure 4: simple Node.js master-worker model

To simplify the implementation of complex parent-child process models, Node.js went on to wrap this up in the cluster module. Whether it is load balancing, resource reclamation, or process supervision, it quietly takes care of everything for you like a nanny. For the technical details, see Tao Jie's "What are we talking about when we talk about cluster (Part 1)" and "What are we talking about when we talk about cluster (Part 2)", which walk through how the cluster solution was derived and implemented.

In Node.js, running an application across multiple cores takes only a few lines of code:

    var cluster = require('cluster');
    var os = require('os');

    if (cluster.isMaster) {
      // Master process: fork one worker per CPU core.
      for (var i = 0, n = os.cpus().length; i < n; i++) {
        cluster.fork();
      }
    } else {
      // start the application ...
    }

In contrast, how does the FastCGI protocol handle this model?

Inherent defects in PHP-FPM

PHP-FPM is PHP's own implementation of the FastCGI protocol, and among PHP's various Server APIs (SAPI: cgi, fast-cgi, cli, isapi, apache) it is the most common and best-performing process manager. It also implements a parent-child process management model similar to that of Node.js, ensuring the reliability and high performance of Web services.

PHP-FPM is a typical multi-process synchronous model: each request is handled by one single-threaded process, and I/O is synchronous and blocking. So although PHP-FPM maintains an independent pool of CGI processes and can easily manage their lifecycle, it is destined to be unable to do what Node.js does, where a single process can absorb heavy request pressure.
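A back-of-the-envelope calculation makes the limit concrete. This is a deliberately simplified model, assuming every request blocks its process for the same amount of time; the function name and the numbers in the usage note are hypothetical.

```javascript
// Simplified model: each worker process handles one request at a time and
// stays blocked for the whole request, including time spent waiting on I/O.
function maxRequestsPerSecond(workerCount, msPerRequest) {
  // One worker finishes (1000 / msPerRequest) requests per second,
  // and the pool as a whole scales linearly with the number of workers.
  return (workerCount * 1000) / msPerRequest;
}
```

With 50 processes and 200 ms per request, for instance, `maxRequestsPerSecond(50, 200)` gives 250. Under this model the ceiling can only be raised by adding processes or by shortening the time each request blocks.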

Depending on the server's hardware, PHP-FPM needs a sensible php-fpm.conf configuration:

    pm.max_children         ; maximum number of child processes
    pm.start_servers        ; number of child processes created at startup
    pm.min_spare_servers    ; minimum number of idle processes; more are started automatically when too few are idle
    pm.max_spare_servers    ; maximum number of idle processes; the surplus is cleaned up automatically when exceeded
    pm.max_requests = 1000  ; number of requests a child process serves before it is automatically recycled

Unlike JavaScript, a PHP process does not itself leak memory. After handling a request, each process reclaims the memory but does not return it to the operating system. As a result, PHP-FPM ends up holding a large amount of memory that cannot be released, and performance drops sharply as the request volume grows.

PHP-FPM therefore needs to cap the number of requests a single sub-process may serve. Many people mistakenly think that max_requests controls the number of concurrent connections of a process. In fact, a process in PHP-FPM mode is single-threaded and cannot handle requests concurrently. The real purpose of this parameter is to act as a request counter: once the threshold is exceeded, the process is automatically recycled to relieve memory pressure.
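The counter behaviour can be sketched as a small simulation. This is not PHP-FPM's actual code; the class `RecyclingWorker` and its fields are hypothetical, and no real processes are created.

```javascript
// In-process simulation of the pm.max_requests counter (no real processes).
class RecyclingWorker {
  constructor(maxRequests) {
    this.maxRequests = maxRequests;
    this.generation = 1; // how many times the worker has been (re)started
    this.served = 0;     // requests served by the current generation
  }

  // Handles one request and returns the generation that served it.
  handle() {
    this.served += 1;
    const servedBy = this.generation;
    if (this.served >= this.maxRequests) {
      // Threshold reached: the old process exits and a fresh one replaces it,
      // which is the point where its memory goes back to the operating system.
      this.generation += 1;
      this.served = 0;
    }
    return servedBy;
  }
}
```

Note that requests are still handled strictly one at a time; the counter only decides when the process is replaced.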

Perhaps you have already spotted the crux of the problem: despite its solid architecture, PHP-FPM is held back by the performance of a single process.

Node.js was born without this problem, but PHP-FPM offers no such guarantee. Its stability depends on how well the hardware and the configuration file fit together, as well as on the Web server's (usually Nginx) ability to balance load across PHP-FPM services.

ReactPHP, event-driven, asynchronous execution, non-blocking IO

The enthusiasm for PHP 7 has masked the heavy impact of Node.js. While everyone is still absorbed in choosing between HHVM and PHP 7, ReactPHP is also thriving. It abandons the traditional Nginx + PHP-FPM architecture entirely and instead imitates and embraces the event-driven, non-blocking I/O model of Node.js; even its tagline is nearly identical:

Event-driven, non-blocking I/O with PHP.

Since everyone is familiar with Node.js, we will not go over the principles of ReactPHP again; you can think of it as a PHP version of Node.js. Comparing it with the traditional architecture (Nginx + PHP-FPM; to be fair, PHP-FPM runs only one process) gives the following results:

Figure 5: QPS curve when "Hello World" is output

Figure 6: QPS curve when Querying SQL

We can see that once the event-driven, asynchronous, non-blocking I/O model is ported to PHP, the QPS curve remains good even without PHP-FPM, and in I/O-intensive scenarios performance even doubles.

The event and asynchronous callback mechanisms are powerful: they turn the blocking that arises under large-scale concurrency and high throughput into an asynchronous event queue, and then work through the blocking operations (such as file reads and database queries) one by one.
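A small Node.js sketch shows the idea. Here `setTimeout` stands in for real file or database I/O, and the names `fakeIo` and `runConcurrently` as well as the delays are hypothetical.

```javascript
// Hypothetical non-blocking "I/O" call: setTimeout stands in for a file read
// or database query that completes after `ms` milliseconds.
function fakeIo(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

async function runConcurrently() {
  const finished = [];
  const started = Date.now();
  // All three operations are started before any of them completes; the
  // single thread is free while they wait, and each callback is queued
  // and run when its operation finishes.
  await Promise.all([
    fakeIo('read-file', 60).then((name) => finished.push(name)),
    fakeIo('query-db', 20).then((name) => finished.push(name)),
    fakeIo('call-api', 40).then((name) => finished.push(name)),
  ]);
  return { finished, elapsedMs: Date.now() - started };
}
```

The callbacks run in completion order rather than start order, and the total elapsed time is close to the slowest operation rather than the sum of all three, which is where the gain in I/O-intensive scenarios comes from.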

The single-process model may invite some skepticism. However, the plain fact is that the reliability of a single-process model leaves plenty of room for optimization at the Web server and process manager levels, whereas the ability to handle high concurrency depends on language features; put bluntly, on support for events and asynchrony.

Node.js can naturally be proud of both points, but PHP does not support them natively. It can only get there by simulating asynchronous operations along the lines of the Node.js event mechanism, so ReactPHP is actually not as perfect as one might imagine.

Conclusion

Most of the time, when we compare the advantages and disadvantages of a language, it is easy to limit ourselves to the language itself, while ignoring some of the key factors.

Take PHP as an example: over the past two years I have heard a great deal about JIT, opcode caching, the abstract syntax tree (AST), and HHVM. As these optimizations mature, the language itself is no longer the weak point in Web performance. If that is still not enough, we can hand the heavy tasks over to C and C++ in the form of a Node.js addon or a PHP extension and get them done easily.

PHP is the "best language in the world", so it is time for it to learn from the event-driven, asynchronous callback model of Node.js and to consider bold innovation in PHP-FPM. After all, whether it is Node.js or PHP, what we are good at is the Web, and the high-performance Web at that.
