Nginx Architecture: Process Model and Event Model


As we all know, Nginx's performance is high, and that performance is inseparable from its architecture. So what does the Nginx architecture look like? In this section we start by getting to know the Nginx framework.

After Nginx starts, it runs a master process and several worker processes; on UNIX systems these run in the background as daemons. We can also turn off background mode and run Nginx in the foreground, and we can configure Nginx to drop the master process so that it runs as a single process (the daemon and master_process directives control this). Obviously we would not do this in production; turning off background mode is generally only useful for debugging, and in later chapters we will explain in detail how to debug Nginx. So we can see that Nginx works in a multi-process mode. Nginx also supports multithreading, but the mainstream approach, and Nginx's default, is the multi-process model. There are a number of benefits to using multiple processes, so I'll focus on Nginx's multi-process model.

As just mentioned, after Nginx starts there is one master process and multiple worker processes. The master process is mainly used to manage the worker processes: it receives signals from the outside, forwards signals to the worker processes, monitors their running status, and automatically starts a new worker process when a worker exits abnormally. Basic network events are handled in the worker processes. The worker processes are peers: they compete equally for requests from clients, and they are independent of each other. A request can only be handled in one worker process, and a worker process cannot handle requests belonging to another process. The number of worker processes is configurable, and generally we set it to the number of CPU cores on the machine; the reason for this is inseparable from Nginx's process model and event-handling model, as we will see below. (The original article illustrated the process model with a figure here.)
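
To make the master/worker relationship concrete, here is a minimal C sketch of the pattern, assuming nothing beyond POSIX: a master forks a fixed number of workers and respawns any worker that dies. This is only an illustration of the shape of the model; names such as worker_loop are invented here, and Nginx's real implementation is far more involved.

    /* Minimal master/worker sketch (illustrative only, not Nginx code). */
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define WORKERS 4                  /* typically the number of CPU cores */

    static void worker_loop(void) {
        for (;;) pause();              /* a real worker runs the event loop here */
    }

    int main(void) {
        for (int i = 0; i < WORKERS; i++) {
            if (fork() == 0) {         /* child: become a worker */
                worker_loop();
            }
        }
        /* Master: monitor workers and respawn any that exit abnormally. */
        for (;;) {
            if (wait(NULL) < 0) break; /* no children left */
            if (fork() == 0) {
                worker_loop();         /* replace the dead worker */
            }
        }
        return 0;
    }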

After Nginx has started, how do we operate it? As we saw above, the master manages the worker processes, so we only need to communicate with the master process. The master process receives signals from the outside world and does different things depending on the signal. So to control Nginx, we just send a signal to the master process via kill. For example, kill -HUP <pid> tells Nginx to restart gracefully, i.e. to reload its configuration; we generally use this signal to reload the configuration, and because the restart is graceful, service is not interrupted.

What does the master process do after receiving the HUP signal? First, it reloads the configuration file, then it starts new worker processes and sends a signal to all the old worker processes telling them they can retire honorably. The new workers begin accepting new requests as soon as they start, while each old worker, once it receives the signal from the master, stops accepting new requests and exits after all the outstanding requests in the current process have finished processing.

Of course, sending a signal directly to the master process is the older way of operating. Since version 0.8, Nginx has provided a set of command-line options to make management easier. For example, ./nginx -s reload reloads the configuration, and ./nginx -s stop stops Nginx. How does this work? Take reload as an example: when we execute the command, a new nginx process starts; after parsing the reload argument, it knows that our goal is to make Nginx reload its configuration file, so it sends a signal to the master process. From then on, everything proceeds exactly as if we had sent the signal to the master process directly.
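
As a rough sketch of what -s reload boils down to, the controlling process reads the master's pid from the pid file and sends it SIGHUP. This is a simplified illustration, not Nginx's code; the pid file path below is only an example and depends on how Nginx was built and configured.

    /* Illustrative sketch: send SIGHUP to the master, as "nginx -s reload" does. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    int main(void) {
        FILE *f = fopen("/usr/local/nginx/logs/nginx.pid", "r"); /* example path */
        if (f == NULL) { perror("fopen"); return 1; }

        long pid;
        if (fscanf(f, "%ld", &pid) != 1) { fclose(f); return 1; }
        fclose(f);

        /* SIGHUP: reload configuration gracefully (SIGQUIT would be a graceful stop). */
        if (kill((pid_t)pid, SIGHUP) != 0) { perror("kill"); return 1; }
        return 0;
    }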

Now we know what happens inside Nginx when we operate it; next, how does a worker process handle a request? As we mentioned earlier, the worker processes are peers, and each process has the same opportunity to handle a request. When we provide an HTTP service on port 80 and a connection request comes in, any of the processes may end up handling the connection. How does that work? First, each worker process is forked from the master process. The master process first creates the listening socket (listenfd) and then forks the worker processes, so the listenfd of every worker process becomes readable when a new connection arrives. To ensure that only one process handles the connection, all worker processes compete for a mutex called accept_mutex before registering the listenfd read event; the process that grabs the mutex registers the read event and calls accept in the read-event handler to accept the connection. Once a worker process has accepted the connection, it reads the request, parses it, processes it, produces the response data, returns it to the client, and finally closes the connection; that is one complete request. As we can see, a request is handled entirely by one worker process, and only within that worker process.
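
Here is a hedged sketch of the accept_mutex idea: before adding the shared listening socket to its epoll set, a worker first tries to take a cross-process lock, and only the lock holder registers listenfd and accepts. The try_lock/unlock helpers below are stubs standing in for a shared-memory lock; this shows the pattern, not Nginx's actual code.

    /* Sketch of the accept_mutex pattern (stub helpers, not Nginx code). */
    #include <stdbool.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>

    static bool try_lock(void) { return true; } /* stub: really a shared-memory trylock */
    static void unlock(void)   { }              /* stub */

    static void worker_iteration(int epfd, int listenfd) {
        bool holding = try_lock();              /* compete for the accept mutex */

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listenfd };
        if (holding) {
            /* Only the lock holder watches listenfd, so only one worker accepts. */
            epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev);
        }

        struct epoll_event events[64];
        int n = epoll_wait(epfd, events, 64, 500);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listenfd) {
                int conn = accept(listenfd, NULL, NULL);
                /* ... hand conn to the event loop: read, parse, respond ... */
                (void)conn;
            }
            /* ... handle events on already-accepted connections ... */
        }

        if (holding) {
            epoll_ctl(epfd, EPOLL_CTL_DEL, listenfd, NULL);
            unlock();                           /* give other workers a chance */
        }
    }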

So what are the benefits of Nginx adopting this process model? There are certainly many. First, each worker is an independent process that needs no locking for its own work, so the overhead of locking is avoided, and both programming and troubleshooting become much easier. Second, independent processes cannot affect each other: after one process exits, the others keep working, service is not interrupted, and the master process quickly starts a new worker. Of course, an abnormal worker exit usually means there is a bug in the program; such an exit causes all requests on that worker to fail, but it does not affect all requests, which reduces the risk. There are many other benefits that we can come to appreciate over time.

That's a lot about the Nginx process model; next, let's look at how Nginx handles events.

Someone may ask: Nginx uses multiple workers to process requests, but each worker has only a single main thread, so the concurrency it can handle must be very limited; with only as many workers as cores, how can it achieve high concurrency? This is exactly where Nginx is clever: Nginx processes requests asynchronously and without blocking, so it can handle thousands of requests at the same time. Compare the common way Apache works (Apache also has an asynchronous non-blocking version, but it conflicts with some modules and so is not commonly used): each request monopolizes one worker thread, so when concurrency climbs into the thousands, thousands of threads are processing requests at the same time. This is a big challenge for the operating system: the memory footprint of the threads is very large, the CPU overhead of thread context switches is very high, performance naturally suffers, and all of that overhead is essentially meaningless work.

Why can Nginx handle requests in an asynchronous non-blocking manner, and what is asynchronous non-blocking all about in the first place? Let's go back to the starting point and look at the complete life of a request: the request comes in, a connection is established, data is received, and after the data is received, a response is sent. Down at the system level, these are read and write events, and when a read or write event is not ready, it cannot be acted on. If you do not call in a non-blocking way, you have to make a blocking call: the event is not ready, so you can only wait. A blocking call enters the kernel and waits, and the CPU is handed over to others; for a single-threaded worker this is obviously inappropriate. When the thread is waiting on network events, the CPU sits idle and utilization naturally drops, and there is no high concurrency to speak of. You might say: just add more processes. But then what is the difference from Apache's threading model? Be careful not to add unnecessary context switches. So inside Nginx, blocking system calls are the biggest taboo.

If you must not block, make the call non-blocking. Non-blocking means: if the event is not ready, the call immediately returns EAGAIN, telling you the event is not ready yet; don't panic, come back later. Then, after a while, you check the event again, until the event is ready; in the meantime you can go do other things. Although the call no longer blocks, you now have to keep coming back to check the status of the event; you can get more done, but the cost of polling is not small.

Hence the asynchronous non-blocking event-handling mechanisms, embodied in system calls such as select/poll/epoll/kqueue. They provide a mechanism that lets you monitor multiple events at the same time. Calling them does block, but you can set a timeout: within the timeout, if any event becomes ready, the call returns. This mechanism solves both of our problems. Take epoll as an example (in what follows we use epoll to represent this family of functions): when an event is not ready, we put it into epoll; when the event becomes ready, we go read or write; and when a read or write returns EAGAIN, we add it back into epoll again. This way, as soon as any event is ready we handle it, and we wait inside epoll only when no event is ready at all. Thus we can handle a great deal of concurrency. Of course, the concurrency here refers to outstanding requests: there is only one thread, so only one request can be processed at any instant; the thread simply keeps switching between requests, and the switching is voluntary, triggered by an asynchronous event not being ready. Switching here costs nothing; you can think of it as looping over a set of ready events, which is in fact what happens.

Compared with multithreading, this style of event handling has great advantages: no threads need to be created, each request occupies very little memory, there is no context switching, and event handling is extremely lightweight. No amount of concurrency leads to unnecessary resource waste (context switches); higher concurrency merely consumes more memory. I have previously tested the connection count: on a machine with 24G of memory, the number of concurrent connections handled reached 2 million. Network servers today basically all work this way, and it is the main reason Nginx's performance is so high.
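
A minimal sketch of this loop in C, assuming a listening socket that is already bound and listening; the request parsing is elided. The point is the shape: everything waits in epoll_wait, and a read that returns EAGAIN simply leaves the fd registered until it becomes readable again.

    /* Minimal epoll loop sketch: non-blocking reads, EAGAIN means "not ready yet". */
    #include <errno.h>
    #include <fcntl.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void event_loop(int listenfd) {
        int epfd = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = listenfd };
        epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &ev);

        struct epoll_event events[1024];
        for (;;) {
            int n = epoll_wait(epfd, events, 1024, -1);    /* wait until something is ready */
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listenfd) {
                    int conn = accept(listenfd, NULL, NULL);
                    if (conn < 0) continue;
                    fcntl(conn, F_SETFL, O_NONBLOCK);      /* never block on this fd */
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                    epoll_ctl(epfd, EPOLL_CTL_ADD, conn, &cev);
                } else {
                    char buf[4096];
                    ssize_t r = read(fd, buf, sizeof(buf));
                    if (r > 0) {
                        /* ... parse the request, produce and write the response ... */
                    } else if (r < 0 && errno == EAGAIN) {
                        /* Not ready yet: stay registered, epoll will tell us later. */
                    } else {
                        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                        close(fd);                         /* EOF or a real error */
                    }
                }
            }
        }
    }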

As we said before, it is now easy to understand why the recommended number of workers equals the number of CPU cores: more workers would only make processes compete for CPU resources, causing unnecessary context switches. Furthermore, to make better use of multi-core hardware, Nginx provides a CPU affinity binding option: we can bind a worker process to a particular core so that its caches are not invalidated by process migration. Small optimizations like this are very common in Nginx and illustrate the meticulousness of the Nginx authors. For example, when comparing a 4-byte string, Nginx converts the 4 characters into an int and compares them in one go, reducing the number of CPU instructions.
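
A hedged sketch of that 4-byte comparison trick (Nginx implements it with macros in its source; the version below just illustrates the idea): instead of four byte comparisons, copy the bytes into a 32-bit integer and compare once. memcpy is used to avoid unaligned access and strict-aliasing problems; a decent compiler folds both copies away.

    /* Compare a 4-byte token in one 32-bit step (illustrative). */
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    static bool is_get(const char *p) {
        uint32_t w, get;
        memcpy(&w,   p,      4);   /* the 4 request bytes, e.g. "GET " */
        memcpy(&get, "GET ", 4);   /* folded to a constant at compile time */
        return w == get;           /* one 32-bit compare vs. four byte compares */
    }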

Now we know why Nginx chose this process model and this event model. For a basic web server, there are usually three kinds of events: network events, signals, and timers. From the explanation above, we know that network events are handled well by the asynchronous non-blocking approach. But how do we handle signals and timers?

First, signal handling. For Nginx, certain signals carry a particular meaning. A signal interrupts the current operation of the program and continues execution after changing state. If a system call is in progress, the signal may cause that system call to fail, and the call needs to be reissued. On signal handling in general, you can consult the specialized literature; we will not say more here. Specifically for Nginx: if Nginx is waiting for events in epoll_wait and the program receives a signal, then after the signal handler has run, epoll_wait returns an error (EINTR), and the program can simply enter the epoll_wait call again.
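
A minimal sketch of that pattern, assuming a handler that only sets a flag (the safe thing to do inside a handler): epoll_wait returning -1 with errno == EINTR is not a fatal error, just a cue to process the flag and re-enter the wait.

    /* Re-entering epoll_wait after a signal (illustrative pattern). */
    #include <errno.h>
    #include <signal.h>
    #include <sys/epoll.h>

    static volatile sig_atomic_t got_hup = 0;

    static void on_hup(int sig) { (void)sig; got_hup = 1; }  /* only set a flag */

    void wait_loop(int epfd) {
        signal(SIGHUP, on_hup);

        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);
            if (n < 0 && errno == EINTR) {     /* interrupted by a signal */
                if (got_hup) {
                    got_hup = 0;
                    /* ... reload configuration, start new workers, etc. ... */
                }
                continue;                      /* simply wait again */
            }
            /* ... handle the n ready events ... */
        }
    }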

Next, the timer. Since functions like epoll_wait can be given a timeout when called, Nginx uses this timeout to implement timers. Timer events inside Nginx are kept in a red-black tree that maintains the timers. Each time before entering epoll_wait, Nginx first takes the smallest expiry time among all timer events from the red-black tree, computes the epoll_wait timeout from it, and then enters epoll_wait. So when there are no network events and no signal interrupts, epoll_wait times out, which means the nearest timer event has arrived. At that point Nginx checks all timed-out events, sets their status to timeout, and then handles the network events. As you can see, when we write Nginx code, the first thing a network-event callback usually does is check for timeout, and only then handle the network event.

We can summarize Nginx's event-handling model with the following pseudo-code:

while (true) {
    for t in run_tasks:
        t.handler();

    update_time(&now);

    timeout = ETERNITY;
    for t in wait_tasks:            /* sorted by expiry time */
        if (t.time <= now) {
            t.timeout_handler();
        } else {
            timeout = t.time - now;
            break;
        }

    nevents = poll_function(events, timeout);

    for i in nevents:
        task t;
        if (events[i].type == READ) {
            t.handler = read_handler;
        } else {                    /* events[i].type == WRITE */
            t.handler = write_handler;
        }
        run_tasks_add(t);
}

OK, in this section we talked about the process model and the event model, including network events, signals, and timer events.
