Translated from: http://kb.cnblogs.com/page/523446/
English Original: Inside NGINX: How We Designed for Performance & Scale
The reason Nginx performs so well is the design behind it. Many web servers and application servers use a simple threaded or process-based architecture; Nginx stands out with a sophisticated event-driven architecture that lets it scale to thousands of concurrent connections on modern hardware.
The Inside NGINX infographic drills down from the high-level process architecture to an illustration of how a single Nginx process handles multiple connections. This article explains how it all works in more detail.
Setting the scene: the Nginx process model
To better understand this design, you need to understand how Nginx runs. Nginx has a master process (which performs privileged operations such as reading the configuration and binding to ports) and a number of worker and helper processes.
On the four-core server shown in the infographic, the Nginx master process creates four worker processes and two cache helper processes that manage the on-disk content cache.
Why is architecture important?
The fundamental building block of any UNIX application is the thread or process (from the Linux operating system's perspective, threads and processes are essentially the same; the main difference is the degree to which they share memory). A thread or process is a self-contained set of instructions that the operating system can schedule to run on a CPU core. Most complex applications run multiple threads or processes in parallel for two reasons:
- They can use more CPU cores at the same time.
- Threads and processes make it easy to do operations in parallel (for example, to handle multiple connections at the same time).
Both processes and threads consume resources. Each uses memory and other OS resources, and each has to be swapped on and off the cores (an operation called a context switch). Most modern servers can handle hundreds of small, active threads or processes at the same time, but performance degrades severely once memory is exhausted or when high I/O load causes a large number of context switches. The common way to design a network application is to assign a thread or process to each connection. This architecture is easy to implement, but it does not scale when the application needs to handle thousands of concurrent connections.
How does Nginx work?
Nginx uses a predictable process model that is tuned to the available hardware resources:
1. The master process performs privileged operations, such as reading the configuration and binding to ports, and then creates a small number of child processes (the following three types).
2. The cache loader process runs at startup to load the disk-based cache into memory, and then exits. It is scheduled conservatively, so its resource demands are low.
3. The cache manager process runs periodically and prunes entries from the disk caches to keep them within the configured sizes.
4. The worker processes do all of the actual work: they handle network connections, read and write content to disk, communicate with upstream servers, and so on.
The configuration recommended in most cases, one worker process per CPU core, makes the most efficient use of hardware resources. You enable it by setting the auto parameter on the worker_processes directive in the configuration: worker_processes auto; When an Nginx server is active, only the worker processes are busy. Each worker process handles multiple connections in a non-blocking manner, which reduces the number of context switches. Each worker process is single-threaded and runs independently, grabbing new connections and processing them. The processes communicate through shared memory, which holds shared cache data, session persistence data, and other shared resources.
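The process model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Nginx is implemented: run_master and its parameters are hypothetical names, the code relies on os.fork and so runs only on Unix-like systems, and each worker here serves connections one at a time with blocking calls, whereas a real Nginx worker is non-blocking. What it does show is the division of labor: the master binds the listening socket, forks workers that inherit it, and serves no traffic itself.

```python
import os
import socket


def run_master(num_workers, conns_per_worker):
    """Bind a listening socket, fork workers that share it, return (port, pids).

    Hypothetical helper for illustration; not part of Nginx.
    """
    # Work done once by the master: create and bind the listening socket.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(128)
    port = listener.getsockname()[1]

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child = worker: it inherited the listening socket from the master.
            try:
                for _ in range(conns_per_worker):
                    conn, _addr = listener.accept()
                    conn.sendall(str(os.getpid()).encode())  # report which worker served
                    conn.close()
            finally:
                os._exit(0)  # never fall back into the master's code path
        worker_pids.append(pid)

    listener.close()  # the master itself accepts no connections
    return port, worker_pids
```

Calling run_master(os.cpu_count() or 1, ...) would mirror the one-worker-per-core recommendation. The caller is expected to reap the workers with os.waitpid once they have exited.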
Inside the Nginx worker process
Each Nginx worker process is initialized with the Nginx configuration and is provided with a set of listen sockets by the master process.
The worker processes begin by waiting for events on the listen sockets (see accept_mutex and kernel socket sharding). Events are initiated by new incoming connections. Each connection is assigned to a state machine. The HTTP state machine is the most commonly used, but Nginx also implements state machines for stream (raw TCP) traffic and for a number of mail protocols (SMTP, IMAP, and POP3).
A state machine is essentially the set of instructions that tells Nginx how to process a request. Most web servers that perform the same functions as Nginx use a similar state machine; the difference lies in the implementation.
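To make the idea concrete, here is a minimal sketch of a per-connection state machine. The state names, event names, and transition table are assumptions chosen for illustration; they are not Nginx's actual internal states, which are considerably more detailed.

```python
from enum import Enum, auto


class State(Enum):
    READ_REQUEST = auto()
    PROCESS = auto()
    WRITE_RESPONSE = auto()
    KEEPALIVE = auto()
    CLOSED = auto()


# (current state, event) -> next state. Hypothetical events, for illustration only.
TRANSITIONS = {
    (State.READ_REQUEST, "request_complete"): State.PROCESS,
    (State.PROCESS, "response_ready"): State.WRITE_RESPONSE,
    (State.WRITE_RESPONSE, "sent_keepalive"): State.KEEPALIVE,
    (State.WRITE_RESPONSE, "sent_close"): State.CLOSED,
    (State.KEEPALIVE, "new_request"): State.READ_REQUEST,
    (State.KEEPALIVE, "timeout"): State.CLOSED,
}


class Connection:
    """Tracks where one connection is in its request/response cycle."""

    def __init__(self):
        self.state = State.READ_REQUEST

    def on_event(self, event):
        # The client can go away in any state.
        if event == "client_closed":
            self.state = State.CLOSED
            return self.state
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state
```

Because each connection carries only its current state, a single worker can hold many Connection objects and advance whichever one has an event pending.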
Scheduling the state machine
Think of the state machine as the rules of chess. Each HTTP transaction is a chess game. On one side of the board is the web server, a grandmaster who can make decisions very quickly. On the other side is the remote client, a web browser that accesses the site or application over a relatively slow network. The rules of the game can be complicated, however. For example, the web server might need to communicate with other parties (proxying to an upstream application) or talk to an authentication server. Third-party modules in the web server can even extend the rules of the game.
A blocking state machine
Recall our description of a process or thread: a self-contained set of instructions that the operating system can schedule to run on a CPU core. Most web servers and web applications use a process-per-connection or thread-per-connection model to play the chess game. Each process or thread contains the instructions to play one game through to the end. While the server runs the process, it spends most of its time blocked, waiting for the client to complete its next move.
1. The web server process listens for new connections (new games initiated by clients) on the listen sockets.
2. When it gets a new game, the process plays that game, blocking after each move to wait for the client's response.
3. Once the game is over, the web server process might wait to see whether the client wants to start a new game (this corresponds to a keepalive connection). If the connection is closed (the client goes away or a timeout occurs), the web server process returns to listening for new games.
The important point to remember is that every active HTTP connection (every chess game) requires a dedicated process or thread (a grandmaster). This architecture is simple and easy to extend with third-party modules ("new rules"). However, there is a huge imbalance: a rather lightweight HTTP connection, represented by a file descriptor and a small amount of memory, maps to a separate thread or process, which is a very heavyweight operating system object. This is convenient for programming, but it is massively wasteful.
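The three steps above can be sketched as a tiny thread-per-connection server. This is a minimal illustration, not any particular web server's code; handle_client and serve_blocking are hypothetical names, and the "protocol" (reply with the upper-cased input) is a stand-in for real HTTP handling.

```python
import socket
import threading


def handle_client(conn):
    """Play one 'game' to the end; this thread serves only this client."""
    with conn:
        while True:
            data = conn.recv(1024)  # blocks until the client makes a move
            if not data:            # client closed the connection
                break
            conn.sendall(data.upper())  # the server's reply "move"


def serve_blocking(listener):
    """Accept loop: one dedicated thread per accepted connection."""
    while True:
        try:
            conn, _addr = listener.accept()  # blocks until a new game starts
        except OSError:
            break  # the listener was closed; stop serving
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()
```

Each connected client ties up a whole thread that is idle almost all of the time, which is the imbalance described above.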
Nginx is a true grandmaster
Perhaps you have heard of simultaneous exhibition games, in which one chess grandmaster plays dozens of opponents at the same time.
Kiril Georgiev played 360 people simultaneously in Sofia, the capital of Bulgaria, finishing with 284 wins, 70 draws, and 6 losses.
This is how an Nginx worker process plays "chess". Each worker is a grandmaster (remember: there is usually one worker per CPU core) that can play hundreds, in fact thousands, of games at the same time.
1. The worker waits for events on the listen sockets and on the connection sockets.
2. Events occur on the sockets, and the worker handles them:
- An event on a listen socket means that a client has started a new game. The worker creates a new connection socket.
- An event on a connection socket means that the client has made a move. The worker responds promptly.
A worker never blocks on network traffic while waiting for its "opponent" (the client) to respond. Once it has made its move, it immediately proceeds to other games where moves are waiting to be processed, or welcomes new players at the door.
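The loop described above can be sketched with Python's standard selectors module. This is a simplified illustration, not Nginx's event loop: run_event_loop and its stop_after_closed parameter are hypothetical (the parameter exists only so the example terminates), the upper-casing reply stands in for real request handling, and writes are sent directly rather than buffered until the socket is writable, which is acceptable only for the tiny payloads used here.

```python
import selectors
import socket


def run_event_loop(listener, stop_after_closed):
    """Serve all clients from one thread; return once N clients have closed."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listen")
    closed = 0
    while closed < stop_after_closed:
        # Step 1: wait for events on the listen socket and all connection sockets.
        for key, _mask in sel.select(timeout=1.0):
            if key.data == "listen":
                # Event on the listen socket: a client started a new game.
                conn, _addr = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="conn")
            else:
                # Event on a connection socket: the client made a move.
                conn = key.fileobj
                data = conn.recv(1024)
                if data:
                    conn.sendall(data.upper())  # reply, then go on to the next event
                else:
                    # Empty read: the client left; free the descriptor.
                    sel.unregister(conn)
                    conn.close()
                    closed += 1
    sel.close()
    return closed
```

Note that the only per-connection cost is one registered file descriptor; no thread or process is created when a client connects.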
Why is it faster than a blocking multi-process architecture?
Nginx scales very well to support tens of thousands of connections per worker process. Each new connection creates another file descriptor and consumes a small amount of additional memory in the worker process. There is very little additional overhead per connection. Nginx processes can remain pinned to CPUs. Context switches are relatively infrequent and occur only when there is no work to be done.
In the blocking, process-per-connection approach, each connection requires a large amount of additional resources and overhead, and context switches (swapping from one process to another) are very frequent.
If you would like to know more, see the article on Nginx architecture written by Andrew Alexeev, vice president of development and co-founder of NGINX, Inc.
With appropriate system tuning, Nginx can scale to handle hundreds of thousands of concurrent HTTP connections per worker process, and it can absorb traffic spikes (an influx of new games) without dropping anything.
Updating configuration and upgrading Nginx
The Nginx process architecture, with its small number of worker processes, makes it very efficient to update the configuration and even the Nginx binary itself.
Updating the Nginx configuration is a simple, lightweight, and reliable operation. Run the nginx -s reload command, which checks the configuration on disk and sends a SIGHUP signal to the master process.
When the master process receives the SIGHUP signal, it does two things:
1. It reloads the configuration and forks a new set of worker processes. These new worker processes immediately begin accepting connections and processing traffic (using the new configuration).
2. It signals the old worker processes to shut down gracefully. The old workers stop accepting new connections. As soon as each in-flight HTTP request completes, the worker cleanly closes that connection. Once all of its connections are closed, the worker process exits.
This reload process can cause a small spike in CPU and memory usage, but the spike is generally negligible compared to the resource load from active connections. You can reload the configuration multiple times per second. In rare cases, problems arise when many generations of worker processes are waiting for connections to close, but even these are resolved quickly.
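The reload sequence can be modeled with a short in-memory simulation. This sketch uses no real processes or signals; Master, Worker, and their methods are hypothetical names that only mirror the two steps above, so that the ordering (new generation starts first, old generation drains and exits later) is easy to see.

```python
class Worker:
    """Stand-in for one worker process (no real process is created)."""

    def __init__(self, config):
        self.config = config
        self.active = 0        # connections currently being served
        self.accepting = True  # False once told to shut down gracefully
        self.exited = False

    def accept(self):
        if not self.accepting:
            raise RuntimeError("a draining worker does not accept connections")
        self.active += 1

    def finish_request(self):
        self.active -= 1
        self._maybe_exit()

    def drain(self):
        self.accepting = False
        self._maybe_exit()

    def _maybe_exit(self):
        # A draining worker exits as soon as its last connection is closed.
        if not self.accepting and self.active == 0:
            self.exited = True


class Master:
    """Stand-in for the master process."""

    def __init__(self, config, num_workers):
        self.num_workers = num_workers
        self.workers = [Worker(config) for _ in range(num_workers)]
        self.draining = []  # older generations still finishing requests

    def reload(self, new_config):
        old = self.workers
        # Step 1: start a new generation of workers with the new configuration.
        self.workers = [Worker(new_config) for _ in range(self.num_workers)]
        # Step 2: tell the old generation to stop accepting and finish up.
        for worker in old:
            worker.drain()
        self.draining.extend(old)

    def reap(self):
        # Forget old workers that have already exited.
        self.draining = [w for w in self.draining if not w.exited]
```

In this model, several reloads in quick succession simply leave several old generations in the draining list until their last connections close.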
The Nginx binary upgrade process is even more impressive: you can upgrade Nginx itself on the fly, without any dropped connections, downtime, or interruption in service.
The binary upgrade process is similar to a configuration reload. A new Nginx master process runs in parallel with the original master process, and the two share the listen sockets. Both processes are active, and their respective worker processes handle their own traffic. You can then signal the old master process and its workers to shut down gracefully.
The whole process is described in more detail in Controlling NGINX.
Conclusion
The Inside NGINX infographic gives a high-level overview of how Nginx works, but behind this simple explanation are more than 10 years of innovation and optimization. These innovations and optimizations enable Nginx to perform well on a wide variety of hardware, while also providing the security and reliability that modern web applications require.
If you want to read more about Nginx optimization, you can look at these great resources:
Installing and Tuning NGINX (webinar; slides at Speaker Deck)
Tuning NGINX for Performance
The Architecture of Open Source Applications: NGINX
Socket Sharding in NGINX Release 1.9.1 (using the SO_REUSEPORT socket option)