Nginx event-driven mechanism (thundering herd problem, load balancing)

Source: Internet
Author: User
Tags: epoll
Event framework processing flow

Every worker process handles events in a loop inside the ngx_worker_process_cycle method. On each iteration, that method calls ngx_process_events_and_timers, which dispatches and processes all pending events. This repeated call is the core of the event-driven mechanism. ngx_process_events_and_timers handles both ordinary network events and timer events.

In the ngx_process_events_and_timers method, the core operations are as follows:

1) Call the process_events method implemented by the event-driven module to process network events.

2) Process the events in the two posted event queues by calling ngx_event_process_posted(cycle, &ngx_posted_accept_events) and ngx_event_process_posted(cycle, &ngx_posted_events) respectively.

3) Process timer events by calling the ngx_event_expire_timers() method.

The processing flow of the ngx_process_events_and_timers method is best understood together with its source code, which is as follows:

    void
    ngx_process_events_and_timers(ngx_cycle_t *cycle)
    {
        ngx_uint_t  flags;
        ngx_msec_t  timer, delta;

        /*
         * If the timer_resolution directive is used in the configuration
         * file, ngx_timer_resolution is greater than 0, meaning the server
         * time precision should be ngx_timer_resolution milliseconds. In
         * that case timer is set to NGX_TIMER_INFINITE (-1):
         * ngx_process_events is given no timeout of its own, and the wait
         * ends when an event is ready or when the periodic timer_resolution
         * signal interrupts it. flags is initialized to 0, meaning
         * ngx_process_events takes no additional action.
         */
        if (ngx_timer_resolution) {
            timer = NGX_TIMER_INFINITE;
            flags = 0;

        } else {
            /*
             * If timer_resolution is not used, ngx_event_find_timer()
             * returns the number of milliseconds until the nearest timer
             * fires. That value is passed as timer, so that
             * ngx_process_events waits at most that long when no event
             * arrives. flags is set to NGX_UPDATE_TIME to tell
             * ngx_process_events to update the cached time.
             */
            timer = ngx_event_find_timer();
            flags = NGX_UPDATE_TIME;

    #if (NGX_THREADS)
            if (timer == NGX_TIMER_INFINITE || timer > 500) {
                timer = 500;
            }
    #endif
        }

        /*
         * ngx_use_accept_mutex indicates whether the accept lock is used to
         * solve the thundering herd problem. It is set to 1 when there is
         * more than one nginx worker process and accept_mutex is enabled in
         * the configuration file.
         */
        if (ngx_use_accept_mutex) {

            /*
             * A positive ngx_accept_disabled means this process is heavily
             * loaded and should not take new connections. nginx.conf sets
             * the maximum number of connections each worker process can
             * handle; once 7/8 of that maximum is in use,
             * ngx_accept_disabled becomes positive. The busy worker then
             * skips new connections, which is a simple form of load
             * balancing.
             */
            if (ngx_accept_disabled > 0) {
                ngx_accept_disabled--;

            } else {
                /*
                 * Try to obtain the accept lock. Only one worker can hold
                 * it, and the attempt does not block: it returns
                 * immediately either way. On success ngx_accept_mutex_held
                 * is set to 1 and the listening sockets are added to this
                 * process's epoll. On failure the listening sockets are
                 * removed from epoll.
                 */
                if (ngx_trylock_accept_mutex(cycle) == NGX_ERROR) {
                    return;
                }

                if (ngx_accept_mutex_held) {
                    /*
                     * The lock was obtained. NGX_POST_EVENTS makes
                     * ngx_process_events defer every event: accept events
                     * go to the ngx_posted_accept_events list, and
                     * EPOLLIN | EPOLLOUT events go to the ngx_posted_events
                     * list.
                     */
                    flags |= NGX_POST_EVENTS;

                } else {
                    /*
                     * The lock was not obtained. The worker must neither
                     * retry too often nor wait too long before retrying.
                     * Even with timer_resolution enabled, ngx_process_events
                     * must wait no more than ngx_accept_mutex_delay
                     * milliseconds before the next attempt. Without
                     * timer_resolution, if the nearest timer is further away
                     * than ngx_accept_mutex_delay milliseconds, timer is
                     * capped at ngx_accept_mutex_delay. Waiting longer than
                     * that with no new events would undermine the whole
                     * load balancing mechanism.
                     */
                    if (timer == NGX_TIMER_INFINITE
                        || timer > ngx_accept_mutex_delay)
                    {
                        timer = ngx_accept_mutex_delay;
                    }
                }
            }
        }

        /* measure the time consumed by ngx_process_events */
        delta = ngx_current_msec;

        /* on Linux this calls ngx_epoll_process_events */
        (void) ngx_process_events(cycle, timer, flags);

        /* time consumed by event processing */
        delta = ngx_current_msec - delta;

        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
                       "timer delta: %M", delta);

        /* if ngx_posted_accept_events is not empty, accept new connections */
        if (ngx_posted_accept_events) {
            ngx_event_process_posted(cycle, &ngx_posted_accept_events);
        }

        /* release the lock before handling the EPOLLIN / EPOLLOUT events */
        if (ngx_accept_mutex_held) {
            ngx_shmtx_unlock(&ngx_accept_mutex);
        }

        /*
         * if ngx_process_events took more than 0 ms, some timers may have
         * expired in the meantime
         */
        if (delta) {
            /* process the timer events */
            ngx_event_expire_timers();
        }

        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
                       "posted events %p", ngx_posted_events);

        /* process the remaining posted events */
        if (ngx_posted_events) {
            if (ngx_threaded) {
                ngx_wakeup_worker_thread(cycle);

            } else {
                ngx_event_process_posted(cycle, &ngx_posted_events);
            }
        }
    }
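The way ngx_process_events_and_timers settles on a wait time can be isolated into a small pure function. The following is a minimal sketch, not nginx code: msec_t, TIMER_INFINITE, choose_wait and its parameters are hypothetical stand-ins, and the thread-specific 500 ms cap is left out.

```c
#include <stdint.h>

/* Hypothetical stand-ins for nginx's ngx_msec_t and NGX_TIMER_INFINITE. */
typedef uintptr_t msec_t;
#define TIMER_INFINITE ((msec_t) -1)

/*
 * Pick the wait time for one pass of the event loop.
 *   resolution_on : nonzero if timer_resolution is configured
 *   nearest_timer : ms until the nearest timer fires, or TIMER_INFINITE
 *   lock_missed   : nonzero if this pass tried the accept lock and
 *                   another worker already holds it
 *   mutex_delay   : the accept_mutex_delay setting in ms
 */
msec_t choose_wait(int resolution_on, msec_t nearest_timer,
                   int lock_missed, msec_t mutex_delay)
{
    msec_t timer = resolution_on ? TIMER_INFINITE : nearest_timer;

    /* A worker that missed the lock must retry within mutex_delay ms. */
    if (lock_missed && (timer == TIMER_INFINITE || timer > mutex_delay)) {
        timer = mutex_delay;
    }

    return timer;
}
```

With mutex_delay set to 500, a worker that missed the lock waits 30 ms if its nearest timer is 30 ms away, and 500 ms if its nearest timer is 2000 ms away or if timer_resolution is enabled.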


Thundering herd

When establishing connections, nginx takes full advantage of multi-core CPU architectures: multiple worker processes listen on the same port. As a result, when several worker processes call accept for the same new connection, they contend with one another. This is the well-known thundering herd problem. The more worker processes there are, the more pronounced the contention becomes, and the more system performance declines.


The master process starts listening on the web port and forks multiple worker processes, all of which listen on that same port at the same time. The number of worker processes is usually set to the number of CPU cores, so that every worker process acts as a web server and the machine's cores are fully used. Suppose no user is connected to the server, and at some moment all worker processes are asleep, blocked in the system call that waits for new connections. A user then initiates a connection to the server. When the kernel receives the TCP SYN packet, it wakes all the sleeping worker processes. In the end, only the worker process that executes accept first succeeds in establishing the new connection, and accept fails in all the others. Waking those other worker processes was unnecessary: they consume resources for nothing, cause needless process switches, and increase system overhead.


Recent kernel versions of many operating systems have already solved the thundering herd problem for the event-driven mechanism. However, nginx is a highly portable web server, so it is better to solve this problem at the application layer. Since the thundering herd is caused by multiple worker processes listening on the same port at the same time, the nginx solution is simple: only one worker process may listen on the web port at any given time. A new connection event then wakes only the one worker process that is listening on the port.


How does nginx ensure that only one worker process listens on the web port at any given time? When the accept_mutex lock is enabled, a worker process starts listening on the web port only once its call to the ngx_trylock_accept_mutex method has obtained the lock.

The specific implementation of this method is as follows:

    ngx_int_t
    ngx_trylock_accept_mutex(ngx_cycle_t *cycle)
    {
        /*
         * Try to obtain accept_mutex, the inter-process synchronization
         * lock. Note: ngx_shmtx_trylock returns 1 if the lock was obtained
         * and 0 if it was not. The attempt is non-blocking: if another
         * worker process holds the lock, the call returns immediately.
         */
        if (ngx_shmtx_trylock(&ngx_accept_mutex)) {

            ngx_log_debug0(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
                           "accept mutex locked");

            /*
             * The lock was obtained. If ngx_accept_mutex_held is already 1,
             * return immediately. ngx_accept_mutex_held is a flag: when it
             * is 1, the current process already holds the lock.
             */
            if (ngx_accept_mutex_held
                && ngx_accept_events == 0
                && !(ngx_event_flags & NGX_USE_RTSIG_EVENT))
            {
                /* the ngx_accept_mutex lock is already held */
                return NGX_OK;
            }

            /*
             * add the read events of all listening connections to the
             * current epoll (or other event-driven module)
             */
            if (ngx_enable_accept_events(cycle) == NGX_ERROR) {
                /* on failure, release the ngx_accept_mutex lock */
                ngx_shmtx_unlock(&ngx_accept_mutex);
                return NGX_ERROR;
            }

            /*
             * After ngx_enable_accept_events, the event-driven module of
             * this process is listening on all ports. Set
             * ngx_accept_mutex_held to 1 so that other modules in this
             * process know that it holds the lock.
             */
            ngx_accept_events = 0;
            ngx_accept_mutex_held = 1;

            return NGX_OK;
        }

        /*
         * ngx_shmtx_trylock returned 0: the ngx_accept_mutex lock was not
         * obtained. If ngx_accept_mutex_held is still 1, this process still
         * believes it holds the lock, which is wrong and must be corrected.
         */
        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
                       "accept mutex lock failed: %ui", ngx_accept_mutex_held);

        if (ngx_accept_mutex_held) {
            /*
             * ngx_disable_accept_events removes the read events of all
             * listening connections from the event-driven module
             */
            if (ngx_disable_accept_events(cycle) == NGX_ERROR) {
                return NGX_ERROR;
            }

            /* the lock was not obtained, so ngx_accept_mutex_held must be 0 */
            ngx_accept_mutex_held = 0;
        }

        return NGX_OK;
    }

In the code above, ngx_accept_mutex is the inter-process synchronization lock (see http://blog.csdn.net/walkerkalr/article/details/38237147) and ngx_accept_mutex_held is a global variable of the current process. They are defined as follows:

ngx_shmtx_t           ngx_accept_mutex;
ngx_uint_t            ngx_accept_mutex_held;
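The key property of the accept lock is that an attempt to take it never blocks. One common way to get that behavior is an atomic compare-and-swap on a lock word, sketched below with C11 atomics. This is an illustration under stated assumptions, not nginx's implementation: try_mutex, try_mutex_trylock and try_mutex_unlock are hypothetical names, and the lock word here is an ordinary variable, whereas a lock shared between worker processes would have to live in shared memory.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means free, any other value is the owner's id. */
typedef struct {
    atomic_long owner;
} try_mutex;

/* Returns true if the caller took the lock. Never blocks or spins. */
bool try_mutex_trylock(try_mutex *m, long self)
{
    long expected = 0;

    /* Succeeds only if the word is still 0, and stores self in that case. */
    return atomic_compare_exchange_strong(&m->owner, &expected, self);
}

/* Releases the lock only if the caller is the current owner. */
bool try_mutex_unlock(try_mutex *m, long self)
{
    long expected = self;

    return atomic_compare_exchange_strong(&m->owner, &expected, 0L);
}
```

A caller that loses the race gets false back at once and can go on handling its existing connections, which is the behavior the event loop relies on.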

Therefore, after ngx_trylock_accept_mutex is called, a process that did not obtain the lock can only process events on its existing connections when it calls process_events. A process that did obtain the lock has its epoll (or other event-driven module) monitoring new connection events on the web port, so its call to process_events handles both events on existing connections and new connection events. But when is the ngx_accept_mutex lock released? If it were released only after all of these events had been processed, a worker process with many active connections would spend a long time on them. The ngx_accept_mutex lock would then stay held for a long time, and the other worker processes would rarely get a chance to handle new connections.


How is the problem of holding ngx_accept_mutex for too long solved? The answer lies in the ngx_posted_accept_events queue (which stores new connection events) and the ngx_posted_events queue (which stores ordinary events). The two queues classify the events so that those in ngx_posted_accept_events are processed first. Once they are done, the ngx_accept_mutex lock is released, and only then are the events in ngx_posted_events processed. This greatly reduces the time for which the ngx_accept_mutex lock is held.
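The ordering described above (accept events, then unlock, then ordinary events) can be shown with a small simulation that records the steps in the order they happen. It is only a sketch: the event kinds, trace_t, record and run_posted are hypothetical names, and the two passes over one array stand in for the two separate queues.

```c
#include <stddef.h>

enum { EV_ACCEPT, EV_IO };                    /* hypothetical event kinds */
enum { STEP_ACCEPT, STEP_UNLOCK, STEP_IO };   /* what the trace records   */

#define MAX_STEPS 32

typedef struct {
    int    steps[MAX_STEPS];   /* steps in the order they were taken */
    size_t n;
} trace_t;

static void record(trace_t *t, int step)
{
    if (t->n < MAX_STEPS) {
        t->steps[t->n++] = step;
    }
}

/*
 * Handle one batch of ready events while holding the accept lock:
 * new connections first, then release the lock, then everything else.
 */
void run_posted(const int *ready, size_t count, trace_t *t)
{
    size_t i;

    t->n = 0;

    for (i = 0; i < count; i++) {      /* drain the accept events */
        if (ready[i] == EV_ACCEPT) {
            record(t, STEP_ACCEPT);
        }
    }

    record(t, STEP_UNLOCK);            /* other workers may take the lock now */

    for (i = 0; i < count; i++) {      /* slower read/write events come last */
        if (ready[i] == EV_IO) {
            record(t, STEP_IO);
        }
    }
}
```

However the ready events are interleaved on input, the unlock step always lands after the last accept and before the first ordinary event, so the lock is held only for the short accept phase.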


Load balancing

When multiple worker processes compete for a new connection event, only one of them ends up establishing the connection, and it then handles that connection until it is closed. If some worker processes are very eager and end up establishing and handling most of the connections, while others are unlucky and handle only a few, the multi-core CPU architecture is used poorly: worker processes are supposed to be equal, with each one ideally occupying one CPU core exclusively. Load imbalance between worker processes inevitably hurts the performance of the whole service.


As with the solution to the thundering herd problem, load balancing between worker processes works only when the accept_mutex lock is enabled. It relies on a global variable, ngx_accept_disabled, which is the key threshold of the load balancing mechanism. It is simply an integer:

ngx_int_t             ngx_accept_disabled;


This threshold value is closely related to the use of connections in the connection pool. It is assigned a value when a connection is established, as shown below:

ngx_accept_disabled = ngx_cycle->connection_n / 8  - ngx_cycle->free_connection_n;

Therefore, this threshold is negative at startup, and its absolute value is 7/8 of the total number of connections. The use of ngx_accept_disabled is very simple. While it is negative, load balancing is not triggered: the worker process tries to obtain the accept lock as usual and to handle new connections. When ngx_accept_disabled is positive, load balancing is triggered: the worker process does not handle new connection events on this pass and instead decrements ngx_accept_disabled by 1, on the assumption that after a round of event processing the relative load has dropped, so the value has to be adjusted accordingly. As shown below:

    if (ngx_accept_disabled > 0) {
        ngx_accept_disabled--;

    } else {
        /* call ngx_trylock_accept_mutex to try to obtain the accept lock */
        if (ngx_trylock_accept_mutex(cycle) == NGX_ERROR) {
            return;
        }

        /* ... */
    }

Load balancing between nginx worker processes is triggered only when the number of connections a worker process is handling reaches 7/8 of its maximum number of connections. From that point the worker process reduces its chances of taking new connections, so that other, idle worker processes get the opportunity to take more of them, which balances the load across the whole web server.
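A worked example makes the 7/8 threshold concrete. The sketch below repeats the assignment to ngx_accept_disabled on plain signed integers and pairs it with the countdown check. The function names accept_disabled and may_try_accept are hypothetical, and the figure of 1024 connections per worker is an assumed setting chosen only for the arithmetic.

```c
/* Same arithmetic as the assignment to ngx_accept_disabled, on signed integers. */
long accept_disabled(long connection_n, long free_connection_n)
{
    return connection_n / 8 - free_connection_n;
}

/*
 * One pass of the check: returns 1 if this pass may try for the accept
 * lock, or 0 if the worker sits this pass out and counts down by one.
 */
int may_try_accept(long *disabled)
{
    if (*disabled > 0) {
        (*disabled)--;
        return 0;
    }

    return 1;
}
```

With 1024 connections in total, the value is 128 - 1024 = -896 when every connection is free, reaches 0 when exactly 128 are free (7/8 in use), and is 28 when only 100 are free. In that last case the worker skips the accept lock for 28 passes before trying again.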

 
