The thread pool is a core feature of MySQL 5.6. Handling highly concurrent requests is a perennial topic for server applications, whether the server is a web application or a database. When a large number of requests arrive concurrently, resources are constantly created and released, which lowers resource utilization and degrades service quality. A thread pool is a generic technique: a certain number of threads are created in advance, and when a request arrives an idle thread is assigned to serve it; once the request finishes, the thread moves on to serve other requests. This avoids the frequent creation and destruction of threads and memory objects, lowers the server's concurrency level, reduces context switching and resource contention, and improves resource utilization. Thread pools in different services all pursue the same goal of using resources more efficiently and are implemented in roughly the same way. This article explains the implementation principles of the MySQL thread pool.
Before MySQL 5.6, MySQL handled connections in one-thread-per-connection mode: for each database connection, mysql-server creates a dedicated service thread and destroys it when the connection ends. One more connection means one more thread created, and one more destroyed when it closes. Under high concurrency this leads to frequent thread creation and destruction. With thread_cache, threads can be cached for reuse, which avoids the creation/destruction cost, but it does not solve the problem of a high connection count: in one-thread-per-connection mode the number of service threads grows with the number of connections, and a large number of concurrent threads means high memory consumption, more context switching (and a lower CPU cache hit rate), and more resource contention, causing the service to jitter. Compared with one-thread-per-connection, where one thread corresponds to one connection, the minimum unit of work in the thread-pool implementation is a statement, so one thread can serve multiple connections. As long as the thread pool size is set reasonably so that hardware resources are fully used, this avoids the server jitter caused by a sudden surge in the number of connections.
Scheduling method implementation
mysql-server supports three connection management modes: no-threads, one-thread-per-connection, and pool-of-threads. In no-threads mode the connection is handled by the main thread and no extra thread is created; this mode is mainly used for debugging. one-thread-per-connection is the most common mode before the thread pool appeared: a service thread is created for each connection. pool-of-threads is the thread pool mode discussed in this article. mysql-server supports the three modes through a set of function pointers; for a given mode, the function pointers are set to the corresponding callback functions. The mode is controlled by the thread_handling parameter, as the following code shows:
    if (thread_handling <= SCHEDULER_ONE_THREAD_PER_CONNECTION)
      one_thread_per_connection_scheduler(thread_scheduler,
                                          &max_connections, &connection_count);
    else if (thread_handling == SCHEDULER_NO_THREADS)
      one_thread_scheduler(thread_scheduler);
    else
      pool_of_threads_scheduler(thread_scheduler,
                                &max_connections, &connection_count);
Connection management process
- A connection request arrives on the MySQL port and is detected via poll
- After the request is detected, accept is called to create the communication socket
- The THD instance, vio object, etc. are initialized
- The scheduler function pointers of the THD instance are set according to the thread_handling mode
- The scheduler's add_connection function is called to handle the new connection
The following code shows the scheduler_functions structure and the thread pool's callback table for it; this set of function pointers is the core of every connection management mode.
    struct scheduler_functions
    {
      uint max_threads;
      uint *connection_count;
      ulong *max_connections;
      bool (*init)(void);
      bool (*init_new_connection_thread)(void);
      void (*add_connection)(THD *thd);
      void (*thd_wait_begin)(THD *thd, int wait_type);
      void (*thd_wait_end)(THD *thd);
      void (*post_kill_notification)(THD *thd);
      bool (*end_thread)(THD *thd, bool cache_thread);
      void (*end)(void);
    };
    static scheduler_functions tp_scheduler_functions =
    {
      0,                           // max_threads
      NULL,                        // connection_count
      NULL,                        // max_connections
      tp_init,                     // init
      NULL,                        // init_new_connection_thread
      tp_add_connection,           // add_connection
      tp_wait_begin,               // thd_wait_begin
      tp_wait_end,                 // thd_wait_end
      tp_post_kill_notification,   // post_kill_notification
      NULL,                        // end_thread
      tp_end                       // end
    };
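To make the dispatch concrete, here is a minimal, self-contained sketch (illustrative type and function names, not the actual server source) of how a function-pointer table like the one above lets a single accept path serve whichever connection management mode was configured:

```c
#include <stdio.h>

/* Illustrative stand-ins for the server's THD and scheduler table. */
typedef struct THD { int fd; } THD;

typedef struct scheduler_hooks {
  void (*add_connection)(THD *thd);   /* called for every accepted connection */
  void (*end)(void);                  /* called at shutdown */
} scheduler_hooks;

/* one-thread-per-connection mode: a dedicated thread would be spawned here. */
static void per_connection_add(THD *thd) {
  printf("spawn a dedicated thread for fd=%d\n", thd->fd);
}

/* pool-of-threads mode: the connection is queued into a thread group instead. */
static void pool_add(THD *thd) {
  printf("queue fd=%d into a thread group\n", thd->fd);
}

static void noop_end(void) {}

int main(void) {
  int use_pool = 1;   /* stands in for the thread_handling setting */

  /* The accept path only ever sees this table; the mode decides what it holds. */
  scheduler_hooks hooks = {
    use_pool ? pool_add : per_connection_add,
    noop_end
  };

  THD thd = { .fd = 42 };
  hooks.add_connection(&thd);   /* same call site for every mode */
  hooks.end();
  return 0;
}
```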
Related parameters of the thread pool
- thread_handling: selects the connection management mode described above (the thread pool corresponds to the pool-of-threads mode).
- thread_pool_size: the number of groups in the thread pool, typically set to the number of CPU cores on the machine. Ideally each group has exactly one active worker thread at any moment, so that the CPU cores are fully used.
- thread_pool_stall_limit: the interval at which the timer thread periodically checks whether a group has "stalled".
- thread_pool_idle_timeout: a worker thread that has been idle for longer than this exits automatically, which keeps the number of worker threads low when the load can be served by fewer threads.
- thread_pool_oversubscribe: controls how many threads are allowed to be "oversubscribed" on a CPU core; the listener thread is not counted against this value.
- thread_pool_high_prio_mode: the mode of the priority queue.
Thread pool implementation
This section describes how mysql-server manages connections with the thread pool, focusing on the implementation framework and the key interfaces, as shown in Figure 1.
Figure 1 (thread pool framework diagram)
Each green box represents a group; the number of groups is determined by the thread_pool_size parameter. Each group contains a priority queue and a normal queue, one listener thread and several worker threads; the listener thread and worker threads can convert into each other dynamically. The number of worker threads is determined by the workload and is bounded by the thread_pool_oversubscribe setting. In addition, the whole thread pool has one timer thread that monitors the groups to prevent a group from "stalling".
Key interfaces
1. tp_add_connection [processing new connections]
1) Create a connection object
2) Determine which group the connection is assigned to, using thread_id % group_count
3) Put the connection into that group's queue
4) If the group's current number of active threads is 0, create a worker thread (a minimal sketch of steps 2–4 follows)
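The sketch below uses illustrative types and a fixed group count standing in for thread_pool_size; it is not the real tp_add_connection, and locking and queue details are omitted:

```c
#include <stdio.h>

#define GROUP_COUNT 4   /* stands in for thread_pool_size */

/* Illustrative per-group state (the real structure holds queues, mutexes, ...). */
typedef struct thread_group {
  int queue_len;             /* connections waiting in this group's queue */
  int active_thread_count;   /* worker threads currently busy             */
} thread_group;

static thread_group groups[GROUP_COUNT];

/* Steps 2)-4): pick a group, queue the connection, make sure a worker exists. */
static void tp_add_connection_sketch(unsigned long thread_id) {
  thread_group *g = &groups[thread_id % GROUP_COUNT];   /* step 2: spread by thread_id */

  g->queue_len++;                                       /* step 3: enqueue the connection  */
  if (g->active_thread_count == 0)                      /* step 4: nobody can serve it yet */
    printf("group %lu: create (or wake) a worker thread\n", thread_id % GROUP_COUNT);
}

int main(void) {
  tp_add_connection_sketch(1001);   /* a new connection identified by its thread_id */
  tp_add_connection_sketch(1002);
  return 0;
}
```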
2. worker_main [worker thread]
1) Call get_event to obtain a request
2) If a request is returned, call handle_event to process it
3) Otherwise no request arrived in the queue, and the thread exits
3. get_event [obtain a request]
1) Try to take a connection request from the queue
2) If one is available, return it immediately
3) If the group has no listener thread, this thread converts itself into the listener thread and blocks waiting for events
4) Otherwise the thread is added to the head of the waiting-thread list
5) The thread sleeps for at most thread_pool_idle_timeout
6) If it is not woken up within that time, it has timed out and the thread exits
7) Otherwise a connection request has arrived in the queue; go back to step 1
Note: before fetching a connection request, the thread checks whether the current number of active threads exceeds thread_pool_oversubscribe + 1; if it does, the thread is put to sleep. A simplified sketch of the worker/get_event loop described in 2. and 3. is shown below.
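The following self-contained sketch shows only the overall shape of this loop: illustrative types, a condition variable standing in for the real wakeup mechanism, and the listener-conversion branch (step 3 of get_event) omitted for brevity.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define IDLE_TIMEOUT_SEC  1    /* stands in for thread_pool_idle_timeout  */
#define OVERSUBSCRIBE     3    /* stands in for thread_pool_oversubscribe */

typedef struct thread_group {
  pthread_mutex_t mutex;
  pthread_cond_t  cond;
  int queue_len;               /* queued requests                     */
  int active_thread_count;     /* threads currently serving requests  */
} thread_group;

/* get_event: take a request if possible, otherwise wait; false = idle timeout. */
static bool get_event(thread_group *g) {
  pthread_mutex_lock(&g->mutex);
  for (;;) {
    /* Oversubscribe check: if too many threads are already active on this
     * group, this thread goes to sleep instead of taking more work.        */
    bool oversubscribed = g->active_thread_count > OVERSUBSCRIBE + 1;

    if (!oversubscribed && g->queue_len > 0) {       /* steps 1-2 */
      g->queue_len--;
      pthread_mutex_unlock(&g->mutex);
      return true;
    }

    /* steps 4-6: sleep on the group, give up after the idle timeout */
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += IDLE_TIMEOUT_SEC;
    if (pthread_cond_timedwait(&g->cond, &g->mutex, &deadline) != 0) {
      pthread_mutex_unlock(&g->mutex);
      return false;                                  /* step 6: timed out */
    }
    /* step 7: woken up, loop back and try to take a request again */
  }
}

static void handle_event(void) { printf("process one statement\n"); }

/* worker_main: serve requests until get_event reports an idle timeout. */
static void *worker_main(void *arg) {
  thread_group *g = arg;
  while (get_event(g))         /* steps 1-2 of worker_main */
    handle_event();
  return NULL;                 /* step 3: no work arrived, the thread ends */
}

int main(void) {
  static thread_group g =
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 2, 0 };
  pthread_t worker;
  pthread_create(&worker, NULL, worker_main, &g);   /* serves the 2 queued requests   */
  pthread_join(worker, NULL);                       /* returns after the idle timeout */
  return 0;
}
```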
4. handle_event [process a request]
1) Check whether the connection has passed login authentication; if not, perform the login check
2) Bind the THD instance information to the current thread
3) Read the network packet and parse the request
4) Call the do_command function in a loop to execute the request
5) Get the socket handle of the THD instance and check whether the handle is already in the epoll interest list
6) If not, call epoll_ctl to add it (see the fragment below)
7) Done
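Steps 5)–6) can be pictured with the standard Linux epoll API. The following fragment is a generic illustration, not the server's code; it registers a client socket with the group's epoll instance so that the listener will see the connection's future requests.

```c
#include <sys/epoll.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Add a client socket to the group's epoll instance, if it is not there yet.
 * 'data.ptr' carries the per-connection object so the listener can find it
 * again when epoll_wait reports the fd as readable.                        */
static int register_with_listener(int epoll_fd, int client_fd, void *connection) {
  struct epoll_event ev;
  memset(&ev, 0, sizeof(ev));
  ev.events   = EPOLLIN;        /* wake the listener when the client sends data */
  ev.data.ptr = connection;

  if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client_fd, &ev) == -1) {
    if (errno == EEXIST)
      return 0;                 /* already registered: nothing to do */
    perror("epoll_ctl");
    return -1;
  }
  return 0;
}

int main(void) {
  int epfd = epoll_create1(0);
  if (epfd < 0)
    return 1;
  int fds[2];
  if (pipe(fds) != 0)           /* a pipe stands in for a client socket */
    return 1;
  register_with_listener(epfd, fds[0], NULL);
  return 0;
}
```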
5. listener [listener thread]
1) Call epoll_wait to listen on the sockets associated with the group, blocking until events arrive
2) When a request arrives, return from the blocking call
3) Depending on the priority of the connection, put the request into the normal queue or the priority queue
4) Check whether the queue was empty
5) If it was empty, the listener converts itself into a worker thread and handles the request directly
6) If there is no active thread in the group, wake one up
Note: epoll_wait here listens on all connected sockets in the group; the connection requests it detects are pushed into the queues, and worker threads take tasks from the queues and execute them. A simplified listener round is sketched below.
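This is a schematic fragment only (illustrative types and names; locking and error handling omitted), not the server's listener implementation:

```c
#include <sys/epoll.h>
#include <stdbool.h>

#define MAX_EVENTS 64

/* Illustrative per-connection and per-group state (not the server's types). */
typedef struct connection { bool high_prio; /* e.g. has an open transaction */ } connection;

typedef struct thread_group {
  int epoll_fd;
  int prio_queue_len;
  int normal_queue_len;
  int active_thread_count;
} thread_group;

static void wake_one_worker(thread_group *g) { (void)g; /* signal a sleeping worker  */ }
static void process_directly(connection *c)  { (void)c; /* listener acts as a worker */ }

/* One round of the listener loop described in 5. */
static void listener_round(thread_group *g) {
  struct epoll_event events[MAX_EVENTS];

  /* steps 1-2: block until at least one connection in the group has data */
  int n = epoll_wait(g->epoll_fd, events, MAX_EVENTS, -1);
  if (n <= 0)
    return;

  bool queues_were_empty = (g->prio_queue_len + g->normal_queue_len == 0);
  int  first = 0;
  if (queues_were_empty)
    first = 1;                        /* steps 4-5: keep the first event for ourselves */

  /* step 3: connections that hold locks or have an open transaction are queued
   * with high priority; the rest go to the normal queue.                        */
  for (int i = first; i < n; i++) {
    connection *c = events[i].data.ptr;
    if (c->high_prio)
      g->prio_queue_len++;
    else
      g->normal_queue_len++;
  }

  if (queues_were_empty)
    process_directly(events[0].data.ptr);   /* the listener turns into a worker          */
  else if (g->active_thread_count == 0)
    wake_one_worker(g);                     /* step 6: ensure someone drains the queue   */
}

int main(void) { (void)listener_round; return 0; }   /* fragment: not wired to real sockets */
```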
6. timer_thread [monitoring thread]
1) If the group has no listener thread and no io_event has occurred recently, wake up or create a worker thread
2) If the group has not processed any request since the last check and there are requests in the queue, the group has stalled, so wake up or create a worker thread
3) Check whether any client connection has timed out
Note: the timer thread decides whether a group has stalled by calling check_stall, and checks for client connection timeouts by calling timeout_check. A simplified stall check is sketched below.
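The stall check can be pictured as comparing a "work done" counter between two timer ticks. The following is a simplified illustration with made-up field names, not the real check_stall:

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct thread_group {
  unsigned long long dequeues;                /* requests taken from the queue so far */
  unsigned long long dequeues_at_last_check;
  int queue_len;
  bool has_listener;
  bool io_event_seen_since_last_check;
} thread_group;

static void wake_or_create_worker(thread_group *g) {
  (void)g;
  printf("wake a sleeping worker, or create one if none is sleeping\n");
}

/* Called by the timer thread every thread_pool_stall_limit interval. */
static void check_stall(thread_group *g) {
  /* No listener and no recent io_event: events may be sitting unnoticed. */
  if (!g->has_listener && !g->io_event_seen_since_last_check)
    wake_or_create_worker(g);

  /* Requests are queued but nothing was dequeued since the last tick:
   * the group is stalled, so add processing capacity.                  */
  if (g->queue_len > 0 && g->dequeues == g->dequeues_at_last_check)
    wake_or_create_worker(g);

  g->dequeues_at_last_check = g->dequeues;
  g->io_event_seen_since_last_check = false;
}

int main(void) {
  thread_group g = { .queue_len = 5 };   /* queued work, but no progress */
  check_stall(&g);
  return 0;
}
```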
7. tp_wait_begin [enter the wait state]
1) Decrease active_thread_count by 1 and increase waiting_thread_count by 1
2) Set connection->waiting = true
3) If the number of active threads is 0 and the task queue is not empty, or there is no listener thread, wake up or create a thread
8. tp_wait_end [leave the wait state]
1) Set the connection's waiting flag to false
2) Increase active_thread_count by 1 and decrease waiting_thread_count by 1
Note:
1) The threads in the waiting_threads list are idle threads, not waiting threads. An idle thread is one that can pick up a task at any time, whereas a waiting thread is one that is blocked on a lock or on an I/O operation while processing a task.
2) The main role of tp_wait_begin and tp_wait_end is to report state, that is, to keep active_thread_count and waiting_thread_count up to date. A sketch of how the two callbacks bracket a blocking operation is shown below.
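This is a simplified picture with illustrative types and names, not the server's implementation:

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct thread_group {
  int active_thread_count;
  int waiting_thread_count;
  int queue_len;
  bool has_listener;
} thread_group;

typedef struct connection {
  thread_group *group;
  bool waiting;
} connection;

static void wake_or_create_thread(thread_group *g) {
  (void)g;
  printf("wake or create a thread so the group keeps making progress\n");
}

/* The worker is about to block on a lock or on disk/network I/O. */
static void tp_wait_begin(connection *c) {
  thread_group *g = c->group;
  g->active_thread_count--;
  g->waiting_thread_count++;
  c->waiting = true;
  /* If nobody is left to serve queued requests (or to listen for new ones),
   * hand the group another thread before this one goes to sleep.           */
  if (g->active_thread_count == 0 && (g->queue_len > 0 || !g->has_listener))
    wake_or_create_thread(g);
}

/* The blocking operation finished; the worker is active again. */
static void tp_wait_end(connection *c) {
  c->waiting = false;
  c->group->active_thread_count++;
  c->group->waiting_thread_count--;
}

int main(void) {
  thread_group g = { .active_thread_count = 1, .queue_len = 2 };
  connection c = { .group = &g };
  tp_wait_begin(&c);   /* e.g. right before waiting on a row lock */
  tp_wait_end(&c);     /* right after the lock is acquired        */
  return 0;
}
```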
9. tp_init / tp_end
Call thread_group_init and thread_group_close respectively to initialize and destroy the thread pool.
Thread pool and connection pool
Connection pooling is usually implemented on the client side: the application (client) creates a number of connections in advance and uses them to serve all of its DB requests. If at some moment the number of idle connections is smaller than the number of DB requests, the extra requests have to queue and wait for an idle connection. Connection pooling reuses connections, avoids the frequent creation and release of connections, reduces the average response time of requests, and, when requests are busy, queues them to buffer the application's impact on the DB.

The thread pool is implemented on the server side: it serves DB requests with a certain number of threads, and the minimum unit of service is a statement, so one thread can serve multiple active connections, in contrast to one-connection-per-thread, where one thread serves one connection. With the thread pool, the number of service threads on the server can be kept under control, which reduces contention for system resources and the cost of thread context switches, and avoids the problems caused by a high connection count.

Connection pooling and thread pooling complement each other: connection pooling reduces connection creation and release, lowers the average response time of requests, and controls the number of DB connections used by one application, but it cannot control the total number of connections across the whole application cluster, which can still result in a high connection count; the thread pool handles the high connection count and keeps the server side providing stable service. As shown in Figure 2, each web server maintains a connection pool with 3 connections, and a connection in a connection pool does not exclusively own a worker on the db-server; it may share one with other connections. Here it is assumed that the db-server has only 3 groups, each group has only one worker, and each worker handles requests from 2 connections.
Figure 2 (connection pool and thread pool framework diagram)
Thread pool optimization
1. Scheduling deadlock resolution
The introduction of the thread pool solves the problem of high multi-threaded concurrency, but it also brings a hidden risk. Suppose transactions A and B are assigned to different groups. Transaction A has started and holds locks, but because A's group is busy, its next statement cannot be scheduled for execution immediately; transaction B depends on the locks held by A, so even though B can be scheduled, it still has to wait for the locks. This is called a scheduling deadlock. Because one group handles multiple connections at the same time, these connections are not equivalent: some connections are sending their first request, while others belong to a transaction that is already open and holds some lock resources. To reduce lock contention, the latter should obviously be scheduled before the former so that locks are released as early as possible. Therefore a priority queue is added inside each group; connections that already hold locks, or that have an open transaction, are placed into the priority queue, and worker threads take tasks from the priority queue first, as in the sketch below.
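The queue choice can be sketched as follows; this is an illustration only, and the actual rule also depends on the thread_pool_high_prio_mode setting mentioned earlier:

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct connection {
  bool in_active_transaction;   /* the connection has an open transaction   */
  bool holds_locks;             /* ...and may already hold row/table locks  */
} connection;

typedef enum { NORMAL_QUEUE, PRIORITY_QUEUE } queue_id;

/* Requests from connections that already hold resources are served first,
 * so that their locks are released as early as possible.                  */
static queue_id pick_queue(const connection *c) {
  if (c->in_active_transaction || c->holds_locks)
    return PRIORITY_QUEUE;
  return NORMAL_QUEUE;
}

/* Workers drain the priority queue before touching the normal queue. */
static const char *next_source(int prio_len, int normal_len) {
  if (prio_len > 0)   return "priority queue";
  if (normal_len > 0) return "normal queue";
  return "nothing queued";
}

int main(void) {
  connection open_trx = { .in_active_transaction = true };
  connection fresh    = { 0 };
  printf("open transaction -> %d (1 = priority)\n", pick_queue(&open_trx));
  printf("fresh connection -> %d\n", pick_queue(&fresh));
  printf("worker takes from: %s\n", next_source(1, 3));
  return 0;
}
```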
2. Large query processing
Consider a scenario in which the connections inside a group are all running large queries. The number of working threads in the group will soon reach the thread_pool_oversubscribe limit, and subsequent connection requests cannot be served in time (there are no threads left to process them); the group has stalled. As analyzed earlier, the timer thread checks for this situation periodically and creates new worker threads to process the backlog. If the long queries come from business requests, every group faces the same problem at that point, and the host may become stuck under the heavy load. In this case the thread pool itself can do nothing, because the root cause may be badly written SQL running concurrently or SQL that misses its execution plan; other measures, such as SQL throttling or SQL filtering, are needed for emergency handling. There is, however, another case: dump tasks. Many downstream systems depend on the raw data in the database and usually pull it with dump commands, and these dump tasks are usually time-consuming, so they also count as large queries. If dump tasks concentrate in one group and prevent other normal business requests from being served immediately, that is intolerable, because the database itself is not under pressure; the slow responses are caused only by the thread pool policy. To solve this problem, threads in a group that are serving dump tasks are not counted toward the thread_pool_oversubscribe limit, which avoids the situation described above.