The thread pool is a core feature of MySQL 5.6. Handling highly concurrent requests is a perennial topic for server applications, whether a web application service or a DB service. A large number of concurrent requests brings constant creation and release of resources, which lowers resource utilization and degrades service quality. A thread pool is a general technique: a fixed number of threads are created in advance, a thread is assigned to serve each request as it arrives, and once the request is finished the thread goes on to serve other requests. This avoids frequent creation and release of threads and memory objects, reduces the number of concurrently running threads on the server, cuts context switching and resource contention, and improves resource utilization. Thread pools in all services exist essentially to improve resource utilization, and they are implemented in roughly the same way. This article explains how the MySQL thread pool is implemented.
Before MySQL 5.6, MySQL handled connections in one-connection-per-thread mode: for each database connection, mysql-server created a dedicated service thread and destroyed it when the connection ended. Each new connection meant another thread to create and later destroy, so under high concurrency threads were created and released frequently. The thread cache (thread_cache) lets threads be kept for reuse, which avoids the frequent creation and release, but it does not solve the problem of a high connection count. In one-connection-per-thread mode the number of service threads grows with the number of connections, and many concurrent threads mean high memory consumption, more context switching (and a lower CPU cache hit rate), and more resource contention, all of which cause service jitter. Whereas in one-connection-per-thread mode a thread corresponds to one connection, in the thread pool implementation the smallest unit a thread processes is a statement, so one thread can serve requests from multiple connections. With the thread pool size set sensibly so that hardware resources are fully used, the server jitter caused by a sudden spike in connections can be avoided.
Scheduling method implementation
mysql-server supports three connection management methods: no-threads, one-thread-per-connection, and pool-of-threads. No-threads means connections are handled by the main thread with no additional threads created; it is mainly used for debugging. One-thread-per-connection (also written one-connection-per-thread in this article) was the most common method before the thread pool appeared and creates one service thread per connection. Pool-of-threads is the thread pool approach discussed in this article. mysql-server supports the three methods through a set of function pointers: for a given method, the pointers are set to that method's callback functions. The method is selected by the thread_handling parameter, as the following code shows:
    if (thread_handling <= SCHEDULER_ONE_THREAD_PER_CONNECTION)
      one_thread_per_connection_scheduler(thread_scheduler, &max_connections, &connection_count);
    else if (thread_handling == SCHEDULER_NO_THREADS)
      one_thread_scheduler(thread_scheduler);
    else
      pool_of_threads_scheduler(thread_scheduler, &max_connections, &connection_count);
Connection management Process
- Listen for connection requests on the MySQL port via poll
- After a connection request arrives, call accept to create the communication socket
- Initialize the THD instance, the vio object, and so on
- Initialize the scheduler function pointers of the THD instance according to the thread_handling setting
- Call the scheduler-specific add_connection function to create the new connection
The following code shows the scheduler_functions template and the thread pool's implementation of its callback functions, which are the core of the various connection management methods.
    struct scheduler_functions
    {
      uint max_threads;
      uint *connection_count;
      ulong *max_connections;
      bool (*init)(void);
      bool (*init_new_connection_thread)(void);
      void (*add_connection)(THD *thd);
      void (*thd_wait_begin)(THD *thd, int wait_type);
      void (*thd_wait_end)(THD *thd);
      void (*post_kill_notification)(THD *thd);
      bool (*end_thread)(THD *thd, bool cache_thread);
      void (*end)(void);
    };

    static scheduler_functions tp_scheduler_functions=
    {
      0,                            // max_threads
      NULL,
      NULL,
      tp_init,                      // init
      NULL,                         // init_new_connection_thread
      tp_add_connection,            // add_connection
      tp_wait_begin,                // thd_wait_begin
      tp_wait_end,                  // thd_wait_end
      tp_post_kill_notification,    // post_kill_notification
      NULL,                         // end_thread
      tp_end                        // end
    };
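To make the function-pointer mechanism concrete, here is a minimal, self-contained sketch. It is not MySQL source: `Conn`, `MiniScheduler`, `Mode`, `select_scheduler`, `on_new_connection` and the three `add_*` callbacks are hypothetical names chosen for illustration. It only shows the idea of selecting a callback table once and then calling through it without knowing which mode is active.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for THD: records which scheduler handled it.
struct Conn {
  std::string handled_by;
};

// Simplified analogue of scheduler_functions: a table of callbacks.
struct MiniScheduler {
  void (*add_connection)(Conn *conn);
};

// Illustrative connection management modes.
enum Mode { ONE_THREAD_PER_CONNECTION, NO_THREADS, POOL_OF_THREADS };

static void add_per_thread(Conn *c) { c->handled_by = "one-thread-per-connection"; }
static void add_no_threads(Conn *c) { c->handled_by = "no-threads"; }
static void add_pool(Conn *c) { c->handled_by = "pool-of-threads"; }

// Pick the callback table once, based on the configured mode.
inline MiniScheduler select_scheduler(Mode mode) {
  switch (mode) {
    case ONE_THREAD_PER_CONNECTION: return MiniScheduler{add_per_thread};
    case NO_THREADS: return MiniScheduler{add_no_threads};
    default: return MiniScheduler{add_pool};
  }
}

// The accept path calls through the pointer and never inspects the mode.
inline void on_new_connection(const MiniScheduler &s, Conn *conn) {
  assert(s.add_connection != nullptr);
  s.add_connection(conn);
}
```

The design benefit is that the code accepting connections stays identical for all three modes; only the table it was handed differs.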
Related parameters of the thread pool
- thread_handling: Selects the connection handling model, that is, whether the thread pool is used.
- thread_pool_size: The number of thread groups in the pool, typically set to the number of CPU cores. Ideally each group has one active worker thread, so that the CPU is fully used.
- thread_pool_stall_limit: The interval at which the timer thread checks whether a group has stalled.
- thread_pool_idle_timeout: A worker that stays idle for this long exits automatically, which keeps the number of worker threads in the pool low while requests are still being served.
- thread_pool_oversubscribe: Controls how many threads may be oversubscribed on a CPU core. The value does not include the listener thread.
- thread_pool_high_prio_mode: The mode of the priority queue.
Thread Pool Implementation
Having described how mysql-server manages connections, this section focuses on the implementation framework of the thread pool and its key interfaces, as shown in Figure 1.
Figure 1 (thread pool framework diagram)
Each green box represents a group; the number of groups is determined by the thread_pool_size parameter. Each group contains a priority queue and a normal queue, together with one listener thread and several worker threads. Listener and worker threads can be converted into each other dynamically. The number of worker threads is determined by the workload and is also limited by the thread_pool_oversubscribe setting. In addition, the whole thread pool has one timer thread that monitors the groups to prevent them from stalling.
Key interfaces
1. tp_add_connection[processing new connections]
1) Create a Connection object
2) Determine which group the connection is assigned to, using thread_id % group_count
3) Place the connection in the corresponding group queue
4) If the current number of active threads is 0, create a worker thread
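The four steps above can be sketched as follows. This is an illustration under assumptions, not the MySQL implementation: `Connection`, `Group`, `add_connection` and the `workers_created` counter (which stands in for actually spawning a thread) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical connection record; only the id matters for this sketch.
struct Connection {
  unsigned long thread_id;
};

// Hypothetical group: a queue of pending connections plus thread counters.
struct Group {
  std::deque<Connection> queue;
  int active_thread_count = 0;
  int workers_created = 0;  // stands in for actually creating a worker thread
};

// Pick the group by modulo, enqueue the connection, and start a worker
// if the group currently has no active thread. Returns the group index.
inline std::size_t add_connection(std::vector<Group> &groups, Connection conn) {
  assert(!groups.empty());
  const std::size_t idx = conn.thread_id % groups.size();
  Group &g = groups[idx];
  g.queue.push_back(conn);
  if (g.active_thread_count == 0) {
    ++g.workers_created;
    ++g.active_thread_count;
  }
  return idx;
}
```

With four groups, connections 10 and 14 both land in group 2, and only the first of them causes a worker to be created.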
2. worker_main[worker thread]
1) Call get_event to get a request
2) If a request is present, call handle_event to process it
3) Otherwise, no request is left in the queue, so the thread exits
3. get_event[Get Request]
1) Try to get a connection request from the queue
2) If one is present, return it immediately
3) If the group has no listener, this thread becomes the listener thread and blocks waiting
4) If a listener already exists, add this thread to the head of the waiting queue
5) Sleep for the specified time (thread_pool_idle_timeout)
6) If the thread has not been woken up when the timeout expires, it exits
7) Otherwise, a connection request has arrived in the queue; go back to step 1
Note: Before getting a connection request, the thread checks whether the current number of active threads exceeds thread_pool_oversubscribe+1; if it does, the thread goes to sleep.
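The decision order in get_event can be summarized as a small pure function. This is only a sketch of the steps and the note above; `Action`, `GroupState` and `next_action` are hypothetical names, and the real function also performs the blocking waits that are reduced to return values here.

```cpp
#include <cassert>
#include <cstddef>

// What a worker thread should do next inside get_event.
enum class Action { Throttle, TakeRequest, BecomeListener, IdleWait };

// Hypothetical snapshot of the group's state.
struct GroupState {
  std::size_t queued_requests;
  bool has_listener;
  int active_thread_count;
};

inline Action next_action(const GroupState &g, int oversubscribe) {
  assert(oversubscribe >= 0);
  // Too many active threads already: go to sleep instead of taking work.
  if (g.active_thread_count > oversubscribe + 1) return Action::Throttle;
  // A request is queued: return it immediately.
  if (g.queued_requests > 0) return Action::TakeRequest;
  // Nothing queued and nobody is polling: this thread becomes the listener.
  if (!g.has_listener) return Action::BecomeListener;
  // Otherwise wait up to thread_pool_idle_timeout, then exit if not woken.
  return Action::IdleWait;
}
```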
4. handle_event[Processing Request]
1) Check whether the connection has passed login authentication; if not, perform login authentication
2) Associate the THD instance information
3) Read the network packet and parse the request
4) Call the do_command function to process requests in a loop
5) Get the socket handle of the THD instance and check whether it is in the epoll listening list
6) If not, call epoll_ctl to associate it
7) End
5. listener[Listener Thread]
1) Call epoll_wait to listen on the sockets associated with the group, blocking until an event arrives
2) When a request arrives, return from the blocking call
3) Put the connection into the normal queue or the priority queue, depending on its priority
4) Check whether the task queue is empty
5) If the queue is empty, the listener converts itself into a worker thread
6) If the group has no active thread, wake up a thread
Note: epoll_wait listens on all connected sockets in the group. The listener pushes the connection requests it hears into the queue, and worker threads take tasks from the queue and execute them.
6. timer_thread[Monitoring Thread]
1) If the group has no listener thread and there have been no io_event events recently, wake up or create a worker thread
2) If the group has not processed any request recently and there are requests in the queue, the group has stalled, so wake up or create a thread
3) Check whether any connection has timed out
Note: The timer thread calls check_stall to determine whether a group has stalled, and calls timeout_check to check whether client connections have timed out.
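The stall check can be sketched as below. The function is named `check_stall` after the one mentioned in the note, but the body is an illustration derived from the listed steps, and the `GroupStats` fields are assumed names rather than the actual structure members.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-group counters sampled by the timer thread.
struct GroupStats {
  bool has_listener;
  int io_event_count;     // events picked up by the listener since the last check
  int queue_event_count;  // requests dequeued by workers since the last check
  std::size_t queued;     // requests still waiting in the queue
};

// Returns true when the timer should wake up or create a worker thread,
// and resets the counters so the next check covers a fresh interval.
inline bool check_stall(GroupStats &g) {
  assert(g.io_event_count >= 0 && g.queue_event_count >= 0);
  const bool nobody_listening = !g.has_listener && g.io_event_count == 0;
  const bool stalled = g.queue_event_count == 0 && g.queued > 0;
  g.io_event_count = 0;
  g.queue_event_count = 0;
  return nobody_listening || stalled;
}
```

Because the counters are reset on every call, "recently" means "since the previous timer tick", whose period is thread_pool_stall_limit.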
7. tp_wait_begin[Enter the wait state]
1) Decrement active_thread_count and increment waiting_thread_count
2) Set connection->waiting = true
3) If the number of active threads is 0 and either the task queue is not empty or there is no listener thread, wake up or create a thread
8. tp_wait_end[Leave the wait state]
1) Set connection->waiting to false
2) Increment active_thread_count and decrement waiting_thread_count
Note:
1) The threads in the waiting_threads list are idle threads, not waiting threads. An idle thread is one that can take a task at any time, whereas a waiting thread is one that is blocked on a lock or an I/O operation while processing a task.
2) The main role of tp_wait_begin and tp_wait_end is to report status, so that active_thread_count and waiting_thread_count are updated promptly.
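The bookkeeping done by the two callbacks can be sketched as follows. `Group`, `Connection`, `wait_begin` and `wait_end` are simplified, hypothetical stand-ins, and `wake_or_create_calls` merely counts the points where a thread would be woken or created.

```cpp
#include <cassert>
#include <cstddef>

struct Group {
  int active_thread_count = 0;
  int waiting_thread_count = 0;
  std::size_t queued = 0;
  bool has_listener = true;
  int wake_or_create_calls = 0;  // stands in for waking or creating a thread
};

struct Connection {
  bool waiting = false;
};

// Called just before a worker blocks on a lock or on I/O.
inline void wait_begin(Group &g, Connection &c) {
  assert(g.active_thread_count > 0);
  --g.active_thread_count;
  ++g.waiting_thread_count;
  c.waiting = true;
  // Nobody is left to serve the group: bring in another thread.
  if (g.active_thread_count == 0 && (g.queued > 0 || !g.has_listener))
    ++g.wake_or_create_calls;
}

// Called when the blocking operation has finished.
inline void wait_end(Group &g, Connection &c) {
  c.waiting = false;
  ++g.active_thread_count;
  --g.waiting_thread_count;
}
```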
9. tp_init/tp_end
These call thread_group_init and thread_group_close respectively to initialize and destroy the thread pool.
Thread pool and connection pool
Connection pooling is typically implemented on the client side: the application (client) creates a number of connections in advance and uses them to serve all of its DB requests. If at some point there are fewer idle connections than DB requests, the requests are queued until a connection becomes free. Connection pooling reuses connections, avoids creating and releasing them frequently, and reduces the average response time of requests; when the application is busy, the queueing also cushions its impact on the DB. The thread pool is implemented on the server side: a fixed number of threads serve DB requests, and the smallest unit the thread pool serves is a statement, so one thread can correspond to multiple active connections, in contrast to one-connection-per-thread, where a thread serves a single connection. The thread pool keeps the number of server-side service threads within bounds, reduces contention for system resources and the cost of thread context switching, and avoids the high-concurrency problems caused by a high connection count. Connection pooling and thread pooling complement each other. Connection pooling reduces the creation and release of connections, lowers the average response time of requests, and controls the number of DB connections of one application, but it cannot control the total number of connections across the whole application cluster, which can therefore be high; the thread pool copes well with that high connection count and ensures that the server can provide stable service. As shown in Figure 2, each web server maintains a connection pool of 3 connections, and a connection in the pool does not own a db-server worker exclusively but may share it with other connections.
This assumes that db-server has only 3 groups, each group has only one worker, and each worker handles requests from 2 connections.
Figure 2 (connection pool and thread pool framework diagram)
Thread pool Optimization
1. Scheduling Deadlock resolution
The thread pool solves the problem of high concurrency with many threads, but it also introduces a hidden danger. Suppose transactions A and B are assigned to different groups. Transaction A has started and holds a lock, but because A's group is busy, A's next statement cannot be scheduled immediately after the previous one finishes. Transaction B depends on A releasing the lock; B can be scheduled, but it cannot acquire the lock and still has to wait. This is called a scheduling deadlock. A group handles multiple connections at the same time, but those connections are not equal: some are sending their first request, while others belong to a transaction that is already open and holds some lock resources. To reduce lock contention, the latter should clearly be served before the former so that locks are released as early as possible. Therefore a priority queue can be added within the group: connections that already hold a lock or have an open transaction are placed in the priority queue, and worker threads take tasks from the priority queue first.
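A minimal sketch of the two-queue idea follows. `Request`, `GroupQueues`, `enqueue` and `dequeue` are hypothetical names, and the single `in_transaction` flag is an assumed simplification of "holds a lock or has an open transaction".

```cpp
#include <cassert>
#include <deque>

struct Request {
  int conn_id;
  bool in_transaction;  // assumed priority signal: open transaction or held lock
};

struct GroupQueues {
  std::deque<Request> high_prio;
  std::deque<Request> normal;
};

// Route the request to the priority queue if its connection is mid-transaction.
inline void enqueue(GroupQueues &q, const Request &r) {
  (r.in_transaction ? q.high_prio : q.normal).push_back(r);
}

// Pops the next request into *out, always draining the priority queue first.
// Returns false when both queues are empty.
inline bool dequeue(GroupQueues &q, Request *out) {
  assert(out != nullptr);
  std::deque<Request> &src = !q.high_prio.empty() ? q.high_prio : q.normal;
  if (src.empty()) return false;
  *out = src.front();
  src.pop_front();
  return true;
}
```

A request from a connection that is mid-transaction is served before earlier arrivals in the normal queue, which is what lets it release its locks sooner.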
2. Large query processing
Suppose all connections in a group are running large queries. The number of worker threads in the group soon reaches the thread_pool_oversubscribe limit, and subsequent connection requests are not answered in time because no more threads are available to process them; the group has stalled. As analyzed earlier, the timer thread checks for this periodically and creates a new worker thread to process the requests. If the long queries come from business requests, every group faces the same problem and the host may be overwhelmed by the load. The thread pool itself is powerless here, because the root cause may be a burst of bad SQL or SQL running with a poor execution plan; other measures, such as SQL-level rate limiting or SQL filtering, can be used as an emergency response. There is another case, however: dump tasks. Many downstream systems depend on the raw data of the database and usually pull it with a dump command. A dump task typically takes a long time, so it can also be regarded as a large query. If dump tasks are concentrated in one group and keep other normal business requests from being answered promptly, that is intolerable: the database is not under pressure, and requests are slow only because of the thread pool policy. To solve the problem, we do not count the threads in a group that are handling dump tasks toward the thread_pool_oversubscribe total.
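The dump exemption can be illustrated with a small counting function. This is a sketch under assumptions: `Worker`, `counted_active` and `over_limit` are hypothetical, and the threshold of thread_pool_oversubscribe+1 is taken from the get_event note earlier in the article rather than from the patch itself.

```cpp
#include <cassert>
#include <vector>

struct Worker {
  bool active;
  bool running_dump;  // true while the thread serves a dump request
};

// Count active workers, skipping those that are busy with a dump task.
inline int counted_active(const std::vector<Worker> &workers) {
  int n = 0;
  for (const Worker &w : workers)
    if (w.active && !w.running_dump) ++n;
  return n;
}

// True when the group may not start more work under the oversubscribe limit.
inline bool over_limit(const std::vector<Worker> &workers, int oversubscribe) {
  assert(oversubscribe >= 0);
  return counted_active(workers) > oversubscribe + 1;
}
```

With three dump threads and one ordinary worker, the group still counts as having a single active thread, so ordinary requests are not throttled by the dumps.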
One-connection-per-thread
Following the scheduler_functions template, we can also list the key functions of the one-connection-per-thread method.
    static scheduler_functions con_per_functions=
    {
      max_connection+1,                      // max_threads
      NULL,
      NULL,
      NULL,                                  // init
      init_new_connection_handler_thread,    // init_new_connection_thread
      create_thread_to_handle_connection,    // add_connection
      NULL,                                  // thd_wait_begin
      NULL,                                  // thd_wait_end
      NULL,                                  // post_kill_notification
      one_thread_per_connection_end,         // end_thread
      NULL                                   // end
    };
1. init_new_connection_handler_thread
This interface is simple: it mainly calls pthread_detach to put the thread in the detached state, so that all of its resources are released automatically when the thread ends.
2. create_thread_to_handle_connection
This interface handles a new connection. Whereas the thread pool takes a thread from the group given by thread_id % group_size, the one-connection-per-thread method checks whether a thread in thread_cache can be used and creates a new thread if not. The logic is as follows:
(1). Check whether the cached threads are used up (compare blocked_pthread_count with wake_pthread)
(2). If a cached thread is available, add the THD to the waiting_thd_list queue and wake up a thread waiting on COND_thread_cache
(3). If not, create a new thread to handle the connection; its entry function is do_handle_one_connection
(4). Call add_global_thread to add the THD to the global THD array
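The reuse-or-create decision in steps (1) to (3) can be sketched with plain counters. The field names follow the variables mentioned above, but `ThreadCache`, `dispatch` and `threads_created` are hypothetical, THDs are reduced to integer ids, and the condition-variable signalling is left out.

```cpp
#include <cassert>
#include <deque>

struct ThreadCache {
  unsigned blocked_pthread_count = 0;  // threads parked in the cache
  unsigned wake_pthread = 0;           // parked threads already promised a THD
  std::deque<int> waiting_thd_list;    // THD ids handed to cached threads
  unsigned threads_created = 0;        // stands in for creating a new thread
};

// Returns true if a cached thread takes the connection,
// false if a new thread has to be created for it.
inline bool dispatch(ThreadCache &cache, int thd_id) {
  assert(cache.wake_pthread <= cache.blocked_pthread_count);
  if (cache.blocked_pthread_count > cache.wake_pthread) {
    cache.waiting_thd_list.push_back(thd_id);
    ++cache.wake_pthread;  // per step (2), a waiting thread would be woken here
    return true;
  }
  ++cache.threads_created;
  return false;
}
```

Comparing the two counters, rather than testing only whether any thread is parked, keeps two new connections from being promised to the same cached thread.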
3. do_handle_one_connection
This interface is called by create_thread_to_handle_connection and is the main interface for handling requests.
(1). Call do_command in a loop to read network packets from the socket, parse them, and execute them
(2). Exit the loop when the remote client sends a command that closes the connection (such as COM_QUIT or COM_SHUTDOWN)
(3). Call close_connection to close the connection (thd->disconnect())
(4). Call the one_thread_per_connection_end function to determine whether the thread can be reused
(5). Depending on the result, either exit the worker thread or continue the loop and execute commands
4. one_thread_per_connection_end
This is the main function that determines whether the thread can be reused (thread_cache). Its logic is as follows:
(1). Call remove_global_thread to remove the thread's THD instance
(2). Call block_until_new_connection to determine whether the thread can be reused
(3). Check whether the number of cached threads exceeds the threshold; if not, blocked_pthread_count++
(4). Block waiting on the condition variable COND_thread_cache
(5). After being woken up, the thread has a new THD to serve: remove the THD from waiting_thd_list and use it to initialize thd->thread_stack
(6). Call add_global_thread to add the THD to the global THD array
(7). Return false if the thread can be reused, otherwise return true
Thread pool and Epoll
Before the thread pool was introduced, the server layer had only one listening thread, which listened for requests on the MySQL port and the local Unix socket. Each new connection was given its own thread for processing, so the listening thread's job was light, and MySQL implemented I/O multiplexing with poll or select. After the thread pool is introduced, in addition to the server-layer listening thread, each group has a listener thread that listens for requests on all connection sockets in the group; worker threads do not listen and only process requests. For a thread pool configured with an oversubscribe value of 1000, each listener thread needs to listen on 1000 sockets, and the listener thread uses epoll to do so.
select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing monitors multiple fds (descriptors, such as sockets) through one mechanism and notifies the program to read or write once an fd is ready (read-ready or write-ready). epoll improves greatly on select and poll. First, epoll registers fds through the epoll_ctl function; at registration the fd is copied into the kernel once and never needs to be copied again, whereas every call to poll or select copies the fd set from user space to kernel space (epoll simply waits in epoll_wait). Second, epoll attaches a callback to each descriptor; when the device becomes ready, the callback adds the descriptor to the ready list and wakes the waiter, so there is no need to scan all descriptors as select and poll do. Finally, select supports only 1024 fds by default, while epoll has no such limit; the upper bound can be checked with cat /proc/sys/fs/file-max. epoll is used throughout the thread pool, so below I follow its lifecycle of creation, use, and destruction to describe how the thread pool uses it.
- When the thread pool is initialized, the epoll file descriptor is created with the epoll_create function; the implementing function is thread_group_init
- After the port listening thread hears a request, it creates the socket and the THD and connection objects, and places the connection in the corresponding group queue
- When a worker thread gets the connection object, it performs login authentication if the connection has not logged in yet
- If the socket is not yet registered with epoll, call epoll_ctl with EPOLL_CTL_ADD to register it, putting the connection object in the epoll_event structure
- For a request on an existing connection, epoll_ctl must still be called to register, this time with EPOLL_CTL_MOD
- The listener thread in the group calls epoll_wait to listen on the registered fds; epoll is synchronous I/O, so the call blocks
- When a request arrives, get the connection from the epoll_event structure and put it into the group's queue
- When the thread pool is destroyed, call thread_group_close to close epoll
Note:
1. In epoll, when a request is ready, the corresponding event is placed in the events array and the event type registered for that fd is cleared. Therefore, for a request on an existing connection, epoll_ctl(pollfd, EPOLL_CTL_MOD, fd, &ev) still needs to be called to register again.
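The register-then-re-register pattern in the note can be sketched as follows (Linux only). The behavior described, where an fd stops reporting after one event until it is modified again, matches epoll's EPOLLONESHOT flag; the sketch assumes that flag is in use, which the text above does not state explicitly. `arm_read` and `wait_one` are hypothetical helpers, not thread pool functions.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Ask for one read notification on fd. Use EPOLL_CTL_ADD the first time
// and EPOLL_CTL_MOD for every later request on the same fd.
// Returns 0 on success, -1 on error (as epoll_ctl does).
inline int arm_read(int epfd, int fd, void *data, bool already_registered) {
  assert(epfd >= 0 && fd >= 0);
  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLONESHOT;
  ev.data.ptr = data;  // e.g. the connection object, returned by epoll_wait
  return epoll_ctl(epfd, already_registered ? EPOLL_CTL_MOD : EPOLL_CTL_ADD,
                   fd, &ev);
}

// Wait up to timeout_ms for one event; returns its user pointer or nullptr.
inline void *wait_one(int epfd, int timeout_ms) {
  epoll_event ev{};
  const int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  return n == 1 ? ev.data.ptr : nullptr;
}
```

After `wait_one` has reported an fd once, further data on that fd is not reported until `arm_read` is called again with `already_registered` set to true.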
Thread pool function call relationship
(1) Create epoll
tp_init->thread_group_init->tp_set_threadpool_size->io_poll_create->epoll_create
(2) Close epoll
tp_end->thread_group_close->thread_group_destroy->close(pollfd)
(3) Associate the socket descriptor
handle_event->start_io->io_poll_associate_fd->io_poll_start_read->epoll_ctl
(4) Process connection requests
handle_event->threadpool_process_request->do_command->dispatch_command->mysql_parse->mysql_execute_command
(5) When the worker thread is idle
worker_main->get_event->pthread_cond_timedwait
The thread exits after waiting for thread_pool_idle_timeout.
(6) Monitor epoll
worker_main->get_event->listener->io_poll_wait->epoll_wait
(7) Port listening thread
main->mysqld_main->handle_connections_sockets->poll
One-connection-per-thread function call relationship
(1) Worker thread waits for a request
handle_one_connection->do_handle_one_connection->do_command->
my_net_read->net_read_packet->net_read_packet_header->net_read_raw_loop->
vio_read->vio_socket_io_wait->vio_io_wait->poll
Note: Worker threads in the thread pool have a listener thread to listen for requests on their behalf, whereas in the one-connection-per-thread method an idle worker thread calls poll and blocks waiting for network packets.
Thread pool worker threads only need to concentrate on processing requests, so they are used more fully.
(2) Port listening thread
Same as (7) for the thread pool