MySQL learning tutorial: the principle of the thread pool

Source: Internet
Author: User
Tags: connection pooling, epoll, socket

The thread pool is a core feature of MySQL 5.6. Handling highly concurrent requests is a constant concern for server applications, whether they are web application services or DB services. When a large number of requests arrive concurrently, resources are created and released continuously, which lowers resource utilization and degrades the quality of service. A thread pool is a common technique for dealing with this: a certain number of threads is created in advance; when a request arrives, the pool assigns a thread to serve it, and when the request completes, that thread goes on to serve another request. This avoids the frequent creation and release of threads and memory objects, limits the number of threads running concurrently on the server, reduces context switching and resource contention, and improves resource utilization. Every thread pool exists to improve resource utilization, and the implementations are broadly similar. This article explains how the MySQL thread pool is implemented.

Before MySQL 5.6, MySQL handled connections in one-thread-per-connection mode: for each database connection, mysql-server creates a separate thread to serve it and destroys the thread when the connection ends. Each new connection request creates another thread, which is destroyed in turn. Under high concurrency, this leads to frequent creation and release of threads. The thread cache (thread_cache) can keep threads for later reuse and so avoids the frequent creation and release, but it does not solve the problem of a high number of connections. In one-thread-per-connection mode, the server needs as many service threads as there are connections. A large number of concurrent threads means high memory consumption, more context switching (and a lower CPU cache hit ratio), and more resource contention, all of which cause service jitter. In one-thread-per-connection mode a thread corresponds to one connection; in the thread pool implementation, the smallest unit a thread processes is the statement, so one thread can serve requests from multiple connections. In this way, as long as the thread pool size is set reasonably so that the hardware is fully used, the server jitter caused by a sudden increase in the number of connections can be avoided.

Scheduling mode implementation

mysql-server supports three connection management modes: no-threads, one-thread-per-connection, and pool-of-threads. In no-threads mode, connections are handled by the main thread and no additional threads are created; this mode is mainly used for debugging. One-thread-per-connection, which creates a thread to serve each connection, was the most commonly used mode before the thread pool appeared. Pool-of-threads is the thread pool mode discussed in this article. mysql-server supports all three modes at the same time through a set of function pointers: for a given mode, the function pointers are set to that mode's callback functions. The mode is selected by the thread_handling parameter, as the following code shows:

if (thread_handling <= SCHEDULER_ONE_THREAD_PER_CONNECTION)
  one_thread_per_connection_scheduler(thread_scheduler,
                                      &max_connections,
                                      &connection_count);
else if (thread_handling == SCHEDULER_NO_THREADS)
  one_thread_scheduler(thread_scheduler);
else
  pool_of_threads_scheduler(thread_scheduler, &max_connections, &connection_count);

Connection management process

1) Listen for connection requests on the MySQL port with poll
2) When a connection request arrives, call accept to create the communication socket
3) Initialize the THD instance, the Vio object, and so on
4) Initialize the scheduler function pointers of the THD instance according to the thread_handling setting
5) Call the scheduler-specific add_connection function to hand over the new connection

The following code shows the scheduler_functions template and the callback functions that the thread pool plugs into it. This is the core of supporting multiple connection management modes.

struct scheduler_functions
{
  uint max_threads;
  uint *connection_count;
  ulong *max_connections;
  bool (*init)(void);
  bool (*init_new_connection_thread)(void);
  void (*add_connection)(THD *thd);
  void (*thd_wait_begin)(THD *thd, int wait_type);
  void (*thd_wait_end)(THD *thd);
  void (*post_kill_notification)(THD *thd);
  bool (*end_thread)(THD *thd, bool cache_thread);
  void (*end)(void);
};

static scheduler_functions tp_scheduler_functions=
{
  0,                          // max_threads
  NULL,                       // connection_count
  NULL,                       // max_connections
  tp_init,                    // init
  NULL,                       // init_new_connection_thread
  tp_add_connection,          // add_connection
  tp_wait_begin,              // thd_wait_begin
  tp_wait_end,                // thd_wait_end
  tp_post_kill_notification,  // post_kill_notification
  NULL,                       // end_thread
  tp_end                      // end
};
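
The function-pointer technique can be illustrated with a small self-contained sketch. It is not the MySQL source: `Conn`, `mini_scheduler`, `per_connection_add`, `pooled_add`, `pool_init`, `pick_scheduler`, `scheduler_init`, and `accept_connection` are hypothetical names, and the callbacks only record which path was taken.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for THD; it only records which scheduler handled it.
struct Conn {
  std::string handled_by;
};

// Reduced version of the scheduler_functions idea: a table of callbacks.
struct mini_scheduler {
  bool (*init)();                  // may be null when no setup is needed
  void (*add_connection)(Conn *);  // called for every new connection
};

static void per_connection_add(Conn *c) { c->handled_by = "one-thread-per-connection"; }
static void pooled_add(Conn *c) { c->handled_by = "pool-of-threads"; }
static bool pool_init() { return false; }  // in this sketch, false means success

static mini_scheduler per_connection_functions = {nullptr, per_connection_add};
static mini_scheduler pool_functions = {pool_init, pooled_add};

// Select a table once at startup; callers then go through the pointers only.
mini_scheduler *pick_scheduler(bool use_pool) {
  return use_pool ? &pool_functions : &per_connection_functions;
}

// Optional callbacks are skipped when the table entry is null.
bool scheduler_init(mini_scheduler *s) {
  assert(s != nullptr);
  return s->init ? s->init() : false;
}

// Generic connection path: it never needs to know which mode is active.
void accept_connection(mini_scheduler *s, Conn *c) {
  assert(s != nullptr && s->add_connection != nullptr);
  s->add_connection(c);
}
```

Calling `accept_connection(pick_scheduler(true), &conn)` routes the connection through the pool callback, while `pick_scheduler(false)` routes it through the per-connection callback, without the caller branching on the mode.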

Parameters associated with the thread pool

    • thread_handling: selects the connection management mode (the thread pool model in this article).
    • thread_pool_size: the number of groups in the thread pool, typically set to the number of CPU cores. Ideally, each group has one active worker thread, so that the CPUs are fully used.
    • thread_pool_stall_limit: the interval at which the timer thread checks whether a group is "stalled".
    • thread_pool_idle_timeout: how long a worker may stay idle before it exits automatically, so that the number of worker threads in the pool stays low once the requests have been served.
    • thread_pool_oversubscribe: controls how many threads may be "oversubscribed" on a CPU core. The value does not count the listener thread.
    • threadpool_high_prio_mode: the mode of the priority queue.

Thread Pool Implementation

The sections above described how mysql-server manages connections. This section focuses on the implementation framework of the thread pool and its key interfaces, as shown in Figure 1.

Each green box represents a group; the number of groups is determined by the thread_pool_size parameter. Each group contains a priority queue and a normal queue, and consists of one listener thread and several worker threads. Listener and worker threads can be converted into each other dynamically. The number of worker threads is determined by the workload and is also affected by the thread_pool_oversubscribe setting. In addition, the whole thread pool has one timer thread that monitors the groups to prevent them from "stalling".

Key interfaces

1. tp_add_connection [handling a new connection]

1) Create a connection object

2) Use thread_id % group_count to determine which group the connection is assigned to

3) Put the connection into the queue of that group

4) If the group's number of active threads is 0, create a worker thread
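
Steps 2 to 4 can be sketched as follows. This is a minimal model under stated assumptions, not the server code: `Group` and `add_connection` are hypothetical names, and "creating a worker thread" is represented by a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical reduced thread group: a queue of pending connection ids
// plus counters that stand in for real threads.
struct Group {
  std::deque<unsigned long> queue;
  int active_thread_count = 0;
  int threads_created = 0;
};

// Returns the index of the group that received the connection.
std::size_t add_connection(std::vector<Group> &groups, unsigned long thread_id) {
  assert(!groups.empty());
  std::size_t idx = thread_id % groups.size();  // step 2: pick the group
  Group &g = groups[idx];
  g.queue.push_back(thread_id);                 // step 3: enqueue the connection
  if (g.active_thread_count == 0) {             // step 4: nobody can serve it yet
    ++g.threads_created;                        // stands in for creating a worker
    ++g.active_thread_count;
  }
  return idx;
}
```

With 4 groups, connections with thread ids 10 and 14 both land in group 2, and only the first of them triggers the creation of a worker.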

2. worker_main [worker thread]

1) Call get_event to fetch a request

2) If there is a request, call handle_event to process it

3) Otherwise, there is no request in the queue, and the thread exits

3. get_event [fetching a request]

1) Try to get a connection request from the queue

2) If there is one, return it immediately

3) If the group has no listener, this thread becomes the listener thread and blocks waiting

4) If there is a listener, add this thread to the head of the waiting list

5) The thread sleeps for the specified time (thread_pool_idle_timeout)

6) If it has not been woken up when the time expires, the thread exits

7) Otherwise, there is a connection request in the queue; go to step 1

Note: before fetching a connection request, the thread checks whether the current number of active threads exceeds thread_pool_oversubscribe+1. If it does, the thread is put to sleep.
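
The decision a thread makes inside get_event can be sketched as a pure function. The names `Action`, `GroupState`, and `next_action` are hypothetical, and the comparison against thread_pool_oversubscribe+1 follows the wording of the note above rather than the server source.

```cpp
#include <cassert>

// Possible outcomes for a thread that enters get_event (hypothetical names).
enum class Action { TakeRequest, BecomeListener, WaitIdle, Throttle };

struct GroupState {
  int queued_requests;      // requests waiting in the group's queue
  bool has_listener;        // whether some thread is already the listener
  int active_thread_count;  // threads currently executing statements
};

Action next_action(const GroupState &g, int oversubscribe) {
  assert(oversubscribe >= 0);
  // Note above: too many active threads, so this thread is put to sleep.
  if (g.active_thread_count > oversubscribe + 1) return Action::Throttle;
  if (g.queued_requests > 0) return Action::TakeRequest;  // steps 1-2
  if (!g.has_listener) return Action::BecomeListener;     // step 3
  return Action::WaitIdle;                                // steps 4-6
}
```

A thread that gets `WaitIdle` would sleep for up to thread_pool_idle_timeout and then either exit or, if woken, evaluate the same decision again.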

4. handle_event [processing a request]

1) Check whether the connection has passed login verification; if not, perform login verification

2) Associate the THD instance information

3) Read the network packet and parse the request

4) Call the do_command function to process the request

5) Get the socket handle of the THD instance and check whether it is in the epoll listening list

6) If not, call epoll_ctl to associate it

7) End

5. listener [listener thread]

1) Call epoll_wait to monitor the sockets associated with the group, blocking while waiting

2) When a request arrives, return from the blocking call

3) Put the connection into the normal queue or the priority queue according to its priority

4) Check whether the task queue is empty

5) If the queue is empty, the listener converts itself into a worker thread

6) If there is no active thread in the group, wake up a thread

Note: epoll_wait listens on the sockets of all connections in the group. The listener pushes the connection requests it detects into the queue, and worker threads take tasks from the queue and execute them.

6. timer_thread [monitoring thread]

1) If the group has no listener thread and there has been no io_event recently,

2) wake up or create a worker thread

3) If the group has not processed any request for some time and the queue contains requests,

4) the group has stalled; wake up or create a thread

5) Check whether any connection has timed out

Note: the timer thread calls check_stall to determine whether a group is stalled, and calls timeout_check to check whether client connections have timed out.
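
The two conditions the timer thread evaluates for each group can be sketched like this. `GroupSnapshot` and `needs_extra_worker` are hypothetical names; the real check_stall works on the group's internal state, which this sketch reduces to four fields.

```cpp
#include <cassert>

// What the timer is assumed to see for one group since its previous check.
struct GroupSnapshot {
  bool has_listener;
  int io_events_since_last_check;
  int requests_dequeued_since_last_check;
  int queued_requests;
};

// Returns true when the timer should wake up or create a worker thread.
bool needs_extra_worker(const GroupSnapshot &g) {
  assert(g.queued_requests >= 0);
  // Steps 1-2: nobody is listening and no I/O event was seen recently.
  if (!g.has_listener && g.io_events_since_last_check == 0) return true;
  // Steps 3-4: work is queued but nothing was taken from the queue: a stall.
  if (g.queued_requests > 0 && g.requests_dequeued_since_last_check == 0) return true;
  return false;
}
```

The check would run once per thread_pool_stall_limit interval for every group, so a shorter interval detects stalls sooner at the cost of more frequent checks.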

7. tp_wait_begin [entering the wait state]

1) Decrement active_thread_count by 1 and increment waiting_thread_count by 1

2) Set connection->waiting = true

3) If the number of active threads is 0, and either the task queue is not empty or there is no listener thread,

4) wake up or create a thread

8. tp_wait_end [leaving the wait state]

1) Set the connection's waiting state to false

2) Increment active_thread_count by 1 and decrement waiting_thread_count by 1

Note:

1) The threads in the waiting_threads list are idle threads, not waiting threads. An idle thread is one that can handle a task at any time, whereas a waiting thread is one that cannot process tasks because it is waiting for a lock or for an I/O operation.

2) The main role of tp_wait_begin and tp_wait_end is to report state, that is, to keep active_thread_count and waiting_thread_count up to date.
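
The counter bookkeeping in the two callbacks can be sketched as follows. `GroupCounters`, `Connection`, `wait_begin`, and `wait_end` are hypothetical names, and waking or creating a thread is represented by the `wakeups` counter.

```cpp
#include <cassert>

struct GroupCounters {
  int active_thread_count = 1;
  int waiting_thread_count = 0;
  int queued_requests = 0;
  bool has_listener = true;
  int wakeups = 0;  // counts "wake up or create a thread" decisions
};

struct Connection {
  bool waiting = false;
};

// A thread is about to block on a lock or on I/O.
void wait_begin(GroupCounters &g, Connection &c) {
  assert(g.active_thread_count > 0);
  --g.active_thread_count;
  ++g.waiting_thread_count;
  c.waiting = true;
  // Keep the group moving if this was the last active thread.
  if (g.active_thread_count == 0 && (g.queued_requests > 0 || !g.has_listener))
    ++g.wakeups;
}

// The blocking operation has finished; the thread is active again.
void wait_end(GroupCounters &g, Connection &c) {
  c.waiting = false;
  ++g.active_thread_count;
  --g.waiting_thread_count;
}
```

Because the two functions mirror each other, a matched wait_begin/wait_end pair leaves both counters where they started.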

9. tp_init/tp_end

Call thread_group_init and thread_group_close respectively to initialize and destroy the thread pool.

Thread pool and connection pool

Connection pooling is usually implemented on the client side: the application (client) creates a predetermined number of connections in advance and uses them to serve all of its DB requests. If at some moment there are fewer idle connections than DB requests, the requests are queued until a connection becomes idle. Connection pooling reuses connections, avoids creating and releasing them frequently, and so reduces the average response time of requests; when the application is busy, queuing the requests also buffers the impact of the application on the DB. The thread pool is implemented on the server side: a certain number of threads is created to serve DB requests. The smallest unit served by the thread pool is the statement, so a single thread can serve multiple active connections, whereas in one-thread-per-connection mode a thread serves exactly one connection. The thread pool keeps the number of server-side service threads within bounds, reduces contention for system resources and the cost of thread context switching, and avoids the high-concurrency problems caused by a high number of connections. Connection pooling and thread pooling complement each other. Connection pooling reduces the creation and release of connections, lowers the average response time of requests, and limits the number of DB connections of one application, but it cannot limit the number of connections across the whole application cluster, which can still be high. The thread pool copes well with that high number of connections and ensures that the server side can provide a stable service. As shown in Figure 2, each web server maintains a connection pool of 3 connections. A connection in the pool does not have exclusive use of a worker on the db-server; the worker may be shared with other connections.
The figure assumes that the db-server has only 3 groups, each group has only one worker, and each worker handles requests from 2 connections.

Thread pool Optimization

1. Scheduling Deadlock resolution

The thread pool solves the problem of high concurrency with many threads, but it also introduces a hidden risk. Suppose that two transactions, A and B, are assigned to different groups. Transaction A has already started and holds a lock, but A's group is busy, so A's next statement cannot be scheduled immediately. Transaction B depends on A releasing the lock: B can be scheduled, but it cannot obtain the lock and still has to wait. This situation is called a scheduling deadlock. A group handles multiple connections at the same time, and these connections are not equivalent. For example, some connections are sending their first request, while others have already opened a transaction and hold some lock resources. To reduce lock contention, the latter should be served before the former so that the locks are released as soon as possible. Therefore, a priority queue can be added inside each group: a connection that already holds a lock, or that has already opened a transaction, is placed in the priority queue, and worker threads take tasks from the priority queue first.
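
The two-queue policy can be sketched in a few lines. `PendingConn`, `GroupQueues`, `enqueue`, and `dequeue` are hypothetical names, and "holds a lock or has opened a transaction" is reduced to a single `in_transaction` flag for the purpose of the example.

```cpp
#include <cassert>
#include <deque>

struct PendingConn {
  int id;
  bool in_transaction;  // assumed flag: already holds locks or an open transaction
};

struct GroupQueues {
  std::deque<PendingConn> high_prio;
  std::deque<PendingConn> normal;
};

// Connections inside a transaction go to the priority queue.
void enqueue(GroupQueues &q, const PendingConn &c) {
  (c.in_transaction ? q.high_prio : q.normal).push_back(c);
}

// Workers always drain the priority queue before the normal queue.
// Returns false when both queues are empty.
bool dequeue(GroupQueues &q, PendingConn *out) {
  assert(out != nullptr);
  std::deque<PendingConn> &src = q.high_prio.empty() ? q.normal : q.high_prio;
  if (src.empty()) return false;
  *out = src.front();
  src.pop_front();
  return true;
}
```

If connections 1, 2, and 3 are enqueued in that order and only connection 2 is inside a transaction, the workers receive them in the order 2, 1, 3.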

2. Large query processing

Suppose that all the connections in a group are running large queries. The number of worker threads in the group quickly reaches the value set by thread_pool_oversubscribe, and subsequent connection requests cannot be served in time because no thread is free to handle them. At this point the group has stalled. As the earlier analysis shows, the timer thread checks for this periodically and creates a new worker thread to handle the requests. If the long queries come from business requests, every group faces the same problem, the host may become overloaded, and the server may hang. The thread pool itself cannot help here, because the root cause may be bad SQL run concurrently, or SQL that does not use a good execution plan; emergency measures such as SQL throttling based on high and low watermarks, or SQL filtering, have to be used instead. There is another case, however: dump tasks. Many downstream systems depend on raw data from the database and typically pull it with a dump command. Such a dump is usually time-consuming and can be regarded as a large query. If dump tasks are concentrated in one group and prevent other normal business requests from being answered promptly, this is intolerable, because the database is not under pressure at that moment; the responses are delayed only because of the thread pool policy. We avoid this problem by excluding the threads that handle dump tasks from the group's thread_pool_oversubscribe count.
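
The dump exclusion can be sketched as a change to how active threads are counted. `WorkerInfo`, `counted_active_threads`, and `over_limit` are hypothetical names, and the limit of thread_pool_oversubscribe+1 is taken from the get_event note earlier in this article, not from the server source.

```cpp
#include <cassert>
#include <vector>

struct WorkerInfo {
  bool active;        // currently executing a statement
  bool running_dump;  // assumed flag: the statement is a dump task
};

// Counts only the active threads that are charged against the limit.
int counted_active_threads(const std::vector<WorkerInfo> &workers) {
  int n = 0;
  for (const WorkerInfo &w : workers)
    if (w.active && !w.running_dump) ++n;
  return n;
}

// True when the group may not run another thread right now.
bool over_limit(const std::vector<WorkerInfo> &workers, int oversubscribe) {
  assert(oversubscribe >= 0);
  return counted_active_threads(workers) > oversubscribe + 1;
}
```

With three dump threads and one ordinary thread in a group, only the ordinary thread is counted, so the group stays under the limit and can keep serving business requests.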

One-thread-per-connection

Using the scheduler_functions template, we can also list the key functions of the one-thread-per-connection mode.

static scheduler_functions con_per_functions=
{
  max_connection+1,                    // max_threads
  NULL,                                // connection_count
  NULL,                                // max_connections
  NULL,                                // init
  init_new_connection_handler_thread,  // init_new_connection_thread
  create_thread_to_handle_connection,  // add_connection
  NULL,                                // thd_wait_begin
  NULL,                                // thd_wait_end
  NULL,                                // post_kill_notification
  one_thread_per_connection_end,       // end_thread
  NULL                                 // end
};

1. init_new_connection_handler_thread

This interface is simple: it mainly calls pthread_detach to put the thread into the detached state, so that all of its resources are released automatically when the thread ends.

2. create_thread_to_handle_connection

This is the interface that handles a new connection. Where the thread pool takes a thread from the group selected by thread_id % group_size, the one-thread-per-connection mode checks whether the thread cache (thread_cache) has a thread that can be used, and creates a new thread if it does not. The logic is as follows:

(1). Check whether the cached threads are exhausted (compare blocked_pthread_count with wake_pthread)

(2). If a cached thread is available, add the THD to the waiting_thd_list queue and wake up a thread waiting on COND_thread_cache

(3). If not, create a new thread to handle the connection; the thread's entry function is do_handle_one_connection

(4). Call add_global_thread to add the THD to the global THD array.
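
The reuse-or-create decision in steps (1) to (3) can be sketched as below. `ThreadCache` and `dispatch_connection` are hypothetical names; the field names follow the variables mentioned in the text, and signalling the condition variable and creating a thread are reduced to counter updates.

```cpp
#include <cassert>
#include <deque>

struct ThreadCache {
  unsigned blocked_pthread_count = 0;  // threads parked in the cache
  unsigned wake_pthread = 0;           // parked threads already promised work
  std::deque<int> waiting_thd_list;    // connection ids handed to cached threads
  unsigned threads_created = 0;        // stands in for real thread creation
};

// Returns true if a cached thread was reused, false if a new one was "created".
bool dispatch_connection(ThreadCache &tc, int thd_id) {
  assert(tc.wake_pthread <= tc.blocked_pthread_count);
  if (tc.blocked_pthread_count > tc.wake_pthread) {  // step (1): a cached thread is free
    tc.waiting_thd_list.push_back(thd_id);           // step (2): hand over the THD
    ++tc.wake_pthread;  // a real server would signal the condition variable here
    return true;
  }
  ++tc.threads_created;                              // step (3): no cached thread
  return false;
}
```

Comparing the two counters, instead of testing only whether the cache is non-empty, keeps two new connections from being handed to the same parked thread before it has had a chance to wake up.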

3. do_handle_one_connection

This interface is called by create_thread_to_handle_connection and is the main interface that processes requests.

(1). Call do_command in a loop to read network packets from the socket, parse them, and execute them;

(2). When the remote client sends a command that closes the connection (such as COM_QUIT or COM_SHUTDOWN), exit the loop

(3). Call close_connection to close the connection (thd->disconnect());

(4). Call the one_thread_per_connection_end function to determine whether the thread can be reused

(5). Depending on the return value, either exit the worker thread or continue the loop to process commands.

4. one_thread_per_connection_end

This is the main function that determines whether a thread can be reused (thread_cache). Its logic is as follows:

(1). Call remove_global_thread to remove the thread's THD instance

(2). Call block_until_new_connection to determine whether the thread can be reused

(3). Check whether the number of cached threads exceeds the threshold; if not, blocked_pthread_count++;

(4). Block waiting on the condition variable COND_thread_cache

(5). After being woken up, a new THD needs this thread: remove the THD from waiting_thd_list and use it to initialize thd->thread_stack

(6). Call add_global_thread to add the THD to the global THD array.

(7). If the thread can be reused, return false; otherwise return true

Thread pool and epoll

Before the thread pool was introduced, the server layer had only one listener thread, which listened on the MySQL port and the local Unix socket and assigned a separate thread to each new connection. Because this listening task is light, MySQL implements the I/O multiplexing with poll or select. After the thread pool was introduced, in addition to the listener thread in the server layer, each group has its own listener thread that listens for requests on all connections in the group; worker threads do no listening and only process requests. With the thread pool's oversubscribe set to 1000, for example, each listener thread needs to listen on 1000 sockets, and the listener thread uses epoll to do so.

select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing lets a program monitor multiple file descriptors (fds), such as sockets; once an fd is ready (for reading or for writing), the program is notified so that it can perform the corresponding read or write. epoll is a great improvement over select and poll. First, epoll registers fds with the epoll_ctl function: each fd is copied into the kernel once at registration and never again, whereas every call to poll or select copies the whole fd set from user space to kernel space (epoll_wait does no such copying). Second, epoll attaches a callback to each descriptor; when the device becomes ready, the callback adds the descriptor to the ready list and wakes up the waiter, so there is no need to scan all descriptors as select and poll do. Finally, select supports only 1024 fds by default, while epoll has no such limit; the system-wide upper bound can be checked with cat /proc/sys/fs/file-max. epoll is used throughout the thread pool, so the following describes how the thread pool uses epoll over its lifecycle of creation, use, and destruction.

1) When the thread pool is initialized, the epoll file descriptor is created with the epoll_create function; this is done in thread_group_init;
2) After the port listening thread receives a connection request, it creates the socket, creates the THD and connection objects, and puts the connection into the queue of the corresponding group;
3) When a worker thread takes the connection object, it performs login verification if the connection has not logged in yet;
4) If the socket has not been registered with epoll, epoll_ctl is called to register it with EPOLL_CTL_ADD, and the connection object is stored in the epoll_event structure;
5) If the request comes from an existing connection, epoll_ctl still has to be called, this time with EPOLL_CTL_MOD;
6) The listener thread in the group calls epoll_wait to listen on the registered fds; epoll is a synchronous I/O mechanism, so the call blocks;
7) When a request arrives, the connection is taken from the epoll_event structure and put into the group's queue;
8) When the thread pool is destroyed, thread_group_close is called to close epoll.
Note:

1. After an fd is registered with epoll and a request becomes ready, the corresponding event is placed in the events array and the fd's event type is cleared. For this reason, epoll_ctl(pollfd, EPOLL_CTL_MOD, fd, &ev) must still be called to register again for a request on an existing connection.
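
The re-registration described in the note matches the behavior of epoll's one-shot mode, so the following Linux-only sketch assumes that fds are registered with EPOLLONESHOT; that flag is an assumption here, not something the text states. `arm_read` and `ready_count` are hypothetical helper names built on the real epoll_ctl and epoll_wait calls.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Registers (op = EPOLL_CTL_ADD) or re-arms (op = EPOLL_CTL_MOD) fd
// for a single read notification. Returns 0 on success, -1 on error.
int arm_read(int epfd, int op, int fd) {
  assert(op == EPOLL_CTL_ADD || op == EPOLL_CTL_MOD);
  struct epoll_event ev = {};
  ev.events = EPOLLIN | EPOLLONESHOT;  // one-shot: disabled after one delivery
  ev.data.fd = fd;
  return epoll_ctl(epfd, op, fd, &ev);
}

// Waits up to timeout_ms and returns the number of ready events (0 or 1 here).
int ready_count(int epfd, int timeout_ms) {
  struct epoll_event ev = {};
  return epoll_wait(epfd, &ev, 1, timeout_ms);
}
```

For example, after `arm_read(epfd, EPOLL_CTL_ADD, fd)` and one byte arriving on `fd`, the first `ready_count` call reports one event. A second call reports none even though the byte is still unread, and the event is delivered again only after `arm_read(epfd, EPOLL_CTL_MOD, fd)`.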

Thread pool function call relationships

(1) Create epoll

tp_init->thread_group_init->tp_set_threadpool_size->io_poll_create->epoll_create

(2) Close epoll

tp_end->thread_group_close->thread_group_destroy->close(pollfd)

(3) Associate the socket descriptor

handle_event->start_io->io_poll_associate_fd->io_poll_start_read->epoll_ctl

(4) Process connection requests

handle_event->threadpool_process_request->do_command->dispatch_command->mysql_parse->mysql_execute_command

(5) When a worker thread is idle

worker_main->get_event->pthread_cond_timedwait

The thread exits after waiting for thread_pool_idle_timeout.

(6) Monitor epoll

worker_main->get_event->listener->io_poll_wait->epoll_wait

(7) Port listener thread

main->mysqld_main->handle_connections_sockets->poll

One-thread-per-connection function call relationships

(1) Worker thread waits for a request

handle_one_connection->do_handle_one_connection->do_command->
my_net_read->net_read_packet->net_read_packet_header->net_read_raw_loop->
vio_read->vio_socket_io_wait->vio_io_wait->poll

Note: a thread pool worker has a listener thread that listens for requests on its behalf, whereas a one-thread-per-connection worker calls poll and blocks waiting for network packets when it is idle. Thread pool worker threads only need to concentrate on processing requests, so they are used more fully.

(2) Port listener thread

Same as (7) for the thread pool.
