MySQL (7): MySQL thread pool summary (1)

Source: Internet
Author: User
The thread pool is a core feature of MySQL 5.6. For server applications, whether a web application service or a DB service, handling high-concurrency requests is a perennial topic. A large number of concurrent requests forces resources to be created and released continuously, which lowers resource utilization and degrades service quality. A thread pool is a common remedy: a certain number of threads are created in advance, a thread from the pool serves each request as it arrives, and once the request ends the thread goes on to serve other requests. This avoids frequent creation and release of threads and memory objects, lowers the server's concurrency level, reduces context switching and resource contention, and improves resource utilization. Every service's thread pool essentially exists to improve resource utilization, and the implementations are broadly similar. This article describes how the MySQL thread pool is implemented.

Before MySQL 5.6, MySQL handled connections in One-Thread-Per-Connection mode: for each database connection, Mysql-Server creates a dedicated thread to serve it and destroys the thread after the request ends. Another connection request means another thread is created and then destroyed. Under high concurrency this causes frequent thread creation and release. With thread-cache, threads can be cached for reuse, which avoids frequent creation and release, but it does not solve the problem of a high number of connections. In One-Thread-Per-Connection mode the number of service threads must grow with the number of connections. Many concurrent threads mean high memory consumption, more context switches (and a lower CPU cache hit rate), and more resource contention, all of which cause service jitter. In One-Thread-Per-Connection mode a thread corresponds to a connection, whereas in Thread-Pool mode the smallest unit a thread processes is a statement, so one thread can handle requests from multiple connections. As a result, with hardware resources fully utilized (that is, with a reasonably sized thread pool), server jitter caused by a sudden surge in connections can be avoided.

Scheduling implementation

Mysql-Server supports three connection management methods: No-Threads, One-Thread-Per-Connection, and Pool-Threads. No-Threads means the main thread handles connections and no additional threads are created; it is mainly used for debugging. One-Thread-Per-Connection, the most commonly used method before the thread pool appeared, creates one thread to serve each connection. Pool-Threads is the thread pool method discussed in this article. Mysql-Server supports all three through a set of function pointers: for a given method, the function pointers are set to that method's callback functions. The connection management method is selected by the thread_handling parameter. The code is as follows:

    if (thread_handling <= SCHEDULER_ONE_THREAD_PER_CONNECTION)
      one_thread_per_connection_scheduler(thread_scheduler,
                                          &max_connections,
                                          &connection_count);
    else if (thread_handling == SCHEDULER_NO_THREADS)
      one_thread_scheduler(thread_scheduler);
    else
      pool_of_threads_scheduler(thread_scheduler, &max_connections,
                                &connection_count);

The connection management flow is as follows. The server listens for connection requests on the MySQL port through poll. After a request arrives, it calls the accept interface to create a communication socket, initializes the thd instance and the vio object, sets the scheduler function pointers of the thd instance according to thread_handling, and finally calls the add_connection function specified by the scheduler to create the connection.

The following code shows the scheduler_functions template and the thread pool's implementation of its callback functions. This is the core of supporting multiple connection management methods.

    struct scheduler_functions
    {
      uint max_threads;
      uint *connection_count;
      ulong *max_connections;
      bool (*init)(void);
      bool (*init_new_connection_thread)(void);
      void (*add_connection)(THD *thd);
      void (*thd_wait_begin)(THD *thd, int wait_type);
      void (*thd_wait_end)(THD *thd);
      void (*post_kill_notification)(THD *thd);
      bool (*end_thread)(THD *thd, bool cache_thread);
      void (*end)(void);
    };

    static scheduler_functions tp_scheduler_functions =
    {
      0,                          // max_threads
      NULL,                       // connection_count
      NULL,                       // max_connections
      tp_init,                    // init
      NULL,                       // init_new_connection_thread
      tp_add_connection,          // add_connection
      tp_wait_begin,              // thd_wait_begin
      tp_wait_end,                // thd_wait_end
      tp_post_kill_notification,  // post_kill_notification
      NULL,                       // end_thread
      tp_end                      // end
    };
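
The dispatch idea behind this table can be shown with a much smaller one. The sketch below is not MySQL code: `Conn`, `MiniScheduler`, `HandlingMode`, `pick_scheduler` and the three handlers are hypothetical names made up for illustration. It only shows how a table of function pointers, filled in once according to the configured mode, lets the caller stay unaware of which connection management method is active.

```cpp
#include <string>

// Hypothetical stand-in for THD: it only records which handler served it.
struct Conn {
    std::string handled_by;
};

// Reduced analogue of scheduler_functions: one callback per operation.
struct MiniScheduler {
    void (*add_connection)(Conn *conn);
};

// Hypothetical mode enum, mirroring the three methods named in the text.
enum HandlingMode { ONE_THREAD_PER_CONNECTION, NO_THREADS, POOL_OF_THREADS };

static void per_thread_add(Conn *c) { c->handled_by = "one-thread-per-connection"; }
static void no_threads_add(Conn *c) { c->handled_by = "no-threads"; }
static void pool_add(Conn *c)       { c->handled_by = "pool-of-threads"; }

// Fill in the table once, based on the configured mode.
inline MiniScheduler pick_scheduler(HandlingMode mode) {
    switch (mode) {
        case ONE_THREAD_PER_CONNECTION: return MiniScheduler{per_thread_add};
        case NO_THREADS:                return MiniScheduler{no_threads_add};
        default:                        return MiniScheduler{pool_add};
    }
}
```

A caller would obtain the table with `pick_scheduler(mode)` and then invoke `add_connection` through it, without branching on the mode again.
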
The thread pool is controlled by the following parameters:

thread_handling: the connection handling model; it selects the thread pool.

thread_pool_size: the number of groups in the thread pool. It is generally set to the number of CPU cores. Ideally, each group has one active worker thread, so that the CPU is fully used.

thread_pool_stall_limit: the interval at which the timer thread checks whether a group is "stuck".

thread_pool_idle_timeout: a worker that stays idle for this long exits automatically, so the pool keeps only as many worker threads as the load requires.

thread_pool_oversubscribe: controls how many threads may be "oversubscribed" on a CPU core. The value does not include the listener thread.

thread_pool_high_prio_mode: the mode of the priority queue.

Thread pool implementation

The preceding section describes how Mysql-Server manages connections. This section describes the implementation framework of the thread pool and its key interfaces. The framework is shown in Figure 1.

Figure 1 (thread pool framework)

Each green box represents a group, and the number of groups is determined by the thread_pool_size parameter. Each group contains a priority queue and a common queue, plus one listener thread and several worker threads. A thread can switch dynamically between the listener and worker roles. The number of worker threads is determined by the workload and bounded by the thread_pool_oversubscribe setting. In addition, the whole thread pool has one timer thread that monitors the groups to prevent them from getting "stuck".

Key interfaces

1. tp_add_connection [process new connection]

1) create a connection object

2) determine the group to which the connection is allocated based on thread_id % group_count.

3) put the connection into the queue of the corresponding group

4) If the number of active threads in the group is 0, create a worker thread.
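
Steps 2 to 4 can be sketched as follows. This is a simplified model, not the server's implementation: `Connection`, `Group` and `add_connection` are hypothetical names, and instead of starting a real thread the sketch only counts the decision to create one.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical connection record; only the id matters for group selection.
struct Connection {
    unsigned long thread_id;
};

// Hypothetical per-group state.
struct Group {
    std::deque<Connection> queue;   // connections waiting in this group
    int active_thread_count = 0;
    int threads_created = 0;        // counts "create a worker" decisions
};

// Assigns the connection to group thread_id % group_count, queues it, and
// "creates" a worker if the group has no active thread.
// Returns the index of the chosen group.
inline std::size_t add_connection(std::vector<Group> &groups,
                                  unsigned long thread_id) {
    assert(!groups.empty());
    std::size_t idx = thread_id % groups.size();
    Group &g = groups[idx];
    g.queue.push_back(Connection{thread_id});
    if (g.active_thread_count == 0) {
        ++g.threads_created;        // a real pool would start a thread here
        ++g.active_thread_count;    // assumption: the new worker is active
    }
    return idx;
}
```

Because the group is chosen by a plain modulo, a connection always lands in the same group for its whole lifetime, which keeps its events on one epoll set.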

2. worker_main [worker thread]

1) Call get_event to obtain a request.

2) If a request exists, call handle_event to process it.

3) Otherwise, the queue holds no request and the thread exits.

3. get_event [get a request]

1) Try to get a connection request.

2) If one exists, return it immediately and finish.

3) If the group has no listener at this point, the thread becomes the listener thread and blocks waiting.

4) If a listener already exists, the thread is added to the head of the waiting queue.

5) The thread sleeps for the specified time (thread_pool_idle_timeout).

6) If the thread has not been woken up when the timeout expires, it ends and exits.

7) Otherwise, a connection request has arrived in the queue; jump to step 1.

Note: before obtaining a connection request, the thread checks whether the number of active threads exceeds thread_pool_oversubscribe + 1. If it does, the thread goes to sleep.
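
The decision a worker makes inside get_event can be written as a small pure function. This is only a sketch of the steps above: `NextStep`, `GroupState` and `get_event_step` are hypothetical names, the real function also performs the blocking and the timed sleep, and the threshold comparison follows the wording of the note rather than any verified source line.

```cpp
#include <deque>

// Hypothetical outcome of one pass through get_event.
enum class NextStep { TakeEvent, Throttle, BecomeListener, SleepIdle };

// Hypothetical snapshot of the group state a worker looks at.
struct GroupState {
    std::deque<int> queue;        // pending connection ids
    bool has_listener = false;
    int active_thread_count = 0;
};

// Decides what the calling worker does next.
inline NextStep get_event_step(const GroupState &g, int oversubscribe) {
    // Note from the text: too many active threads, so this one sleeps.
    if (g.active_thread_count > oversubscribe + 1) return NextStep::Throttle;
    // Steps 1-2: a request is queued, take it and return.
    if (!g.queue.empty()) return NextStep::TakeEvent;
    // Step 3: nobody is listening, so this thread becomes the listener.
    if (!g.has_listener) return NextStep::BecomeListener;
    // Steps 4-6: join the waiting list and sleep up to the idle timeout.
    return NextStep::SleepIdle;
}
```

A worker loop would call this repeatedly and exit only when a `SleepIdle` wait runs out without being woken up.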

4. handle_event [process a request]

1) Check whether the connection has been authenticated; if not, perform login authentication.

2) Associate the thd instance information.

3) Read the network packet and parse the request.

4) Call the do_command function to process requests in a loop.

5) Get the socket handle of the thd instance and check whether it is already in the epoll listening list.

6) If not, call epoll_ctl to register it.

7) End.

5. listener [listener thread]

1) Call epoll_wait to listen on the sockets associated with the group, blocking until an event arrives.

2) When a request arrives, the thread resumes from blocking.

3) Based on the priority of the connection, put the request into the common queue or the priority queue.

4) Check whether the task queue is empty.

5) If the queue is empty, the listener converts itself into a worker thread.

6) If the group has no active thread, wake one up.

Note: epoll_wait listens on the sockets of all connections in the group and pushes the requests it detects into the queue. Worker threads take tasks from the queue and execute them.
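
The two epoll calls mentioned in handle_event and listener can be tried in isolation. The sketch below is Linux-only and is not taken from the server source: `watch_fd` and `wait_one` are hypothetical helper names wrapping the real `epoll_ctl` and `epoll_wait` system calls, and any readable file descriptor (a pipe, for example) can stand in for a client socket.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// Registers fd with the epoll instance epfd for "readable" events, the way
// handle_event associates a connection socket with its group.
inline bool watch_fd(int epfd, int fd) {
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Waits up to timeout_ms for one ready descriptor, the way the listener
// blocks in epoll_wait. Returns the ready fd, or -1 on timeout or error.
inline int wait_one(int epfd, int timeout_ms) {
    epoll_event ev{};
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.fd : -1;
}
```

In a group, one epoll instance created with `epoll_create1` would hold every connection socket of that group, so a single listener can wait on all of them at once.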

6. timer_thread [monitoring thread]

1) If the group has no listener thread and there has been no io_event recently,

2) wake up or create a worker thread.

3) If the group has not processed any request recently and there are requests in the queue,

4) the group is stalled, so wake up or create a thread.

5) Check for connection timeouts.

Note: The timer thread checks whether a group is stalled by calling check_stall, and checks whether a client connection has timed out by calling timeout_check.
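
The stall check in steps 1 to 4 can be sketched as below. It is loosely modeled on the description above and is not the server's check_stall: `GroupStats`, its field names and `check_stall_sketch` are assumptions made for illustration, and the counters are reset on every call to stand for "since the last check".

```cpp
// Hypothetical per-group counters sampled by the timer thread.
struct GroupStats {
    bool has_listener = false;
    int io_event_count = 0;      // events seen by the listener since last check
    int queue_event_count = 0;   // requests dequeued by workers since last check
    int queued = 0;              // requests currently waiting in the queue
    bool stalled = false;
};

// Returns true if the timer thread should wake up or create a worker.
inline bool check_stall_sketch(GroupStats &g) {
    bool need_thread = false;
    // Steps 1-2: nobody is listening and no I/O event was seen recently.
    if (!g.has_listener && g.io_event_count == 0) {
        need_thread = true;
    }
    g.io_event_count = 0;
    // Steps 3-4: work is queued but nothing was dequeued since last check.
    if (g.queued > 0 && g.queue_event_count == 0) {
        g.stalled = true;
        need_thread = true;
    }
    g.queue_event_count = 0;
    return need_thread;
}
```

The timer thread would run this once per thread_pool_stall_limit interval for every group.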

7. tp_wait_begin [enter the waiting state]

1) Decrement active_thread_count by 1 and increment waiting_thread_count by 1.

2) Set connection->waiting = true.

3) If the number of active threads is 0, and the task queue is not empty or there is no listener thread,

4) wake up or create a thread.

8. tp_wait_end [leave the waiting state]

1) Set the connection's waiting status to false.

2) Increment active_thread_count by 1 and decrement waiting_thread_count by 1.

Note:

1) The threads in the waiting_threads list are idle threads, not waiting threads. Idle threads are threads that can process a task at any time, while waiting threads are blocked on a lock, an I/O operation, or similar, and cannot process tasks.

2) The main purpose of tp_wait_begin and tp_wait_end is to report state, that is, to keep active_thread_count and waiting_thread_count up to date.
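
The paired bookkeeping of tp_wait_begin and tp_wait_end maps naturally onto a scope guard. The sketch below is an illustration, not the server's code: `GroupCounters`, `ConnectionState` and `WaitGuard` are hypothetical names, it omits the wake-up logic of tp_wait_begin, and it ignores locking, which a real pool needs around these counters.

```cpp
// Hypothetical per-group counters named after the text.
struct GroupCounters {
    int active_thread_count = 0;
    int waiting_thread_count = 0;
};

// Hypothetical per-connection flag.
struct ConnectionState {
    bool waiting = false;
};

// Constructor does the tp_wait_begin bookkeeping, destructor the
// tp_wait_end bookkeeping, so the counters stay balanced on every path.
class WaitGuard {
public:
    WaitGuard(GroupCounters &g, ConnectionState &c) : g_(g), c_(c) {
        --g_.active_thread_count;
        ++g_.waiting_thread_count;
        c_.waiting = true;
    }
    ~WaitGuard() {
        c_.waiting = false;
        ++g_.active_thread_count;
        --g_.waiting_thread_count;
    }
    WaitGuard(const WaitGuard &) = delete;
    WaitGuard &operator=(const WaitGuard &) = delete;

private:
    GroupCounters &g_;
    ConnectionState &c_;
};
```

A thread about to block on a lock or on I/O would create a `WaitGuard` just before the blocking call and let it go out of scope right after.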

9. tp_init/tp_end

Call thread_group_init and thread_group_close respectively to initialize and destroy the thread pool.

Thread pool and connection pool

The connection pool is usually implemented on the client side: the application (client) creates a certain number of connections in advance and uses them to serve all of its DB requests. If at some point the number of idle connections is smaller than the number of DB requests, requests must queue and wait for an idle connection. Reusing connections through the pool avoids frequent connection creation and release, which reduces the average response time of requests, and when traffic is heavy, queuing requests buffers the application's impact on the database.

The thread pool is implemented on the server side: a certain number of threads are created to serve DB requests. Unlike one-thread-per-connection, where a thread serves a single connection, the smallest unit the thread pool serves is a statement, so one thread can correspond to multiple active connections. The thread pool keeps the number of service threads on the server within a bounded range, which reduces contention for system resources and the cost of thread context switching, and avoids the high-concurrency problems caused by a high number of connections.

The connection pool and the thread pool complement each other. The connection pool reduces connection creation and release, lowers the average request response time, and controls the number of DB connections of one application, but it cannot control the number of connections of the whole application cluster, which can therefore still be high. The thread pool copes well with a high number of connections and ensures that the server provides stable service. As shown in Figure 2, each web-server maintains a connection pool of three connections. A connection in the pool does not have exclusive use of a db-server worker; it may share the worker with other connections. Here it is assumed that db-server has only three groups, each group has only one worker, and each worker handles requests from two connections.

Figure 2 (Framework of connection pool and thread pool)
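
For contrast with the server-side thread pool, the client-side idea can be sketched in a few lines. This is a toy model under stated assumptions: `ConnectionPool` is a hypothetical class, connections are represented by plain integer ids instead of real database handles, and a caller that finds no idle connection simply gets `false` back where a real pool would queue the request.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool of pre-created connection ids.
class ConnectionPool {
public:
    // Pre-creates `size` connections with ids 0..size-1.
    explicit ConnectionPool(int size) {
        for (int id = size - 1; id >= 0; --id) idle_.push_back(id);
    }

    // Hands out an idle connection; returns false when none is idle.
    bool acquire(int &conn_id) {
        if (idle_.empty()) return false;
        conn_id = idle_.back();
        idle_.pop_back();
        return true;
    }

    // Returns a connection to the pool so it can be reused.
    void release(int conn_id) { idle_.push_back(conn_id); }

    std::size_t idle_count() const { return idle_.size(); }

private:
    std::vector<int> idle_;
};
```

With a pool of three, a fourth simultaneous request has to wait until one of the three connections is released, which is the buffering effect described above.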

Thread pool optimization

1. Solve the scheduling deadlock

The thread pool solves the problem of high concurrency across many threads, but it also brings a hidden risk. Suppose transactions A and B are assigned to different groups. Transaction A has started and holds a lock, but because A's group is busy, A is not scheduled again immediately after executing a statement. Transaction B depends on A releasing the lock: although B can be scheduled, it cannot obtain the lock and still has to wait. This is called a scheduling deadlock. A group handles multiple connections at the same time, and these connections are not equal. For example, some connections are sending their first request, while others have already opened a transaction and hold lock resources. To reduce lock contention, the latter should clearly take precedence over the former so that locks are released as soon as possible. A priority queue can therefore be added to each group: requests from connections that hold locks or have an open transaction go into the priority queue, and worker threads take tasks from the priority queue first.
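
The two-queue scheme can be sketched as follows. The types are hypothetical (`Request`, `GroupQueues`), and a single `in_transaction` flag stands in for "holds locks or has an open transaction"; how the server actually detects that state is not shown here.

```cpp
#include <deque>

// Hypothetical request: which connection sent it and whether that
// connection already has an open transaction.
struct Request {
    int conn_id;
    bool in_transaction;
};

// Hypothetical pair of queues owned by one group.
struct GroupQueues {
    std::deque<Request> high_prio;
    std::deque<Request> common;

    // Listener side: route by the connection's state.
    void enqueue(const Request &r) {
        (r.in_transaction ? high_prio : common).push_back(r);
    }

    // Worker side: always drain the priority queue first.
    // Returns false when both queues are empty.
    bool dequeue(Request &out) {
        std::deque<Request> &q = !high_prio.empty() ? high_prio : common;
        if (q.empty()) return false;
        out = q.front();
        q.pop_front();
        return true;
    }
};
```

With this ordering, a statement from a transaction that already holds locks overtakes first-time requests that arrived earlier, so the locks are released sooner.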

2. Big query processing

If the connections in a group are all running large queries, the number of worker threads in the group quickly reaches the limit set by thread_pool_oversubscribe. Subsequent connection requests are then not served promptly (no more threads are available), and the group stalls. As analyzed earlier, the timer thread periodically checks for this situation and creates a new worker thread to process requests. If the long queries come from business requests, all groups face this problem; the host may then become overloaded and hang. The thread pool itself is powerless here, because the root cause may be a burst of bad SQL or SQL that did not use the right execution plan; emergency measures such as SQL high/low watermark throttling or SQL filtering are needed instead. A different case is the dump task. Many downstream systems depend on the raw data of the database, and the data is usually pulled downstream with the dump command. A dump task usually takes a long time, so it can also be regarded as a large query. If dump tasks concentrate in one group and other normal business requests cannot be served promptly, that is intolerable: the database is under no pressure at that moment, and requests are delayed only because of the thread pool policy. To solve this, threads in the group that are processing dump tasks are not counted toward the thread_pool_oversubscribe total.
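
The accounting change can be sketched like this. The names are hypothetical (`Worker`, `counted_active`, `may_start_worker`), and the exact threshold, here "at most thread_pool_oversubscribe counted workers before another may start", is an assumption chosen to match the oversubscribe + 1 note earlier in the article.

```cpp
#include <vector>

// Hypothetical worker record.
struct Worker {
    bool active;
    bool running_dump;   // true while the worker serves a dump task
};

// Counts the active workers that are charged against the limit,
// skipping those busy with dump tasks.
inline int counted_active(const std::vector<Worker> &workers) {
    int n = 0;
    for (const Worker &w : workers) {
        if (w.active && !w.running_dump) ++n;
    }
    return n;
}

// Assumed rule: another worker may start while the counted total
// stays within oversubscribe + 1 after it starts.
inline bool may_start_worker(const std::vector<Worker> &workers,
                             int oversubscribe) {
    return counted_active(workers) <= oversubscribe;
}
```

With two dump workers and one normal worker in a group and an oversubscribe value of 1, only one worker is counted, so a normal business request can still get a new thread.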
