Implementation of simple thread pool in Linux



Technical Background of Thread Pool

In object-oriented programming, creating and destroying objects takes time, because creating an object requires acquiring memory or other resources. This is even more true in Java, where the virtual machine tracks every object so that it can be garbage-collected once it is no longer used. One way to improve a service program's efficiency is therefore to minimize how often objects are created and destroyed, especially resource-heavy objects. Reusing existing objects to serve many different tasks is the key problem to solve, and it is exactly why "pooled resource" techniques exist. The familiar database connection pool follows this idea, and so does the thread pool technique introduced in this article.

Some well-known companies have adopted this technology in their products: for example, IBM's WebSphere, IONA's Orbix 2000, Sun's Jini, Microsoft's MTS (Microsoft Transaction Server 2.0), COM+, and so on.

Would you like to apply this technology in your own server programs?

How does thread pool technology improve the performance of server programs?

Note that "server programs" here means any program that accepts and processes client requests, not only network servers that accept requests from network clients.

Multithreading mainly addresses running multiple threads on a processing unit. It can significantly reduce the processor's idle time and increase its throughput. However, used improperly, multithreading can increase the processing time of a single task. Here is a simple example:

Assume the total time to complete a task on the server is T, where:

T1: time to create the thread

T2: time to execute the task in the thread, including the time needed for inter-thread synchronization

T3: time to destroy the thread

Obviously T = T1 + T2 + T3. Note that this is an extremely simplified model.

T1 and T3 are the overhead of multithreading itself, so we want to reduce them and thereby reduce T. Many programs ignore this and create or destroy threads frequently, which makes T1 and T3 a considerable fraction of T. The traditional multithreaded server model works like this: when a request arrives, a new thread is created; the thread executes the task; when the task completes, the thread exits. This is the "instant creation, instant destruction" policy. Although creating a thread is much cheaper than creating a process, if the tasks submitted to threads are short and arrive frequently, the server ends up continuously creating and destroying threads. This overhead cannot be ignored, especially when the thread execution time is very short. It plays to the thread's weakness (T1, T3) rather than its strength (concurrency).

Thread pool technology focuses on shortening or rescheduling T1 and T3 to improve server performance. It moves T1 and T3 to the server program's startup and shutdown phases, or to idle periods: when the application starts, the pool immediately creates a certain number of threads and places them in an idle queue. These threads are all blocked; they occupy a little memory but no CPU. When a task arrives, the pool picks an idle thread and hands the task to it. When all threads are busy, the pool automatically creates a certain number of new threads to handle the extra tasks. When a task finishes, its thread does not exit; it goes back to waiting for the next task in the pool. When most threads are blocked, the pool automatically destroys some of them and reclaims system resources. This way, serving a client request incurs no T1 or T3 overhead.

The thread pool not only shifts when T1 and T3 happen; it also dramatically reduces the number of threads created. Consider another example:

Assume a server handles 50,000 requests a day and each request needs a separate thread. Compare the total number of threads created by a server that uses a thread pool with one that does not. With a thread pool, the number of threads is generally fixed, so the total number created never exceeds the pool's thread count or upper limit (hereafter, the pool size). Without a thread pool, the total is 50,000. A typical pool size is far smaller than 50,000, so a server using a thread pool does not waste time creating and destroying 50,000 threads, and is therefore more efficient.

Implementation of simple Thread Pool

1. A thread pool, as the name implies, creates multiple threads. The obvious function for creating a thread is pthread_create(). A thread exits with pthread_exit(); if it was not detached with pthread_detach(), its resources must be reclaimed with pthread_join(). You may also need pthread_self() to obtain the thread's own ID.

2. The threads created in step 1 do nothing at first: after initialization, they wait. Waiting is certainly not a while (1) busy loop, which would burn CPU. The natural choice is to wait on a condition variable with pthread_cond_wait(). This function does two things: it atomically releases the mutex passed as its parameter while the thread sleeps, so that producers can lock that mutex and put tasks into the task queue, and it re-acquires the mutex before returning. Releasing the lock while waiting means a waiter does not block other threads from acquiring it; naturally, the caller must hold the mutex before calling this function. The related mutex functions are pthread_mutex_init() and pthread_mutex_destroy() for initialization and destruction, pthread_mutex_lock() and pthread_mutex_unlock() for locking and unlocking, and pthread_mutex_trylock() for a non-blocking lock attempt.

3. With the two steps above, the skeleton of a thread pool is in place. Of course it cannot be used yet, because the threads that actually do the work are all waiting; note this is an ordinary wait, not the timed wait pthread_cond_timedwait(). To let a blocked thread do work, wake it with pthread_cond_signal(). This is a "one shot, one bird" function: each call wakes a single waiter, typically the first in the condition variable's wait queue (the wait queue of the pthread_cond_wait() calls mentioned above). There is also the "startle the whole flock" function, pthread_cond_broadcast(): the entire flock of waiting threads is woken at once, whether or not there is a task for each of them.

4. With the above foundation, consider the task side. The number of threads is limited and, as mentioned, fixed, so when there are more tasks than threads, queuing is inevitable. Hence a task queue: each node in the queue represents one task. The simplest model of a node is a pointer to the function that processes the task, void *(*callback_function)(void *arg), taking a pointer argument and returning a pointer; the concrete function and arguments are defined separately. Once a thread takes a task to process, the task is removed from the queue. The queue cannot hold an unlimited number of tasks, so its capacity is also set to a fixed value, slightly larger than the number of threads.

5. Dynamic thread creation: if a thread exits (for example, because a task failed), the main thread must detect this and create a replacement, to keep the pool's total thread count constant. pthread_join() could block to reclaim the child thread's resources, but then the main thread could do nothing else while blocked. Instead, the exiting child thread sends a SIGUSR1 signal to the main thread with pthread_kill(); when the main thread receives it, the handler registered with signal() or sigaction() creates a new thread.

The following lists only the core of the ThreadPool implementation; the class that encapsulates the condition variable is not listed here.
// ThreadPool design
// (Condition, a wrapper pairing a mutex with a condition variable, is not listed here.)
void *thread_routine(void *args);

class ThreadPool
{
    friend void *thread_routine(void *args);
private:
    // callback function type
    typedef void *(*callback_t)(void *);

    // a task
    struct task_t
    {
        callback_t run;   // the task's callback function
        void *args;       // argument passed to the task function
    };

public:
    ThreadPool(int _maxThreads = 36, unsigned int _waitSeconds = 2);
    ~ThreadPool();

    // interface for adding a task
    void addTask(callback_t run, void *args);

private:
    void startTask();

private:
    Condition ready;                 // signaled when a task is ready or the pool is being destroyed
    std::queue<task_t *> taskQueue;  // task queue
    unsigned int maxThreads;         // maximum number of threads the pool allows
    unsigned int counter;            // current number of threads in the pool
    unsigned int idle;               // number of idle threads in the pool
    unsigned int waitSeconds;        // seconds an idle thread waits before exiting
    bool quit;                       // pool destruction flag
};

// Thread entry function.
// Each worker is effectively a consumer that keeps consuming (executing) tasks.
void *thread_routine(void *args)
{
    // detach the thread so the main thread does not need to join it
    pthread_detach(pthread_self());
    printf("* thread 0x%lx is starting...\n", (unsigned long)pthread_self());

    ThreadPool *pool = (ThreadPool *)args;

    // wait for tasks to arrive, then execute them
    while (true)
    {
        bool timeout = false;
        pool->ready.lock();

        // while waiting, this thread counts as one more idle thread
        ++pool->idle;

        // The condition variable in pool->ready serves three purposes:
        // 1. wait for a task to arrive in the task queue
        // 2. wait for the pool-destruction notification
        // 3. allow the thread itself to be destroyed (thread exit)
        while (pool->taskQueue.empty() && pool->quit == false)
        {
            printf("thread 0x%lx is waiting...\n", (unsigned long)pthread_self());
            // wait up to waitSeconds
            if (0 != pool->ready.timedwait(pool->waitSeconds))
            {
                // the wait timed out
                printf("thread 0x%lx wait timed out...\n", (unsigned long)pthread_self());
                timeout = true;
                // leave the wait loop; execution falls through to the first if below
                break;
            }
        }

        // the wait is over: this thread is about to execute a task or be
        // destroyed, so there is one idle thread fewer
        --pool->idle;

        // case 3: the wait timed out (the task queue is normally empty here)
        if (timeout == true && pool->taskQueue.empty())
        {
            --pool->counter;
            // unlock, then leave the loop and destroy the thread (thread exit)
            pool->ready.unlock();
            break;
        }

        // case 2: the pool is being destroyed and all tasks are finished
        if (pool->quit == true && pool->taskQueue.empty())
        {
            --pool->counter;
            // if this was the last thread, notify the pool that
            // no threads remain
            if (pool->counter == 0)
                pool->ready.signal();
            // unlock and leave the loop
            pool->ready.unlock();
            break;
        }

        // case 1: there is a task, so execute it
        if (!pool->taskQueue.empty())
        {
            // take a task from the head of the queue
            ThreadPool::task_t *t = pool->taskQueue.front();
            pool->taskQueue.pop();
            // executing the task takes time, so unlock first: producers can
            // keep producing tasks and other consumers can keep consuming them
            pool->ready.unlock();
            // process the task
            t->run(t->args);
            delete t;
        }
    }

    // after leaving the loop, print the exit message, then destroy the thread
    printf("thread 0x%lx is exiting...\n", (unsigned long)pthread_self());
    pthread_exit(NULL);
}

// addTask
// Acts as a producer: it keeps generating tasks and appending them to the
// task queue, where the consumer threads pick them up.
void ThreadPool::addTask(callback_t run, void *args)
{
    /* 1. create a task and append it to the end of the task queue */
    task_t *newTask = new task_t{run, args};
    ready.lock();            // the mutex protects the shared task queue
    taskQueue.push(newTask);
    /* 2. let a thread start executing the task */
    startTask();
    ready.unlock();          // unlock so the task can start executing
}

void ThreadPool::startTask()
{
    // if there is a waiting thread, wake one up to execute the task
    if (idle > 0)
        ready.signal();
    // no waiting thread, and the total thread count has not reached the
    // limit yet: create a new thread
    else if (counter < maxThreads)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, thread_routine, this);
        ++counter;
    }
}

// Destructor
ThreadPool::~ThreadPool()
{
    // if it has already been called, return
    if (quit == true)
        return;
    ready.lock();
    quit = true;
    if (counter > 0)
    {
        // waiting threads receive the notification and exit directly
        if (idle > 0)
            ready.broadcast();
        // threads that are executing tasks cannot receive the notification;
        // wait for them to finish their tasks
        while (counter > 0)
            ready.wait();
    }
    ready.unlock();
}
The complete code can be found on my GitHub: https://github.com/Tachone/LinuxPorgDemo/tree/master/threadpool_C%2B%2B

Discussion on advanced Thread Pool

The simple thread pool has some problems. For example, if a large number of clients request service but the pool's worker threads are limited, the server can serve only some of them at a time, and the other clients' tasks must wait in the task queue. Some system designers may be unhappy with this because they have strict response-time requirements for the server, and may therefore doubt the feasibility of thread pool technology during system design; but thread pools have corresponding solutions. Tuning and optimizing the pool size is the problem an advanced thread pool must solve. The main approaches are:

 

Solution 1: dynamically add worker threads

Some advanced thread pools provide a dynamically changing number of worker threads to adapt to bursts of requests; when the number of requests falls, the pool gradually reduces its worker count. Worker threads can also be added in batches rather than one per incoming request, which is more efficient. This scheme should also bound the worker count above and below; otherwise this flexibility becomes a mistake or a disaster, because frequent thread creation, or a large number of threads created in a short time, defeats the very purpose of the thread pool: reducing how often threads are created.

For example, Jini's TaskManager is a sophisticated thread pool manager that dynamically adds worker threads. SQL Server uses a single-process (multithreaded) architecture with a thread pool of 1024 threads and dynamic thread allocation; the theoretical upper limit is 32767.

Solution 2: optimize the number of worker threads

If you do not want a complex policy in the pool to keep the number of worker threads matched to demand, you can size the pool statistically: count client requests, for example, the average number of tasks that must be processed per second during peak hours, and estimate a reasonable pool size from the system's capacity and the clients' demand. The pool size is genuinely hard to determine, so sometimes it is simply chosen from experience.

For example, the thread pool size in MTS is fixed to 100.

Solution 3: one server provides multiple thread pools

This solution is used in some complex system structures. In this way, different thread pools can be used for processing based on different tasks or task priorities.

For example, COM + uses multiple thread pools.

These three solutions have their own advantages and disadvantages. Different solutions or combinations of these three solutions may be used in different applications to solve actual problems.

 

Applicable scope and precautions of thread pool technology

Below are some application scenarios for thread pools that I have summarized; the list may be incomplete.

Application Scope of thread pool:

(1) A large number of threads are needed and each task completes quickly. Thread pool technology suits WEB servers handling tasks such as web page requests: each task is small but the number of tasks is huge; imagine the click volume of a popular website. For long-lived tasks, however, such as a Telnet connection request, the pool's advantage is not obvious, because the Telnet session time is far longer than the thread creation time.

(2) Applications with demanding performance requirements, such as requiring the server to quickly respond to customer requests.

(3) Applications that must accept large bursts of requests without spawning a large number of threads. Without a thread pool, a burst of client requests produces a burst of new threads; although in theory the maximum thread count of most operating systems is not the problem, creating many threads in a short time can exhaust memory and trigger an "OutOfMemory" error.

References:

UNP (Unix Network Programming)

IBM developerWorks: http://www.ibm.com/developerworks/cn/java/l-threadPool/
