A detailed explanation of multithreaded programming in Python

Source: Internet
Author: User

I. Introduction

Multithreaded programming lets independent parts of a program run concurrently, which can improve throughput. Dividing the work into smaller functional units also makes the code easier to reuse. In Python, the threading and Queue modules can be used to write multithreaded programs.
II. Details
1. Threads and processes
A process (sometimes called a heavyweight process) is one execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that tracks its execution. The operating system manages all the processes running on it and divides CPU time among them fairly. A process can also start other processes to perform other tasks through fork and spawn operations. However, because each process has its own memory space and data stack, processes cannot share information directly; they must use inter-process communication (IPC).
A thread (sometimes called a lightweight process) is similar to a process, except that all the threads of a program run inside the same process and share the same runtime environment. They can be thought of as "mini processes" that run in parallel within the main process, or "main thread". A thread has three parts: a start, a sequence of execution, and an end. It has its own instruction pointer, which records where it currently is in its execution. A running thread may be preempted (interrupted) or temporarily suspended (put to sleep) so that other threads can run; this is called yielding.
All threads in a process share the same data space, so sharing data and communicating is easier between threads than between processes. Threads generally execute concurrently, and this combination of concurrency and data sharing is what makes cooperation between multiple tasks possible. On a single-CPU system, truly simultaneous execution is impossible: each thread is scheduled to run for a short time and then gives up the CPU so that other threads can run. Throughout, each thread performs only its own task and shares its results with other threads as needed.
This sharing is not without danger. If several threads access the same piece of data, the result can differ depending on the order in which the accesses happen. This is called a race condition. Most thread libraries provide a set of synchronization primitives to control thread execution and data access.
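The race condition described above can be made concrete with a small sketch. It is written for Python 3 and uses threading.Lock as the synchronization primitive; the function name count_with_lock and its parameters are made up for this illustration and do not come from the original article.

```python
import threading


def count_with_lock(num_threads=4, increments=10000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = {"value": 0}      # data shared by every worker thread
    lock = threading.Lock()     # synchronization primitive

    def worker():
        for _ in range(increments):
            # "+=" is a read-modify-write sequence, not a single atomic
            # step, so the lock lets only one thread perform it at a time.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

With the lock in place, the returned value is always num_threads * increments. Without it, two threads may read the same old value and one update can be lost, although how often that happens depends on the interpreter version and on timing.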
2. Using threads
(1) Global interpreter lock (GIL)
The execution of Python code is controlled by the Python virtual machine (also called the interpreter main loop). Python was designed so that only one thread executes in this main loop at a time. Although the Python interpreter can run multiple threads, only one of them is running in the interpreter at any given moment.
Access to the Python virtual machine is controlled by the global interpreter lock (GIL), which ensures that only one thread runs at a time. In a multithreaded environment, the Python virtual machine executes as follows: a. set the GIL; b. switch to a thread to run; c. run a specified number of bytecode instructions, or until the thread voluntarily gives up control (for example by calling time.sleep(0)); d. put the thread to sleep; e. unlock the GIL; f. repeat all of the preceding steps.
When external code (such as a C/C++ extension function) is called, the GIL stays locked until that function returns, because no Python bytecode runs during that time and so no thread switch takes place. Programmers who write extensions can, however, release the GIL explicitly.
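One visible consequence of extension code releasing the GIL is that blocking calls in different threads can overlap. The sketch below (Python 3) relies on the fact that time.sleep() releases the GIL while it waits; the helper name run_sleepers is an assumption made for this illustration.

```python
import threading
import time


def run_sleepers(num_threads=3, nsec=0.3):
    """Start threads that each sleep nsec seconds; return the elapsed time."""
    threads = [threading.Thread(target=time.sleep, args=(nsec,))
               for _ in range(num_threads)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```

Because each sleeping thread gives up the GIL, the elapsed time should be close to nsec rather than num_threads * nsec. A loop of pure Python arithmetic would not overlap in this way, since it holds the GIL while it runs.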
(2) Exit the thread
When a thread finishes its work, it exits. A thread can also exit by calling an exit function such as thread.exit(), or by using one of the standard Python ways of exiting a process, such as calling sys.exit() or raising a SystemExit exception. However, you cannot directly "kill" a thread.
Using the thread module is not recommended. One obvious reason is that when the main thread exits, all other threads are terminated without any cleanup. The threading module, by contrast, ensures that all "important" child threads have exited before the process ends.
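Since a thread cannot be killed from outside, a common workaround is to ask it to stop and let it return on its own. The following Python 3 sketch uses threading.Event for the request; the names worker and run_and_stop are assumptions made for this illustration and do not appear in the original article.

```python
import threading
import time


def worker(stop_event, results):
    """Do small units of work until asked to stop, then return normally."""
    while not stop_event.is_set():
        results.append(time.time())   # stand-in for one unit of real work
        stop_event.wait(0.01)         # an interruptible sleep


def run_and_stop(run_for=0.05):
    """Run the worker briefly, request a stop, and wait for it to exit."""
    stop_event = threading.Event()
    results = []
    t = threading.Thread(target=worker, args=(stop_event, results))
    t.start()
    time.sleep(run_for)
    stop_event.set()                  # a request, not a kill
    t.join()                          # the thread ends by returning
    return t.is_alive(), len(results)
```

The thread checks the event between units of work, so it always finishes its current step and can run its own cleanup before returning.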
(3) Python threading modules
Python provides several modules for multithreaded programming, including thread, threading, and Queue. The thread and threading modules let programmers create and manage threads: thread provides basic thread and lock support, while threading provides higher-level and more powerful thread management. The Queue module lets you create a queue data structure that can be used to share data between multiple threads.
Avoid the thread module, for three reasons. First, the higher-level threading module is more advanced and has more complete thread support, and attributes in the thread module may conflict with those in threading. Second, the low-level thread module has very few synchronization primitives (in fact only one), while threading has many. Third, with the thread module, when the main thread ends all other threads are forcibly terminated without warning and without normal cleanup, whereas the threading module at least ensures that important child threads exit before the process does.
3. The thread module
In addition to spawning threads, the thread module provides a basic synchronization data structure, the lock object (also known as a primitive lock, simple lock, mutual exclusion lock, mutex, or binary semaphore). Synchronization primitives like this go hand in hand with thread management.
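The basic operations on a lock object can be sketched as follows. The example is written for Python 3, where the thread module is named _thread and threading.Lock() returns the same kind of primitive lock; the function name lock_states is made up for this illustration.

```python
import threading

lock = threading.Lock()


def lock_states():
    """Walk a primitive lock through its states and record each one."""
    states = []
    states.append(lock.locked())        # nobody holds the lock yet
    lock.acquire()                      # blocking acquire
    states.append(lock.locked())        # now held
    states.append(lock.acquire(False))  # non-blocking attempt on a held lock
    lock.release()
    states.append(lock.locked())        # free again
    return states
```

Calling lock_states() returns [False, True, False, False]: the third entry is False because a primitive lock is not reentrant, so a non-blocking acquire() fails while the lock is held, even by the same thread.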
Common thread functions and LockType lock objects:

    #!/usr/bin/env python

    import thread
    from time import sleep, ctime

    def loop0():
        print '+++start loop 0 at:', ctime()
        sleep(4)
        print '+++loop 0 done at:', ctime()

    def loop1():
        print '***start loop 1 at:', ctime()
        sleep(2)
        print '***loop 1 done at:', ctime()

    def main():
        print '------starting at:', ctime()
        thread.start_new_thread(loop0, ())
        thread.start_new_thread(loop1, ())
        sleep(6)
        print '------all DONE at:', ctime()

    if __name__ == '__main__':
        main()

The thread module provides a simple multithreading mechanism. Here the two loops run concurrently, so the total running time is that of the slowest thread (6 s for the main thread) rather than the sum of the running times of all the threads. start_new_thread() requires its first two arguments, so even if the function you want to run takes no arguments, you must pass an empty tuple.

The sleep(6) call keeps the main thread alive, because once the main thread stops running, the other two threads are shut down. A fixed sleep, however, can make the main thread exit too early or too late, so locks should be used instead: as soon as both child threads have finished, the main thread can exit.

    #!/usr/bin/env python

    import thread
    from time import sleep, ctime

    loops = [4, 2]

    def loop(nloop, nsec, lock):
        print '+++start loop:', nloop, 'at:', ctime()
        sleep(nsec)
        print '+++loop:', nloop, 'done at:', ctime()
        lock.release()

    def main():
        print '---starting threads...'
        locks = []
        nloops = range(len(loops))

        for i in nloops:
            lock = thread.allocate_lock()
            lock.acquire()
            locks.append(lock)

        for i in nloops:
            thread.start_new_thread(loop,
                (i, loops[i], locks[i]))

        for i in nloops:
            while locks[i].locked(): pass

        print '---all DONE at:', ctime()

    if __name__ == '__main__':
        main()

4. The threading module
The higher-level threading module provides not only the Thread class but also a variety of very useful synchronization mechanisms. All objects in the threading module:

The thread module does not support daemon threads: when the main thread exits, all child threads are forcibly terminated, whether or not they are still working. The threading module does support daemon threads. A daemon thread is typically a server that waits for client requests and sits idle when there are none. Marking a thread as a daemon means that the thread is not important, so the process does not need to wait for it to finish before exiting. If the main thread does not need to wait for certain child threads to finish, set their daemon flag: before calling thread.start(), call thread.setDaemon(True) to mark the thread as "unimportant". If you do want to wait for a child thread to finish before exiting, you need do nothing, or you can call thread.setDaemon(False) explicitly to make sure its daemon flag is False. You can call thread.isDaemon() to check the value of the flag. A new child thread inherits the daemon flag of its parent thread. The Python program as a whole ends only after all non-daemon threads have exited, that is, when no non-daemon thread remains in the process.
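The daemon flag can be sketched as follows in Python 3, where the daemon attribute is the preferred spelling and setDaemon()/isDaemon() are deprecated aliases. The helper names background_server, make_daemon, and inherited_flag are assumptions made for this illustration.

```python
import threading
import time


def background_server():
    """Stand-in for a server loop that would otherwise never return."""
    while True:
        time.sleep(0.1)


def make_daemon():
    """Start a daemon thread; the process will not wait for it at exit."""
    t = threading.Thread(target=background_server)
    t.daemon = True          # must be set before start()
    t.start()
    return t


def inherited_flag():
    """Create a Thread from inside a daemon thread; return its daemon flag."""
    seen = []

    def parent():
        child = threading.Thread(target=lambda: None)
        seen.append(child.daemon)   # inherited from the creating thread

    p = threading.Thread(target=parent, daemon=True)
    p.start()
    p.join()
    return seen[0]
```

The thread returned by make_daemon() never finishes on its own, yet the program can still exit, because only non-daemon threads keep the process alive. inherited_flag() returns True, which illustrates that a new thread takes its default daemon flag from the thread that creates it.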
(1) The Thread class of the threading module
The Thread class has many features that the thread module lacks. Functions of the Thread object:

Create a Thread instance and pass it a function

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [ 4, 2 ]

    def loop(nloop, nsec):
        print '+++start loop:', nloop, 'at:', ctime()
        sleep(nsec)
        print '+++loop:', nloop, 'done at:', ctime()

    def main():
        print '---starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:
            t = threading.Thread(target=loop,
                args=(i, loops[i]))
            threads.append(t)

        for i in nloops:      # start threads
            threads[i].start()

        for i in nloops:      # wait for all
            threads[i].join()    # threads to finish

        print '---all DONE at:', ctime()

    if __name__ == '__main__':
        main()

The biggest difference between instantiating Thread (calling Thread()) and calling thread.start_new_thread() is that the new thread does not start running immediately. This is a useful synchronization feature when you want to create thread objects but not run them right away: once all the threads have been created, you call start() on each of them together, rather than starting each one as it is created. In addition, you no longer need to manage a set of locks (allocating, acquiring, releasing, and checking their state); simply calling join() on each thread makes the main thread wait for that thread to finish. join() also accepts a timeout argument, in which case the main thread waits only until the timeout expires.
Another important point about join() is that it does not have to be called at all. Once a thread is started, it runs until its function finishes and returns. If the main thread has something to do other than wait for the threads to finish, it does not need to call join(); call join() only when you need to wait for a thread to finish.
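The timeout form of join() can be sketched as follows (Python 3). join() always returns None, so the caller checks is_alive() afterwards to find out whether the wait timed out. The names slow_task and wait_with_timeout are assumptions made for this illustration.

```python
import threading
import time


def slow_task(nsec):
    """Stand-in for work that takes nsec seconds."""
    time.sleep(nsec)


def wait_with_timeout(task_sec=0.3, timeout=0.05):
    """Join with a timeout first, then join without one."""
    t = threading.Thread(target=slow_task, args=(task_sec,))
    t.start()
    t.join(timeout)                # gives up waiting after `timeout` seconds
    still_running = t.is_alive()   # True if the join timed out
    t.join()                       # now wait until the thread really ends
    return still_running, t.is_alive()
```

With the default arguments the first join() times out while the task is still sleeping, so the function returns (True, False).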
Create a Thread instance and pass it a callable class object

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [ 4, 2 ]

    class ThreadFunc(object):

        def __init__(self, func, args, name=''):
            self.name = name
            self.func = func
            self.args = args

        def __call__(self):
            apply(self.func, self.args)

    def loop(nloop, nsec):
        print 'start loop', nloop, 'at:', ctime()
        sleep(nsec)
        print 'loop', nloop, 'done at:', ctime()

    def main():
        print 'starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:  # create all threads
            t = threading.Thread(target=ThreadFunc(loop, (i, loops[i]), loop.__name__))
            threads.append(t)

        for i in nloops:  # start all threads
            threads[i].start()

        for i in nloops:  # wait for completion
            threads[i].join()

        print 'all DONE at:', ctime()

    if __name__ == '__main__':
        main()

Another approach, similar to passing a function, is to pass an instance of a callable class when creating the thread; this is a more object-oriented way of doing multithreaded programming. Compared with one or more functions, a class object can draw on the full power of classes. When the new thread is created, the Thread object calls the ThreadFunc object, which invokes the special method __call__(). Because the ThreadFunc object already holds the required arguments, they do not need to be passed to the Thread() constructor. Because the arguments are stored as a tuple, they are applied with the apply() function, or equivalently with self.res = self.func(*self.args).
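The self.func(*self.args) form mentioned above is the only option in Python 3, where apply() no longer exists. A minimal Python 3 sketch of the callable-class approach follows; the add and run_callable helpers are made up for this illustration.

```python
import threading


class ThreadFunc:
    """Callable wrapper that carries a function and its argument tuple."""

    def __init__(self, func, args, name=''):
        self.name = name
        self.func = func
        self.args = args
        self.res = None

    def __call__(self):
        # Unpack the stored tuple instead of calling apply()
        self.res = self.func(*self.args)


def add(a, b):
    return a + b


def run_callable():
    """Run add(2, 3) in a thread through a ThreadFunc instance."""
    tf = ThreadFunc(add, (2, 3), add.__name__)
    t = threading.Thread(target=tf)   # no args needed: tf already has them
    t.start()
    t.join()
    return tf.res
```

Keeping a reference to the ThreadFunc instance (tf here) is what lets the caller read the result after join(); the Thread object itself does not expose the return value of its target.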
Derive a subclass from Thread and create an instance of the subclass

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [ 4, 2 ]

    class MyThread(threading.Thread):
        def __init__(self, func, args, name=''):
            threading.Thread.__init__(self)
            self.name = name
            self.func = func
            self.args = args

        def getResult(self):
            return self.res

        def run(self):
            print 'starting', self.name, 'at:', ctime()
            self.res = apply(self.func, self.args)
            print self.name, 'finished at:', ctime()

    def loop(nloop, nsec):
        print 'start loop', nloop, 'at:', ctime()
        sleep(nsec)
        print 'loop', nloop, 'done at:', ctime()

    def main():
        print 'starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:
            t = MyThread(loop, (i, loops[i]),
                loop.__name__)
            threads.append(t)

        for i in nloops:
            threads[i].start()

        for i in nloops:
            threads[i].join()

        print 'all DONE at:', ctime()

    if __name__ == '__main__':
        main()

 

The constructor of the MyThread subclass must first call the constructor of the base class, and the special method __call__() from the previous example becomes run() in the subclass. The MyThread class also prints some output for debugging. Save the code as the module myThread so that the class can be imported. Besides using apply() to run the function, the class saves the result in the instance attribute self.res and provides a new method, getResult(), to retrieve it.
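The result-saving idea can be sketched in Python 3 as shown below. This is a simplified variant of the MyThread class rather than a copy of it: the debugging output is omitted, the method is spelled get_result, and the square and squares helpers are made up for this illustration.

```python
import threading


class MyThread(threading.Thread):
    """Thread subclass that remembers the return value of its function."""

    def __init__(self, func, args, name=None):
        super().__init__(name=name)   # call the base constructor first
        self.func = func
        self.args = args
        self.res = None

    def run(self):
        # run() is what start() executes in the new thread
        self.res = self.func(*self.args)

    def get_result(self):
        return self.res


def square(x):
    return x * x


def squares(values):
    """Compute each square in its own thread and collect the results."""
    threads = [MyThread(square, (v,), name='square-%d' % v) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [t.get_result() for t in threads]
```

Results should be read only after join() has returned; before that, res may still hold its initial value of None.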
(2) Other functions in the threading module

5. The Queue module
Common attributes of the Queue module:

The Queue module supports communication between threads by letting them share data. It provides a solution to the producer-consumer problem. In this example, we create a queue into which the producer thread puts newly produced goods for the consumer thread to use. The time the producer takes to produce the goods cannot be known in advance, nor can the time the consumer takes to consume them.

    #!/usr/bin/env python

    from random import randint
    from time import sleep
    from Queue import Queue
    from myThread import MyThread

    def writeQ(queue):
        print '+++producing object for Q...',
        queue.put('xxx', 1)
        print "+++size now:", queue.qsize()

    def readQ(queue):
        val = queue.get(1)
        print '---consumed object from Q... size now', \
            queue.qsize()

    def writer(queue, loops):
        for i in range(loops):
            writeQ(queue)
            sleep(randint(1, 3))

    def reader(queue, loops):
        for i in range(loops):
            readQ(queue)
            sleep(randint(2, 5))

    funcs = [writer, reader]
    nfuncs = range(len(funcs))

    def main():
        nloops = randint(2, 5)
        q = Queue(32)

        threads = []
        for i in nfuncs:
            t = MyThread(funcs[i], (q, nloops), \
                funcs[i].__name__)
            threads.append(t)

        for i in nfuncs:
            threads[i].start()

        for i in nfuncs:
            threads[i].join()

        print '***all DONE'

    if __name__ == '__main__':
        main()

This implementation uses a Queue object and produces (and consumes) goods at random intervals. The producer and consumer run independently and concurrently, and they do not necessarily take turns (the random numbers simulate this). The writeQ() and readQ() functions put an object into the queue and consume an object from it, respectively; the string 'xxx' stands for the objects in the queue. The writer() function puts one object into the queue, waits a while, and repeats this a given number of times, a number chosen at random when the script runs. The reader() function is similar, except that it consumes objects.
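For comparison, here is a minimal producer-consumer sketch for Python 3, where the module is named queue. Instead of agreeing on a loop count in advance, it uses a sentinel value to tell the consumer to stop; that convention, and the names SENTINEL, producer, consumer, and run_pipeline, are assumptions made for this illustration rather than part of the original example.

```python
import queue
import threading

SENTINEL = None   # hypothetical end-of-stream marker; real items are never None


def producer(q, items):
    for item in items:
        q.put(item)        # blocks while the queue is full
    q.put(SENTINEL)        # tell the consumer that nothing more is coming


def consumer(q, consumed):
    while True:
        item = q.get()     # blocks until an item is available
        if item is SENTINEL:
            break
        consumed.append(item)


def run_pipeline(items):
    """Pass items from a producer thread to a consumer thread."""
    q = queue.Queue(maxsize=2)   # small bound so that put() sometimes blocks
    consumed = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, consumed)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

With one producer and one consumer the items arrive in the order they were put, because the queue is first-in, first-out. The queue does its own locking, so neither thread needs an explicit lock.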
6. Thread-related modules
Standard library modules related to multithreading:

III. Summary
(1) A program that must complete several tasks can use one thread for each task. Such a program is cleaner in design than a single-threaded program that does everything.
(2) Single-threaded programs are limited in performance. In particular, when a program has several tasks that are independent of one another and have unpredictable running times, splitting the tasks across multiple threads is faster than running them sequentially. However, because the Python interpreter executes only one thread at a time, not every program benefits from multithreading.
(3) If you find any shortcomings, please leave a comment. Thanks in advance!
