A detailed explanation of multithreaded programming in Python

Source: Internet
Author: User
I. Introduction

Multithreaded programming lets parts of a program run in parallel, makes better use of processing power, and encourages splitting functionality into smaller pieces that are easier to reuse. Python's thread, threading, and Queue modules can be used for multithreaded programming.
II. Details
1. Threads and processes
A process (sometimes called a heavyweight process) is a single execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that records its execution state. The operating system manages all the processes running on it and shares CPU time among them fairly. A process can also spawn new processes through operations such as fork and spawn, but because each process has its own memory space, data stack, and so on, processes can only exchange information through interprocess communication (IPC) rather than sharing it directly.
A thread (sometimes called a lightweight process) is somewhat like a process, except that all threads run inside the same process and share the same execution environment. They can be thought of as "mini-processes" running in parallel inside the main process, or "main thread". A thread has a beginning, an execution sequence, and an end, and it has its own instruction pointer that records where it currently is. A running thread may be preempted (interrupted) or temporarily suspended (put to sleep) to let other threads run; this is called yielding. All threads in a process share the same data space, so threads can share data and communicate with each other far more easily than processes can. Threads generally execute concurrently, and it is this parallelism and data sharing that make cooperation between multiple tasks possible. In reality, on a single-CPU system true parallelism is not possible: each thread is scheduled to run for only a short slice of time before giving the CPU up to other threads. Throughout the process, each thread does its own work and shares its results with other threads when needed. Having multiple threads access the same data is not entirely safe: depending on the order in which the data is accessed, the results may become inconsistent, a situation known as a race condition. Most thread libraries therefore provide a set of synchronization primitives to control thread execution and access to data.
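To make the race-condition point concrete, here is a minimal sketch (an assumed example, not from the original article; it uses the threading module introduced later): two threads increment a shared counter, and a lock, one of the synchronization primitives just mentioned, keeps the updates consistent.

    #!/usr/bin/env python
    import threading

    counter = 0
    lock = threading.Lock()

    def add_many(n):
        # Each increment is several bytecode steps (load, add, store); without
        # the lock, two threads can interleave those steps and lose updates.
        global counter
        for _ in xrange(n):
            lock.acquire()
            counter += 1
            lock.release()

    threads = [threading.Thread(target=add_many, args=(100000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print 'counter =', counter    # 200000 with the lock; often less without it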
2. Using threads
(1) Global interpreter lock (GIL)
Execution of Python code is controlled by the Python virtual machine (also known as the interpreter main loop). Python was designed so that only one thread executes in that main loop at a time: although the Python interpreter can "run" multiple threads, only one thread is actually running in the interpreter at any given moment.
Access to the Python virtual machine is controlled by the global interpreter lock (GIL), which guarantees that only one thread runs at a time. In a multithreaded environment the Python virtual machine executes as follows: a. set the GIL; b. switch to a thread and run it; c. run a specified number of bytecode instructions, or until the thread voluntarily gives up control (for example by calling time.sleep(0)); d. put the thread back to sleep; e. unlock the GIL; f. repeat the steps above.
When external code is called (for example a C/C++ extension function), the GIL stays locked until that function returns (since no Python bytecode is being executed, no thread switching takes place); programmers who write extensions can release the GIL explicitly.
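As a rough illustration of the GIL's effect (an assumed example, not part of the original article; the function count_down and the workload N are made up for the demonstration), CPU-bound pure-Python work gains little or nothing from a second thread:

    #!/usr/bin/env python
    import threading
    from time import time

    def count_down(n):
        # Pure-Python, CPU-bound work: the GIL lets only one thread run
        # bytecode at a time, so two threads give little or no speedup.
        while n > 0:
            n -= 1

    N = 5000000

    start = time()
    count_down(N); count_down(N)
    print 'serial:   %.2fs' % (time() - start)

    start = time()
    t1 = threading.Thread(target=count_down, args=(N,))
    t2 = threading.Thread(target=count_down, args=(N,))
    t1.start(); t2.start()
    t1.join(); t2.join()
    print 'threaded: %.2fs' % (time() - start)

I/O-bound work such as sleep() releases the GIL while it waits, which is why the sleep()-based examples later in this article do overlap in time.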
(2) Exit thread
When a thread finishes its computation, it exits. A thread can call an exit function such as thread.exit(), or use Python's standard ways of exiting, such as calling sys.exit() or raising a SystemExit exception. However, a thread cannot be "killed" directly.
Using the thread module is not recommended, for one obvious reason: when the main thread exits, all other threads are terminated without any cleanup. The threading module, by contrast, ensures that all "important" child threads have exited before the process ends.
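A minimal sketch (assumed example) of the exit behavior described above: thread.exit(), which simply raises SystemExit, ends only the calling thread, and the main thread carries on.

    #!/usr/bin/env python
    import thread
    from time import sleep

    def worker():
        print 'worker running'
        thread.exit()              # raises SystemExit in this thread only
        print 'never reached'

    thread.start_new_thread(worker, ())
    sleep(1)                       # crude wait so the worker gets a chance to run
    print 'main thread is still alive'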
(3) Python threading module
Python provides several modules for multithreaded programming, including thread, threading, and Queue. The thread and threading modules let programmers create and manage threads: thread provides basic thread and lock support, while threading provides higher-level, fuller-featured thread management. The Queue module lets users create a queue data structure that can be shared between multiple threads.
Avoid the thread module in favor of the higher-level threading module, for several reasons: threading is more advanced and its thread support is more complete, and using attributes from the thread module may conflict with threading; second, the thread module offers very few synchronization primitives (actually only one, the lock), while threading offers many; and finally, with the thread module, when the main thread ends all threads are forcibly terminated with no warning and no proper cleanup, whereas the threading module at least ensures that important child threads finish before the process exits.
3. Thread module
Besides creating threads, the thread module also provides a basic synchronization data structure, the lock object (also called a primitive lock, simple lock, mutual exclusion lock, mutex, or binary semaphore). Synchronization primitives are inseparable from thread management.
Common thread module functions, and the methods of lock objects (type LockType):
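As a quick, hedged sketch (an assumed example) of the most common calls: start_new_thread(function, args[, kwargs]) spawns a thread, allocate_lock() returns a LockType object, and the lock's acquire(), locked(), and release() methods do the synchronizing.

    #!/usr/bin/env python
    import thread
    from time import sleep

    done = thread.allocate_lock()          # a LockType object
    done.acquire()                         # hold it before starting the worker

    def worker(lock):
        sleep(1)                           # pretend to do some work
        lock.release()                     # tell the main thread we are finished

    thread.start_new_thread(worker, (done,))
    while done.locked():                   # poll until the worker releases it
        pass
    print 'worker finished'

The first full example below drives two loop functions with start_new_thread() and simply sleeps in the main thread instead of waiting on a lock.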

    #!/usr/bin/env python

    import thread
    from time import sleep, ctime

    def loop0():
        print '+++ start loop 0 at:', ctime()
        sleep(4)
        print '+++ loop 0 done at:', ctime()

    def loop1():
        print '*** start loop 1 at:', ctime()
        sleep(2)
        print '*** loop 1 done at:', ctime()

    def main():
        print '------ starting at:', ctime()
        thread.start_new_thread(loop0, ())
        thread.start_new_thread(loop1, ())
        sleep(6)
        print '------ all done at:', ctime()

    if __name__ == '__main__':
        main()

The thread module provides a simple multithreading mechanism: the two loops execute concurrently, and the total running time is that of the slowest thread (here the main thread's 6 seconds) rather than the sum of all the threads' running times. start_new_thread() requires at least its first two arguments, so an empty tuple must be passed even if the function to run takes no arguments.

sleep(6) is how the main thread is made to wait: once the main thread finishes running, it shuts down and takes the other two threads with it. But this guess can make the main thread exit too early or too late. Using thread locks instead, as in the next example, lets the main thread exit as soon as both child threads have finished.

    #!/usr/bin/env python

    import thread
    from time import sleep, ctime

    loops = [4, 2]

    def loop(nloop, nsec, lock):
        print '+++ start loop:', nloop, 'at:', ctime()
        sleep(nsec)
        print '+++ loop:', nloop, 'done at:', ctime()
        lock.release()

    def main():
        print '--- starting threads...'
        locks = []
        nloops = range(len(loops))

        for i in nloops:
            lock = thread.allocate_lock()
            lock.acquire()
            locks.append(lock)

        for i in nloops:
            thread.start_new_thread(loop, (i, loops[i], locks[i]))

        for i in nloops:
            while locks[i].locked():
                pass

        print '--- all done at:', ctime()

    if __name__ == '__main__':
        main()

4. Threading Module
The higher-level threading module not only provides a Thread class but also a variety of very useful synchronization mechanisms. The objects in the threading module include Thread, Lock, RLock, Condition, Event, Semaphore, BoundedSemaphore, Timer, and local.

The thread module does not support daemon threads: when the main thread exits, all child threads are killed regardless of whether they are still working. The threading module does support daemon threads. A daemon thread is typically a server that waits for client requests and simply keeps waiting when there are none. Marking a thread as a daemon declares that it is unimportant and that the process need not wait for it before exiting. If you want the main thread to exit without waiting for certain child threads, set those threads' daemon flag by calling thread.setDaemon(True) before thread.start(). If you do want to wait for a child thread to finish before exiting, either do nothing or explicitly call thread.setDaemon(False) to make sure its daemon flag is False; you can call thread.isDaemon() to check the flag's current value. A new child thread inherits the daemon flag of its parent thread. The whole Python program exits only after all non-daemon threads have exited, that is, when no non-daemon threads remain in the process.
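A minimal sketch (assumed example) of the daemon flag in use: the worker loops forever, but because it is marked as a daemon before start(), the program exits without waiting for it.

    #!/usr/bin/env python
    import threading
    from time import sleep

    def serve_forever():
        while True:          # stands in for a server waiting on client requests
            sleep(1)

    t = threading.Thread(target=serve_forever)
    t.setDaemon(True)        # must be called before start()
    t.start()
    print 'daemon flag:', t.isDaemon()
    print 'main thread exiting without waiting for the daemon thread'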
(1) The Thread class of the threading module
It offers a lot of functionality that the thread module lacks. A Thread object's main methods include start(), run(), join(timeout=None), getName(), setName(name), isAlive(), isDaemon(), and setDaemon(daemonic).

Create a Thread instance, passing it a function

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [4, 2]

    def loop(nloop, nsec):
        print '+++ start loop:', nloop, 'at:', ctime()
        sleep(nsec)
        print '+++ loop:', nloop, 'done at:', ctime()

    def main():
        print '--- starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:
            t = threading.Thread(target=loop, args=(i, loops[i]))
            threads.append(t)

        for i in nloops:            # start threads
            threads[i].start()

        for i in nloops:            # wait for all
            threads[i].join()       # threads to finish

        print '--- all done at:', ctime()

    if __name__ == '__main__':
        main()

The biggest difference between instantiating a Thread (calling Thread()) and calling thread.start_new_thread() is that the new thread does not start running immediately. This is a useful synchronization feature when you want to create thread objects but not start them right away: create all the threads first, then call start() on each one, instead of starting each thread as it is created. There is also no need to manage a pile of locks (allocating them, acquiring them, releasing them, checking their state, and so on); you simply call join() on each thread to wait for it to finish. join() also accepts an optional timeout argument, in which case the main thread waits only until the timeout expires.

Another important point about join() is that it does not have to be called at all: once a thread is started, it runs until its function finishes and then exits. If the main thread has things to do other than wait for threads to finish, it need not call join(); join() is only for waiting until a thread has ended.
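A minimal sketch (assumed example) of join() with a timeout: the first join() returns after two seconds even though the worker needs four, and isAlive() shows whether the thread has actually finished.

    #!/usr/bin/env python
    import threading
    from time import sleep

    def slow_task():
        sleep(4)

    t = threading.Thread(target=slow_task)
    t.start()
    t.join(2)                                            # wait at most 2 seconds
    print 'after timeout, still alive?', t.isAlive()     # True: task needs 4s
    t.join()                                             # now wait for the real finish
    print 'after full join, still alive?', t.isAlive()   # False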
Create a Thread instance, passing it an instance of a callable class

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [4, 2]

    class ThreadFunc(object):

        def __init__(self, func, args, name=''):
            self.name = name
            self.func = func
            self.args = args

        def __call__(self):
            apply(self.func, self.args)

    def loop(nloop, nsec):
        print 'start loop', nloop, 'at:', ctime()
        sleep(nsec)
        print 'loop', nloop, 'done at:', ctime()

    def main():
        print 'starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:    # create all threads
            t = threading.Thread(
                target=ThreadFunc(loop, (i, loops[i]), loop.__name__))
            threads.append(t)

        for i in nloops:    # start all threads
            threads[i].start()

        for i in nloops:    # wait for completion
            threads[i].join()

        print 'all done at:', ctime()

    if __name__ == '__main__':
        main()

Another way, similar to passing in a function, is to pass in an instance of a callable class for the thread to run when it starts; this is the more object-oriented approach to multithreaded programming. A class instance can carry state and use the full power of classes, compared with a single function or a handful of functions. When the new thread is created, the Thread object invokes our ThreadFunc instance, which happens through the special method __call__(). Since the instance already holds the arguments, there is no need to pass them to the Thread() constructor; and because the arguments are stored in a tuple, the call is made with apply(self.func, self.args), or equivalently self.res = self.func(*self.args).
Derive a subclass of Thread and create an instance of the subclass

    #!/usr/bin/env python

    import threading
    from time import sleep, ctime

    loops = [4, 2]

    class MyThread(threading.Thread):

        def __init__(self, func, args, name=''):
            threading.Thread.__init__(self)
            self.name = name
            self.func = func
            self.args = args

        def getResult(self):
            return self.res

        def run(self):
            print 'starting', self.name, 'at:', ctime()
            self.res = apply(self.func, self.args)
            print self.name, 'finished at:', ctime()

    def loop(nloop, nsec):
        print 'start loop', nloop, 'at:', ctime()
        sleep(nsec)
        print 'loop', nloop, 'done at:', ctime()

    def main():
        print 'starting at:', ctime()
        threads = []
        nloops = range(len(loops))

        for i in nloops:
            t = MyThread(loop, (i, loops[i]), loop.__name__)
            threads.append(t)

        for i in nloops:
            threads[i].start()

        for i in nloops:
            threads[i].join()

        print 'all done at:', ctime()

    if __name__ == '__main__':
        main()

When subclassing Thread, the MyThread subclass's constructor must first call the base class's constructor, and the special method __call__() from the previous version becomes run(). MyThread also adds some output for debugging; save the class in a myThread module so it can be imported elsewhere. Besides running the target function with apply(), it saves the function's result in the instance's self.res attribute and adds a new method, getResult(), to retrieve it.
(2) Other functions in the threading module
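A minimal sketch (assumed example) of three commonly used module-level functions: activeCount(), currentThread(), and enumerate().

    #!/usr/bin/env python
    import threading
    from time import sleep

    def napper():
        sleep(2)

    for _ in range(3):
        threading.Thread(target=napper).start()

    print 'active threads:', threading.activeCount()       # the 3 nappers + main
    print 'current thread:', threading.currentThread().getName()
    for t in threading.enumerate():                         # all live Thread objects
        print '  alive:', t.getName()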

5. Queue Module
Common attributes of the Queue module:
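A minimal sketch (assumed example) of the most commonly used Queue attributes: Queue(size), put(), get(), qsize(), empty(), and full().

    #!/usr/bin/env python
    from Queue import Queue

    q = Queue(2)          # a queue holding at most two items
    print q.empty()       # True
    q.put('spam')
    q.put('eggs')
    print q.full()        # True
    print q.qsize()       # 2
    print q.get()         # 'spam' (items come back in FIFO order)
    print q.qsize()       # 1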

The Queue module can be used for communication between threads, letting them share data. With a queue we can solve the producer-consumer problem: create a queue into which the producer thread puts newly produced goods for the consumer thread to use. The time the producer takes to produce an item, and the time the consumer takes to consume one, cannot be known in advance.

    #!/usr/bin/env python

    from random import randint
    from time import sleep
    from Queue import Queue
    from myThread import MyThread

    def writeQ(queue):
        print '+++ producing object for Q...',
        queue.put('xxx', 1)
        print "+++ size now:", queue.qsize()

    def readQ(queue):
        val = queue.get(1)
        print '--- consumed object from Q... size now', \
            queue.qsize()

    def writer(queue, loops):
        for i in range(loops):
            writeQ(queue)
            sleep(randint(1, 3))

    def reader(queue, loops):
        for i in range(loops):
            readQ(queue)
            sleep(randint(2, 5))

    funcs = [writer, reader]
    nfuncs = range(len(funcs))

    def main():
        nloops = randint(2, 5)
        q = Queue(32)          # queue capacity (the value was garbled in the original)
        threads = []

        for i in nfuncs:
            t = MyThread(funcs[i], (q, nloops), funcs[i].__name__)
            threads.append(t)

        for i in nfuncs:
            threads[i].start()

        for i in nfuncs:
            threads[i].join()

        print '*** all done'

    if __name__ == '__main__':
        main()

This implementation uses a Queue object, and items are produced (and consumed) at random intervals. The producer and consumer run independently and concurrently; they do not necessarily take strict turns (the random numbers simulate that). The writeQ() and readQ() functions put an object into the queue and consume an object from the queue, respectively; here the string 'xxx' stands in for a queued object. The writer() function puts one object into the queue at a time, waits a moment, and repeats, for a number of iterations chosen at random when the script runs. The reader() function does something similar, except that it consumes objects.
6. Thread-related modules
Multithreading-related standard library modules include thread, threading, Queue, and mutex.

III. Summary
(1) When a program has to accomplish multiple tasks, consider using one thread per task; such a design is usually clearer than a single-threaded one.
(2) Single-threaded programs are limited in performance, especially when a program contains multiple independent tasks whose running times are uncertain; splitting them into separate threads can be faster than running them sequentially. However, because the Python interpreter itself is effectively single-threaded (the GIL), not every program benefits from multiple threads.
(3) If anything is missing or wrong, please leave a comment. Thanks!
