Detailed multithreaded programming in Python

Source: Internet
Author: User
Tags: mutex, thread, class

I. Introduction

Multithreaded programming lets several pieces of code run concurrently and improves a program's ability to handle multiple tasks, and splitting work into smaller functions also makes the code more reusable. Python's threading and Queue modules can be used to write multithreaded programs.
II. Details
1. Threads and processes
A process (sometimes called a heavyweight process) is an execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that records its running state. The operating system manages all processes running on it and distributes CPU time fairly among them. A process can also spawn new processes through fork or spawn operations to carry out other tasks, but because each process has its own memory space, data stack, and so on, processes can only exchange information through interprocess communication (IPC) rather than by sharing it directly.
Threads (sometimes called lightweight processes) are somewhat similar to processes, except that all threads run inside the same process and share the same running environment. They can be imagined as "mini processes" running in parallel inside the main process or main thread. A thread has a beginning, an execution sequence, and an end; it keeps its own instruction pointer that records where it is currently running. A thread's execution may be preempted (interrupted) or temporarily suspended (put to sleep) so that other threads can run; this is called yielding. All threads in a process share the same data space, so threads can share data and communicate with each other more easily than processes can. Threads generally execute concurrently, and it is this concurrency together with the data-sharing mechanism that makes cooperation on multiple tasks possible. In fact, on a single-CPU system true parallelism is impossible: each thread is scheduled to run only for a short moment at a time and then gives up the CPU so that other threads can run. Throughout the process, each thread does its own work and shares its results with other threads when needed.

Having multiple threads access the same piece of data is not entirely free of danger: because the order of the accesses is nondeterministic, the data can end up inconsistent. This is called a race condition. Fortunately, most thread libraries provide a set of synchronization primitives that control the execution of threads and access to shared data.
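To make the race condition concrete, here is a minimal sketch (the counter name, loop counts, and thread count are arbitrary): two threads incrementing a shared counter without a lock can lose updates, while guarding the increment with a lock keeps the total consistent.

  #!/usr/bin/env python
  # Minimal race-condition sketch; names and loop counts are arbitrary.

  import threading

  counter = 0
  lock = threading.Lock()

  def unsafe_increment(n):
      global counter
      for _ in xrange(n):
          counter += 1              # read-modify-write is not atomic: updates can be lost

  def safe_increment(n):
      global counter
      for _ in xrange(n):
          with lock:                # the lock serializes access to the shared counter
              counter += 1

  def run(target, n):
      global counter
      counter = 0
      threads = [threading.Thread(target=target, args=(n,)) for _ in range(2)]
      for t in threads:
          t.start()
      for t in threads:
          t.join()
      return counter

  if __name__ == '__main__':
      print 'without lock:', run(unsafe_increment, 100000)   # often less than 200000
      print 'with lock:   ', run(safe_increment, 100000)     # always 200000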
2. Using threads
(1) The Global Interpreter Lock (GIL)
The execution of Python code is controlled by the Python virtual machine (also known as the interpreter main loop). Python was designed so that only one thread executes in this main loop at a time. Although multiple threads can be "running" inside the Python interpreter, only one of them is actually executing at any given moment.
Access to the Python virtual machine is controlled by the Global Interpreter Lock (GIL), which ensures that only one thread runs at a time. In a multithreaded environment the Python virtual machine executes as follows:

1. Set the GIL.
2. Switch to a thread and run it.
3. Run a specified number of bytecode instructions, or run until the thread voluntarily gives up control (for example by calling time.sleep(0)).
4. Put the thread back to sleep.
5. Unlock the GIL.
6. Repeat the steps above.
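As a small illustration of these steps (the values chosen here are arbitrary), CPython 2 exposes the bytecode-instruction count between switch checks through sys.getcheckinterval()/sys.setcheckinterval(), and a thread can yield voluntarily with time.sleep(0):

  import sys
  import time

  # Number of bytecode instructions executed before the interpreter
  # considers switching threads (100 by default in CPython 2).
  print 'check interval:', sys.getcheckinterval()

  # A smaller interval makes the interpreter check for a thread switch more
  # often, at the cost of more switching overhead.
  sys.setcheckinterval(50)

  # A thread can also give up control voluntarily at any point:
  time.sleep(0)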
When external code is called, such as a C/C++ extension function, the GIL stays locked until the function returns (no Python bytecode runs during that time, so no thread switch can occur). Programmers writing extensions can release the GIL explicitly.
(2) Exiting a thread
When a thread finishes its computation, it exits. A thread can call an exit function such as thread.exit(), or use the standard ways of exiting a Python process, such as sys.exit() or raising a SystemExit exception. A thread cannot, however, be "killed" directly.
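A brief sketch of these exit paths (the worker function names are just illustrative): one thread ends by returning normally, the other by calling thread.exit(), which raises SystemExit inside that thread.

  import thread
  from time import sleep

  def worker_return():
      print 'worker_return: finishing normally'
      return                      # falling off the end of the function ends the thread

  def worker_exit():
      print 'worker_exit: exiting via thread.exit()'
      thread.exit()               # equivalent to raising SystemExit in this thread
      print 'never reached'

  if __name__ == '__main__':
      thread.start_new_thread(worker_return, ())
      thread.start_new_thread(worker_exit, ())
      sleep(1)                    # give the worker threads time to finish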
Using the thread module is not recommended; one obvious reason is that when the main thread exits, all other threads are terminated without any cleanup. The threading module, by contrast, ensures that all "important" child threads have exited before the process ends.
(3) Python's threading module
Python provides several modules for multithreaded programming, including thread, threading, and Queue (these are the Python 2 names; Python 3 renamed thread to _thread and Queue to queue). The thread and threading modules let the programmer create and manage threads. The thread module provides basic thread and lock support, while threading provides higher-level, more powerful thread management. The Queue module lets users create a queue data structure that can be shared among multiple threads.
Avoid the thread module. First, the higher-level threading module is more sophisticated and its thread support is more complete, and using attributes from the thread module may conflict with threading. Second, the thread module offers very few low-level synchronization primitives (in fact only one, the lock), whereas threading provides many. Finally, when the main thread ends, the thread module forcibly terminates all other threads with no warning and no proper cleanup, while threading at least ensures that important child threads have exited before the process ends.
3. The thread module
Besides spawning threads, the thread module also provides a basic synchronization data structure, the lock object (also called a primitive lock, simple lock, mutual exclusion lock, mutex, or binary semaphore). Synchronization primitives are inseparable from thread management.
Commonly used thread module functions and methods of LockType lock objects:

- thread.start_new_thread(function, args, kwargs=None): spawn a new thread and run function with the given arguments
- thread.allocate_lock(): allocate and return a LockType lock object
- thread.exit(): exit the current thread (raises SystemExit)
- lock.acquire(wait=None): attempt to acquire the lock object
- lock.locked(): return True if the lock has been acquired
- lock.release(): release the lock

  #!/usr/bin/env python

  import thread
  from time import sleep, ctime

  def loop0():
      print '+++start loop 0 at:', ctime()
      sleep(4)
      print '+++loop 0 done at:', ctime()

  def loop1():
      print '***start loop 1 at:', ctime()
      sleep(2)
      print '***loop 1 done at:', ctime()

  def main():
      print '------starting at:', ctime()
      thread.start_new_thread(loop0, ())   # run loop0 in a new thread
      thread.start_new_thread(loop1, ())   # run loop1 in a new thread
      sleep(6)                             # keep the main thread alive long enough
      print '------all DONE at:', ctime()

  if __name__ == '__main__':
      main()

The thread module provides a simple multithreading mechanism: the two loops execute concurrently, so the total running time is essentially that of the slowest thread (here the main thread's 6-second sleep), not the sum of all threads' running times. start_new_thread() requires at least its first two arguments, so even when the function to run takes no arguments, an empty tuple must be passed.

The sleep(6) call makes the main thread wait; as soon as the main thread finishes, it shuts down the other two running threads. But guessing a sleep time may make the main thread exit too early or too late, so thread locks are used instead: the main thread can then exit immediately after both child threads have finished.

  #!/usr/bin/env python

  import thread
  from time import sleep, ctime

  loops = [4, 2]

  def loop(nloop, nsec, lock):
      print '+++start loop:', nloop, 'at:', ctime()
      sleep(nsec)
      print '+++loop:', nloop, 'done at:', ctime()
      lock.release()                        # signal that this loop has finished

  def main():
      print '---starting threads...'
      locks = []
      nloops = range(len(loops))

      for i in nloops:                      # create and acquire one lock per thread
          lock = thread.allocate_lock()
          lock.acquire()
          locks.append(lock)

      for i in nloops:                      # spawn the threads
          thread.start_new_thread(loop, (i, loops[i], locks[i]))

      for i in nloops:                      # spin until every lock is released
          while locks[i].locked():
              pass

      print '---all DONE at:', ctime()

  if __name__ == '__main__':
      main()


4. The threading module
The threading module is higher-level: it provides not only a Thread class but also a variety of very useful synchronization mechanisms. The objects in the threading module include:

- Thread: an object that represents a single thread of execution
- Lock: a primitive lock object (the same lock as in the thread module)
- RLock: a re-entrant lock that the same thread can acquire more than once
- Condition: a condition variable, so a thread can wait until another thread signals it
- Event: a generalized condition variable; any number of threads wait for an event flag to be set
- Semaphore: a counter of available resources protected by a lock
- BoundedSemaphore: like a Semaphore, but never allowed to exceed its initial value
- Timer: like a Thread, except that it waits a given interval before running

The thread module does not support daemon threads: when the main thread exits, all child threads are killed forcibly, regardless of whether they are still working. The threading module does support daemons. A daemon is typically a server that waits for client requests and simply sits idle if there are none. Marking a thread as a daemon means the thread is unimportant: when the process exits, there is no need to wait for that thread to finish. If you want the main thread to be able to exit without waiting for certain child threads, set those threads' daemon flags: before a thread is started with start(), call its setDaemon() method (thread.setDaemon(True) means the thread is "unimportant"). If you do want to wait for a child thread before exiting, do nothing, or explicitly call thread.setDaemon(False) to make sure its daemon flag is False; thread.isDaemon() returns the current value of the flag. A new child thread inherits the daemon flag of its parent thread. The whole Python program exits only after all non-daemon threads have exited, that is, when no non-daemon threads remain in the process.
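A small sketch of the daemon flag in action (the sleep times are arbitrary): the daemon thread is abandoned when the last non-daemon thread finishes, while the non-daemon thread is waited for.

  #!/usr/bin/env python
  # Daemon-flag sketch; sleep times are arbitrary.

  import threading
  from time import sleep, ctime

  def background(nsec):
      sleep(nsec)
      print 'background woke at:', ctime()    # never printed: its thread is a daemon

  def important(nsec):
      sleep(nsec)
      print 'important finished at:', ctime()

  if __name__ == '__main__':
      d = threading.Thread(target=background, args=(10,))
      d.setDaemon(True)            # must be set before start(); marks the thread "unimportant"
      d.start()

      t = threading.Thread(target=important, args=(2,))   # daemon flag defaults to False
      t.start()
      t.join()                     # wait for the important thread only

      print 'main thread exiting at:', ctime()
      # The interpreter exits now: the only remaining thread is a daemon,
      # so the still-sleeping background thread is simply abandoned.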
(1) The Thread class of the threading module
It has a number of functions that the thread module lacks. The methods of a Thread object include:

- start(): begin executing the thread
- run(): the method defining the thread's activity (normally overridden in a subclass)
- join(timeout=None): wait until the thread terminates, or until the optional timeout expires
- getName() / setName(name): get or set the thread's name
- isAlive(): return whether the thread is still running
- isDaemon() / setDaemon(daemonic): get or set the thread's daemon flag (setDaemon() must be called before start())

Create an instance of Thread and pass it a function

  #!/usr/bin/env python

  import threading
  from time import sleep, ctime

  loops = [4, 2]

  def loop(nloop, nsec):
      print '+++start loop:', nloop, 'at:', ctime()
      sleep(nsec)
      print '+++loop:', nloop, 'done at:', ctime()

  def main():
      print '---starting at:', ctime()
      threads = []
      nloops = range(len(loops))

      for i in nloops:                      # create the Thread objects
          t = threading.Thread(target=loop, args=(i, loops[i]))
          threads.append(t)

      for i in nloops:                      # start all threads
          threads[i].start()

      for i in nloops:                      # wait for all threads to finish
          threads[i].join()

      print '---all DONE at:', ctime()

  if __name__ == '__main__':
      main()

The biggest difference between instantiating Thread (calling Thread()) and calling thread.start_new_thread() is that the new thread does not start running immediately. This is a useful synchronization feature when you want to create thread objects without running them right away: after all threads have been created, start() is called on each one, instead of creating one and starting it before creating the next. There is also no need to manage a pile of locks (allocating them, acquiring them, releasing them, checking their state, and so on); you simply call join() on each thread, and the main thread waits until the child threads finish. join() also accepts an optional timeout argument, so the main thread waits only until the timeout expires.

Another important point about join() is that it does not have to be called at all: once a thread has started, it runs until its function finishes and then exits. If the main thread has other things to do besides waiting for threads to finish, it does not need to call join(); call join() only where you actually want to wait for a thread to finish.
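A short sketch of join() with a timeout (the 10-second worker and the 2-second timeout are arbitrary): the main thread stops waiting when the timeout expires and can check isAlive() to see whether the worker is still running.

  import threading
  from time import sleep, ctime

  def slow_worker():
      sleep(10)                   # simulate a long-running task

  if __name__ == '__main__':
      t = threading.Thread(target=slow_worker)
      t.start()

      t.join(2.0)                 # wait at most 2 seconds
      if t.isAlive():
          print 'worker still running at:', ctime(), '- main thread moves on'
      else:
          print 'worker finished within the timeout'
      # Note: at shutdown the interpreter still waits for this non-daemon
      # worker thread before the process actually exits.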
Create an instance of Thread and pass it an instance of a callable class

  #!/usr/bin/env python

  import threading
  from time import sleep, ctime

  loops = [4, 2]

  class ThreadFunc(object):

      def __init__(self, func, args, name=''):
          self.name = name
          self.func = func
          self.args = args

      def __call__(self):                   # invoked when the new thread runs
          apply(self.func, self.args)

  def loop(nloop, nsec):
      print 'start loop', nloop, 'at:', ctime()
      sleep(nsec)
      print 'loop', nloop, 'done at:', ctime()

  def main():
      print 'starting at:', ctime()
      threads = []
      nloops = range(len(loops))

      for i in nloops:                      # create all threads
          t = threading.Thread(
              target=ThreadFunc(loop, (i, loops[i]), loop.__name__))
          threads.append(t)

      for i in nloops:                      # start all threads
          threads[i].start()

      for i in nloops:                      # wait for completion
          threads[i].join()

      print 'all DONE at:', ctime()

  if __name__ == '__main__':
      main()

Another method, similar to passing a function, is to pass an instance of a callable class when the Thread is created; the instance is invoked when the thread starts. This is a more object-oriented approach to multithreaded programming, since a class can carry richer functionality than one or a few standalone functions. When the new thread runs, the Thread object invokes the ThreadFunc instance, which triggers the special method __call__(). Because the instance already holds the arguments it needs, they do not have to be passed to the Thread() constructor separately. And because the arguments are stored as a tuple, the code uses the apply() function; self.res = self.func(*self.args) would work equally well.
Derive a subclass from Thread and create an instance of the subclass

  #!/usr/bin/env python

  import threading
  from time import sleep, ctime

  loops = [4, 2]

  class MyThread(threading.Thread):

      def __init__(self, func, args, name=''):
          threading.Thread.__init__(self)   # call the base-class constructor first
          self.name = name
          self.func = func
          self.args = args

      def getResult(self):
          return self.res

      def run(self):                        # replaces __call__() from the previous example
          print 'starting', self.name, 'at:', ctime()
          self.res = apply(self.func, self.args)
          print self.name, 'finished at:', ctime()

  def loop(nloop, nsec):
      print 'start loop', nloop, 'at:', ctime()
      sleep(nsec)
      print 'loop', nloop, 'done at:', ctime()

  def main():
      print 'starting at:', ctime()
      threads = []
      nloops = range(len(loops))

      for i in nloops:
          t = MyThread(loop, (i, loops[i]), loop.__name__)
          threads.append(t)

      for i in nloops:
          threads[i].start()

      for i in nloops:
          threads[i].join()

      print 'all DONE at:', ctime()

  if __name__ == '__main__':
      main()

MyThread subclasses the Thread class. The subclass's constructor must first call the base class's constructor, and the special method __call__() from the previous example becomes run() here. The MyThread class also prints some extra information for debugging. Save the class to a module named myThread so it can be imported later. Besides running the target function with apply(), run() saves the function's return value in the instance's self.res attribute, and a new method getResult() is added to retrieve that result.
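As a usage sketch for the class above (assuming it was saved as myThread.py, and using an arbitrary fib() function as the workload), getResult() returns the wrapped function's result once the thread has been joined.

  #!/usr/bin/env python
  # Usage sketch for the MyThread class above; fib() is an arbitrary workload.

  from myThread import MyThread   # assumes the class was saved to myThread.py
  from time import ctime

  def fib(x):
      if x < 2:
          return 1
      return fib(x - 2) + fib(x - 1)

  if __name__ == '__main__':
      print 'starting at:', ctime()
      t = MyThread(fib, (12,), fib.__name__)
      t.start()
      t.join()                    # wait, so that self.res is guaranteed to be set
      print 'fib(12) =', t.getResult()
      print 'all done at:', ctime()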
(2) Other functions in the threading module

Besides the classes above, the threading module provides some module-level functions:

- activeCount(): the number of currently active Thread objects
- currentThread(): the Thread object for the calling thread
- enumerate(): a list of all currently active Thread objects
- settrace(func): set a trace function for all threads started from threading
- setprofile(func): set a profile function for all threads started from threading
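A quick sketch of these module-level helpers (the thread names and counts are arbitrary):

  import threading
  from time import sleep

  def worker():
      print 'running in:', threading.currentThread().getName()
      sleep(1)

  if __name__ == '__main__':
      for i in range(3):
          threading.Thread(target=worker, name='worker-%d' % i).start()

      print 'active threads:', threading.activeCount()   # the workers plus the main thread
      print 'thread list:   ', threading.enumerate()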

5. The Queue module
Commonly used attributes of the Queue module:

- Queue(size): create a Queue object with a capacity of size items
- qsize(): return the (approximate) size of the queue
- empty(): return True if the queue is empty, False otherwise
- full(): return True if the queue is full, False otherwise
- put(item, block=True, timeout=None): put item into the queue; if block is true and the queue is full, block until there is room
- get(block=True, timeout=None): get an item from the queue; if block is true and the queue is empty, block until an item is available
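A brief interactive-style sketch of these methods (the queue size and items are arbitrary):

  from Queue import Queue

  q = Queue(3)                    # a queue that holds at most 3 items
  print q.empty()                 # True
  q.put('a')
  q.put('b')
  q.put('c')
  print q.full(), q.qsize()       # True 3
  print q.get()                   # 'a' (FIFO order)
  print q.qsize()                 # 2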

The Queue module can be used for communication between threads, letting them share data. With a queue, the producer-consumer problem becomes straightforward: the producer thread puts newly produced goods into the queue, and the consumer thread takes them out to use. The time a producer takes to produce an item cannot be known in advance, and neither can the time a consumer takes to consume one.

  #!/usr/bin/env python

  from random import randint
  from time import sleep
  from Queue import Queue
  from myThread import MyThread

  def writeQ(queue):
      print '+++producing object for Q...',
      queue.put('xxx', 1)
      print '+++size now:', queue.qsize()

  def readQ(queue):
      val = queue.get(1)
      print '---consumed object from Q... size now', \
          queue.qsize()

  def writer(queue, loops):
      for i in range(loops):
          writeQ(queue)
          sleep(randint(1, 3))

  def reader(queue, loops):
      for i in range(loops):
          readQ(queue)
          sleep(randint(2, 5))

  funcs = [writer, reader]
  nfuncs = range(len(funcs))

  def main():
      nloops = randint(2, 5)
      q = Queue(32)
      threads = []

      for i in nfuncs:
          t = MyThread(funcs[i], (q, nloops), funcs[i].__name__)
          threads.append(t)

      for i in nfuncs:
          threads[i].start()

      for i in nfuncs:
          threads[i].join()

      print '***all DONE'

  if __name__ == '__main__':
      main()
 

This implementation uses a Queue object and produces (and consumes) goods at random intervals. The producer and consumer run independently and concurrently; they do not necessarily take turns (the random numbers simulate this). The writeQ() and readQ() functions put an object into the queue and consume one object from the queue respectively; here the string 'xxx' stands in for a real object. The writer() function puts one object into the queue, waits a moment, then repeats, for a total number of iterations chosen at random when the script runs. The reader() function does much the same, except that it consumes objects.
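As a follow-up sketch on the blocking behaviour used above (writeQ() and readQ() pass block=1): with block set to a false value, put() and get() raise Queue.Full or Queue.Empty instead of waiting, and a timeout (an arbitrary 2 seconds here) bounds how long a blocking call waits.

  from Queue import Queue, Full, Empty

  q = Queue(1)
  q.put('xxx', 1)                 # block=1: wait until a slot is free (immediately here)

  try:
      q.put('yyy', False)         # non-blocking put on a full queue
  except Full:
      print 'queue is full, item not added'

  print q.get(1)                  # block=1: wait until an item is available

  try:
      print q.get(True, 2)        # block, but give up after 2 seconds
  except Empty:
      print 'nothing arrived within 2 seconds'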
6. Thread-related modules
Standard library modules related to multithreading include:

- thread: the basic, lower-level thread module
- threading: the higher-level threading module
- Queue: a synchronized FIFO queue for sharing data between threads
- multiprocessing (Python 2.6+): a process-based interface with an API similar to threading
- mutex and SocketServer: a simple mutual-exclusion object and mix-in classes (e.g. ThreadingMixIn) for threaded servers

III. Summary
(1) When a program has several tasks to complete, consider using one thread per task; such a program is clearer in its design than a single-threaded program that does everything itself.
(2) A single-threaded program is limited in performance, especially when the work consists of many independent, nondeterministic tasks; splitting those tasks across multiple threads can run faster than handling them sequentially. However, because the Python interpreter is effectively single-threaded (due to the GIL), not every program benefits from multiple threads.
(3) If anything is missing or incorrect, please leave a comment. Thanks in advance!
