Python multi-thread programming

Source: Internet
Author: User

1. Global Interpreter Lock

The Python virtual machine uses the GIL (Global Interpreter Lock) to serialize threads' access to shared interpreter state, so a multi-threaded Python program currently cannot take advantage of multiple processors. Although the interpreter can run many threads, only one of them executes Python code at any given moment, no matter how many processors are available. Threads generally work well for I/O-bound tasks, but for CPU-bound applications, splitting the work across threads brings no benefit; it is better to use subprocesses and message passing.
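As a rough illustration of the I/O-bound case, the sketch below runs several blocking waits in threads and measures the total time. It is written for Python 3; time.sleep stands in for a real I/O call, and the names fake_io and run_in_threads are made up for this example.

```python
import threading
import time

def fake_io(delay):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting
    time.sleep(delay)

def run_in_threads(n_tasks, delay):
    """Run n_tasks simulated I/O waits concurrently and return the elapsed time."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n_tasks)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = run_in_threads(4, 0.2)
# Run one after another, the four waits would need about 0.8 s in total
print('4 overlapping waits took %.2f s' % elapsed)
```

The waits overlap because a blocked thread gives up the GIL. A function that spends its time computing in pure Python would not be expected to show the same overlap.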

2. threading

Python's threading module provides the Thread class and various synchronization primitives for writing multi-threaded programs.

2.1. Thread(target=None, name=None, args=(), kwargs={})

This constructor creates a Thread instance. target is a callable object that the run() method invokes when the thread starts. name is the name of the thread; the default has the form 'Thread-N'. args is the tuple of positional arguments passed to target, and kwargs is the dictionary of keyword arguments passed to target.

A Thread instance t supports the following methods and attributes:

  • t.start() starts the thread, which causes run() to be called in the new thread.
  • t.run() is called when the thread starts; it can be overridden in a subclass of Thread.
  • t.join([timeout]) blocks the calling thread until t terminates or the optional timeout expires.
  • t.is_alive() returns True if the thread is still running.
  • t.name is the thread name.
  • t.ident is the thread identifier.
  • t.daemon must be set before t.start() is called. If it is True, the program can exit without waiting for the daemon thread to finish.
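The members listed above can be tried out together in a few lines. This is only a minimal sketch for Python 3; the worker function, the results list and the delays are assumptions chosen for illustration.

```python
import threading
import time

results = []

def worker(delay, label='task'):
    # Runs in the new thread; delay comes from args, label from kwargs
    time.sleep(delay)
    results.append(label)

t = threading.Thread(target=worker, name='demo-thread',
                     args=(0.3,), kwargs={'label': 'done'})
t.daemon = True                 # must be set before start()
t.start()

t.join(0.05)                    # stop waiting after 0.05 s
still_running = t.is_alive()    # the worker is normally still sleeping here

t.join()                        # no timeout: wait until the thread ends
print(t.name, still_running, t.is_alive(), results)
```

Note that join() with a timeout returns whether or not the thread has finished, so is_alive() is the way to find out which of the two happened.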

There are two ways to create a thread:

1. Create a Thread instance and pass it a function.

    import threading
    import time

    def clock(nsec):
        while True:
            print('now is %s' % time.ctime())
            time.sleep(nsec)

    t = threading.Thread(target=clock, args=(5,))
    t.daemon = True  # mark as a daemon thread
    t.start()

2. Derive a subclass from Thread and create an instance of the subclass.

    import threading
    import time

    class ClockThread(threading.Thread):
        def __init__(self, nsec):
            threading.Thread.__init__(self)
            self.daemon = True  # mark as a daemon thread
            self.nsec = nsec

        def run(self):
            while True:
                print('now is %s' % time.ctime())
                time.sleep(self.nsec)

    t = ClockThread(5)
    t.start()

The latter approach is a bit more Pythonic.

Because the thread loops forever, daemon is set to True so that the thread is destroyed when the process exits.

For example, consider a counting program in which one thread counts from 0 to 9 and another counts from a to j. Each count takes about 10 s, so running them one after the other would take about 20 s.

    import threading
    import time

    class CountThread(threading.Thread):
        def __init__(self, func, name):
            threading.Thread.__init__(self)
            self.name = str(name)
            self.func = func

        def run(self):
            self.func()

    def numcount():
        print('%s start at : %s' % (threading.current_thread().name, time.ctime()))
        for i in range(10):
            print(i)
            time.sleep(1)
        print('%s done at : %s' % (threading.current_thread().name, time.ctime()))

    def alphacount():
        print('%s start at : %s' % (threading.current_thread().name, time.ctime()))
        for i in range(97, 107):
            print(chr(i))
            time.sleep(1)
        print('%s done at : %s' % (threading.current_thread().name, time.ctime()))

    def main():
        funclist = [numcount, alphacount]
        threads = []
        for i in funclist:
            t = CountThread(i, i.__name__)
            threads.append(t)
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print('All done at : %s' % time.ctime())

    if __name__ == '__main__':
        main()

Result:

numcount  start at :    Fri Feb 07 12:19:28alphacount  start at : Fri Feb 07 12:19:28 2014a0b12c3d4 e5f6g7 h8 i9jalphacount numcount  done at :  done at : Fri Feb 07 12:19:38 2014 Fri Feb 07 12:19:38 2014All done at : Fri Feb 07 12:19:38 2014

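Because the two threads run concurrently, their output is interleaved and the whole program finishes in about 10 s rather than the sum of the two counts.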

The following example shows the effect of t.join() more clearly:

    import threading
    import time

    def join():
        print('in threadjoin')
        time.sleep(1)
        print('out threadjoin')

    Threadjoin = threading.Thread(target=join, name='threadjoin')

    def context(Threadjoin):
        print('in threadcontext')
        Threadjoin.start()
        Threadjoin.join()  # threadcontext blocks here until threadjoin finishes
        print('out threadcontext')

    Threadcontext = threading.Thread(target=context, name='threadcontext', args=(Threadjoin,))
    Threadcontext.start()

Result:

    in threadcontext
    in threadjoin
    out threadjoin
    out threadcontext

2.2. Thread Synchronization

A thread runs inside the process that created it and shares all of that process's data and resources, but it has its own independent stack. The difficulty of concurrent programming lies in synchronizing access to shared data. When several tasks update a data structure at the same time, the data can be corrupted and the program left in an inconsistent state (a race condition). To solve this problem, you must identify the critical sections of the program and protect them with mutex locks and similar synchronization primitives.

2.2.1 Lock

A primitive lock (mutex) is a synchronization primitive that is in either the "locked" or the "unlocked" state. The acquire() and release() methods change that state. If several threads are waiting for the lock, only one of them gets it when the lock is released.

Constructor:
Lock(): creates a new Lock object. The initial state is unlocked.

Instance methods:
Lock.acquire(blocking=True, timeout=-1): blocks the calling thread until the lock can be acquired (the timeout parameter exists in Python 3.2 and later). Returns True if the lock was acquired and False if it was not, which can happen when blocking is False or the timeout expires.
Lock.release(): releases the lock. The lock must be in the locked state; otherwise an exception is raised.
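A small sketch of the usual pattern: several threads update one shared counter and a Lock protects the update. It assumes Python 3; the names counter and add_many are invented for the example. The with statement is used as shorthand for the acquire()/release() pair.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # "with lock" calls acquire() on entry and release() on exit,
        # even if the body raises an exception
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

Every increment happens while the lock is held, so no update can be lost and the final value equals the number of increments requested.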

Example: reading a large file in blocks with multiple threads:

    import threading
    import os

    seekposition = 0
    blocksize = 1000000
    filesize = 0
    lock = threading.Lock()  # module level, so that every worker thread can see it

    def getFilesize(filename):
        f = open(filename, 'rb')
        f.seek(0, os.SEEK_END)
        filesize = f.tell()
        f.close()
        return filesize

    def parsefile(filename):
        global seekposition, filesize
        f = open(filename, 'rb')  # binary mode, so that seek() and tell() use byte offsets
        while True:
            lock.acquire()  # seekposition is shared by all threads, so lock before modifying it
            startposition = seekposition
            endposition = (startposition + blocksize) if (startposition + blocksize) < filesize else filesize
            seekposition = endposition
            lock.release()
            if startposition == filesize:
                break
            elif startposition > 0:
                f.seek(startposition)
                # The first line of a block may be incomplete; skip it here,
                # because it is handled as the last line of the previous block
                f.readline()
            position = f.tell()
            outfile = open(str(endposition) + '.txt', 'wb')
            while position <= endposition:
                line = f.readline()
                if not line:  # end of file reached
                    break
                outfile.write(line)
                position = f.tell()
            outfile.close()
        f.close()

    def main(filename):
        global seekposition, filesize
        filesize = getFilesize(filename)
        threads = []
        for i in range(4):
            t = threading.Thread(target=parsefile, args=(filename,))
            threads.append(t)
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    if __name__ == '__main__':
        filename = ''  # path of the file to read
        main(filename)

2.2.2 RLock

A reentrant lock (RLock) is a synchronization primitive similar to a Lock object, but the same thread can acquire it multiple times. This allows the thread that holds the lock to perform nested acquire() and release() operations. An RLock can be thought of as containing a lock pool and a counter whose initial value is 0. Each successful acquire() adds 1 to the counter and each release() subtracts 1; when the counter is 0, the lock is unlocked.

    import threading
    import time

    rlock = threading.RLock()
    count = 0

    class MyThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)

        def run(self):
            global count
            if rlock.acquire():
                count += 1
                print('%s set count : %d' % (self.name, count))
                time.sleep(1)
                if rlock.acquire():  # nested acquire by the same thread
                    count += 1
                    print('%s set count : %d' % (self.name, count))
                    time.sleep(1)
                    rlock.release()
                rlock.release()

    if __name__ == '__main__':
        for i in range(5):
            t = MyThread()
            t.start()
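Nested acquisition tends to appear when one locked method calls another locked method of the same object. The sketch below shows that situation for Python 3; the Account class and its methods are hypothetical and exist only for this illustration.

```python
import threading

class Account(object):
    """Hypothetical class whose locked methods call each other."""
    def __init__(self):
        self._lock = threading.RLock()
        self.balance = 0

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

    def deposit_twice(self, amount):
        with self._lock:            # outer acquire: counter 0 -> 1
            self.deposit(amount)    # nested acquire by the same thread: 1 -> 2 -> 1
            self.deposit(amount)
        # the counter is back at 0 here, so another thread can take the lock

account = Account()
threads = [threading.Thread(target=account.deposit_twice, args=(5,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.balance)
```

If self._lock were a plain Lock, the nested acquire inside deposit() would block forever, because the same thread already holds the lock.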

2.2.3 Semaphore

A semaphore is a counter-based synchronization primitive. Each call to acquire() decrements the counter by 1, and each call to release() increments it by 1. If the counter is 0, acquire() blocks until another thread calls release().

The following semaphore example is taken from http://www.cnblogs.com/huxi/archive/2010/06/26/1765808.html:

    import threading
    import time

    semaphore = threading.Semaphore(2)  # the initial counter value is 2

    def func():
        # Request the semaphore: on success the counter is decremented by 1;
        # when the counter is 0, the call blocks
        print('%s acquire semaphore...' % threading.current_thread().name)
        if semaphore.acquire():
            print('%s get semaphore' % threading.current_thread().name)
            time.sleep(4)
            # Release the semaphore: the counter is incremented by 1
            print('%s release semaphore' % threading.current_thread().name)
            semaphore.release()

    t1 = threading.Thread(target=func)
    t2 = threading.Thread(target=func)
    t3 = threading.Thread(target=func)
    t4 = threading.Thread(target=func)
    t1.start()
    t2.start()
    t3.start()
    t4.start()
    time.sleep(2)

    # The main thread never acquired the semaphore, yet it can still call release().
    # With a BoundedSemaphore, t4 would raise an exception when it releases.
    print('MainThread release semaphore without acquire')
    semaphore.release()
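A common use of the counter is to cap how many threads run a section of code at once. The following sketch, written for Python 3, records the highest number of threads seen inside the guarded section; limited_task, active and max_active are illustrative names, and the sleep is a stand-in for real work. The last lines show the BoundedSemaphore behaviour mentioned in the comment above.

```python
import threading
import time

semaphore = threading.Semaphore(2)   # at most 2 threads inside at once
state_lock = threading.Lock()        # protects the two counters below
active = 0
max_active = 0

def limited_task():
    global active, max_active
    with semaphore:                  # acquire(): counter - 1, blocks at 0
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.05)             # simulated work
        with state_lock:
            active -= 1
    # leaving the with block calls release(): counter + 1

threads = [threading.Thread(target=limited_task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('most threads inside at once: %d' % max_active)

bounded = threading.BoundedSemaphore(1)
try:
    bounded.release()                # released more often than acquired
except ValueError:
    print('BoundedSemaphore refused the extra release')
```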

2.2.4 Condition

A condition variable is a synchronization primitive built on top of another lock. A typical use is the producer-consumer problem, in which data produced by one thread is consumed by another.

Constructor:
Condition([lock/rlock])

Instance methods:
acquire([timeout])/release(): call the corresponding method of the underlying lock.
wait([timeout]): releases the lock and puts the calling thread into the condition's wait pool until another thread wakes it by calling notify() or notify_all() on the condition variable, or until the optional timeout expires. The calling thread must hold the lock; otherwise an exception is raised.
notify(): selects one thread from the wait pool and wakes it. The notified thread then calls acquire() automatically to try to get the lock again (it enters the lock pool), while the other threads stay in the wait pool. This method does not release the lock. The calling thread must hold the lock; otherwise an exception is raised.
notify_all(): wakes up all threads waiting on this condition.

    import threading

    cv = threading.Condition()
    alist = []

    def producer():
        cv.acquire()
        for i in range(10):
            alist.append(i)
        cv.notify()
        cv.release()

    def consumer():
        cv.acquire()
        while not alist:  # wait until the producer has added data
            cv.wait()
        cv.release()
        print(alist)

    tproducer = threading.Thread(target=producer)
    tconsumer = threading.Thread(target=consumer)
    tconsumer.start()
    tproducer.start()
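A slightly fuller producer-consumer sketch for Python 3 follows, in which items are handed over one at a time. The names items and consumed and the item count are assumptions made for the example.

```python
import threading

cv = threading.Condition()
items = []       # shared buffer, only touched while cv is held
consumed = []

def consumer(n):
    for _ in range(n):
        with cv:
            # Check the predicate in a loop: after wait() returns,
            # the buffer must be tested again before it is used
            while not items:
                cv.wait()
            consumed.append(items.pop(0))

def producer(n):
    for i in range(n):
        with cv:
            items.append(i)
            cv.notify()   # wake one waiting consumer, if there is one

tc = threading.Thread(target=consumer, args=(5,))
tp = threading.Thread(target=producer, args=(5,))
tc.start()
tp.start()
tc.join()
tp.join()
print(consumed)
```

The consumer tests the buffer before waiting, so it does not matter whether the producer runs first: a notify() that arrives when nobody is waiting is simply not needed.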

2.3 local()

local() returns a local object used to store and manage thread-local data. For a given local object, a thread cannot access attributes set by other threads, and attributes set by one thread are not overwritten by attributes of the same name set by another thread. A local object can be thought of as a "thread to attribute dictionary" mapping: it uses the current thread as the key to look up that thread's attribute dictionary, and then uses the attribute name as the key to look up the value.

    import threading

    mydata = threading.local()
    mydata.number = 42
    mydata.color = 'red'
    print(mydata.__dict__)
    log = []

    def foo():
        # In this thread the mydata attribute dictionary is empty:
        # there is no number or color attribute
        items = sorted(mydata.__dict__.items())
        log.append(items)
        mydata.number = 11
        log.append(mydata.number)

    t = threading.Thread(target=foo)
    t.start()
    t.join()
    print(log)
    print(mydata.number)  # still 42

3. Queue

Although Python's locks and other synchronization primitives can be used to write traditional multi-threaded programs, there is a better approach: organize the program as a set of independent tasks whose threads communicate with each other through message queues. The Queue module (named queue in Python 3) supports this kind of inter-thread communication and lets threads share data safely.

Constructors:

Queue(): creates a FIFO queue.
LifoQueue(): creates a LIFO queue (a stack).

Instance methods:
q.put(item): puts item into the queue.
q.get(): removes an item from the queue and returns it.
q.task_done(): called by a consumer of queued data to indicate that processing of an item is finished. It should be called once for every item removed from the queue.
q.join(): blocks until all items in the queue have been processed.
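The difference between the two constructors, and the pairing of get() with task_done(), can be seen in a short single-threaded sketch. It assumes Python 3, where the module is imported as queue.

```python
import queue   # this module is called Queue in Python 2

fifo = queue.Queue()
lifo = queue.LifoQueue()
for item in (1, 2, 3):
    fifo.put(item)
    lifo.put(item)

fifo_order = [fifo.get() for _ in range(3)]   # oldest item first
lifo_order = [lifo.get() for _ in range(3)]   # newest item first

# Each get() has to be matched by a task_done() before join() can return
for _ in range(3):
    fifo.task_done()
    lifo.task_done()
fifo.join()
lifo.join()

print(fifo_order, lifo_order)
```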

Core Python Programming describes an approach that combines multi-threaded programming with Queue:

  • UserThread: reads user input, possibly from an I/O channel. The program can create several of these threads, one per user, and each places its input in the queue.
  • ReplyThread: retrieves the user input from the queue.

    import threading
    import queue  # named Queue in Python 2

    q = queue.Queue()

    class MyThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.daemon = True

        def run(self):
            while True:
                item = q.get()
                print('%s get %s' % (threading.current_thread().name, item))
                q.task_done()

    for i in range(4):
        t = MyThread()
        t.start()
    for i in range(100):
        q.put(i)
    q.join()

4. Thread termination and suspension

The following comes from the Python reference manual:

Threads have no methods for forced termination or suspension. This is by design: if a thread has acquired a lock and is forcibly terminated before releasing it, the entire application can deadlock.

You can build termination support yourself:

    import threading

    class MyThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self._terminate = False  # termination flag
            self.lock = threading.Lock()

        def terminate(self):
            self._terminate = True

        def acquire(self):
            self.lock.acquire()

        def release(self):
            self.lock.release()

        def run(self):
            while True:
                if self._terminate:  # stop the thread once the flag is set
                    break
                self.lock.acquire()
                # statements (work done while holding the lock)
                self.lock.release()
                # statements (work done without the lock)
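A runnable variant of the same idea is sketched below for Python 3. It swaps the boolean flag for a threading.Event, a primitive not covered in this article, so that the worker can also sleep on the flag; the StoppableThread class, its steps counter and the delays are hypothetical.

```python
import threading
import time

class StoppableThread(threading.Thread):
    """Hypothetical worker that checks a stop request between work steps."""
    def __init__(self):
        threading.Thread.__init__(self)
        self._stop_requested = threading.Event()
        self.steps = 0

    def terminate(self):
        self._stop_requested.set()

    def run(self):
        while not self._stop_requested.is_set():
            self.steps += 1
            # wait() sleeps for up to 0.01 s but returns early once the event is set
            self._stop_requested.wait(0.01)

t = StoppableThread()
t.start()
time.sleep(0.05)
t.terminate()
t.join()
print('stopped after %d steps' % t.steps)
```

The thread only stops at the point where it checks the flag, so it never exits while holding a lock, which is the deadlock risk described above.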

You can also use a Queue to deliver the termination signal:

    import threading
    import queue  # named Queue in Python 2

    class MyThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.queue = queue.Queue()

        def send(self, item):
            self.queue.put(item)

        def close(self):
            self.queue.put(None)  # None is the termination signal
            self.queue.join()

        def run(self):
            while True:
                item = self.queue.get()
                if item is None:
                    break
                print(item)
                self.queue.task_done()
            self.queue.task_done()  # accounts for the None item

    t = MyThread()
    t.start()
    t.send('hello')
    t.send('world')
    t.close()

  
