The Way of Python: Threads, Processes, and Coroutines

Source: Internet
Author: User
Tags: mutex, semaphore

1. Thread Lock

If access by multiple threads to the same resource is not controlled, the data can be corrupted and the results of the run become unpredictable. This is why we introduce thread locks.

Thread synchronization ensures that multiple threads access contended resources safely, and the simplest synchronization mechanism is the mutex. A mutex gives a resource one of two states: locked or unlocked. When a thread wants to change shared data, it first locks it; while the resource is locked, other threads cannot change it. Once the thread releases the resource, its state becomes unlocked and another thread can lock it. Because the mutex guarantees that only one thread writes at a time, the data stays correct in multi-threaded situations.

Before the lock is introduced:

    import threading
    import time

    num = 0


    class MyThread(threading.Thread):
        def run(self):
            global num
            num += 1
            time.sleep(0.5)
            msg = self.name + ' set num to ' + str(num)
            print(msg)


    if __name__ == '__main__':
        for i in range(5):
            t = MyThread()
            t.start()

    out:
    Thread-5 set num to 2
    Thread-3 set num to 3
    Thread-2 set num to 5
    Thread-1 set num to 5
    Thread-4 set num to 4

After the lock is introduced:

    import threading
    import time

    num = 0
    lock = threading.Lock()


    class MyThread(threading.Thread):
        def run(self):
            global num
            lock.acquire()
            num += 1
            time.sleep(0.5)
            msg = self.name + ' set num to ' + str(num)
            lock.release()
            print(msg)


    if __name__ == '__main__':
        for i in range(5):
            t = MyThread()
            # start() runs run() in a new thread; calling run() directly
            # would execute it in the main thread, one call after another
            t.start()

    out:
    Thread-1 set num to 1
    Thread-2 set num to 2
    Thread-3 set num to 3
    Thread-4 set num to 4
    Thread-5 set num to 5

    With the thread lock added

As you can see, once the thread lock is added, the final result is the same no matter how many times the program runs, which removes the thread-safety risk. The process is this: we first create a threading.Lock object named lock, and in the run method we call its acquire() method to obtain the lock. From that moment no other thread can acquire the lock; the other threads block at their own acquire() call until the thread holding the lock calls release().
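The acquire()/release() pairing described above can also be written with a with statement, which releases the lock even when the protected code raises. The following is a minimal sketch under stated assumptions: the names counter, counter_lock, add_many and run_workers are hypothetical and do not come from the original example.

```python
import threading

counter = 0                      # shared data (hypothetical name)
counter_lock = threading.Lock()  # protects counter (hypothetical name)


def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        # `with` calls acquire() on entry and release() on exit,
        # including when the body raises an exception.
        with counter_lock:
            counter += 1


def run_workers(n_threads=5, times=10_000):
    """Start n_threads workers, wait for all of them, return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_workers())
```

Because every increment happens while the lock is held, the final count equals n_threads * times on every run.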

The mechanism above is a mutex: only one thread may change the data at any moment. A semaphore (Semaphore) instead allows a fixed number of threads to change the data at the same time. For example, if a toilet has 3 stalls, at most 3 people can use it at once, and anyone else must wait for someone to come out before going in.

Semaphore:

    import threading
    import time


    def run(n):
        semaphore.acquire()
        time.sleep(1)
        print("run the thread: %s" % n)
        semaphore.release()


    if __name__ == '__main__':
        num = 0
        # Allow at most 5 threads to run at the same time
        semaphore = threading.BoundedSemaphore(5)
        # The thread count is missing from the source; 20 is an assumed value
        for i in range(20):
            t = threading.Thread(target=run, args=(i,))
            t.start()

If several threads need several resources, a deadlock can occur: thread A holds lock A (and with it object A) while thread B holds lock B (and with it object B). Thread A cannot use object B and thread B cannot use object A, so each waits for the other forever. This is a thread deadlock.
The threading module also provides the class RLock, a reentrant lock. Internally it maintains a lock and a counter. The counter records the number of acquire calls, so the same thread can acquire the resource multiple times; only when every acquire has been matched by a release can other threads obtain the resource. Because RLock.acquire can be called repeatedly within the same thread, this feature resolves some deadlock problems, namely those in which a thread blocks on a lock it already holds.
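The reentrancy described above can be seen in a short sketch. The names rlock, balance, deposit and deposit_twice are hypothetical illustrations, not part of the original article; the point is only that a function holding the RLock can call another function that acquires the same RLock.

```python
import threading

rlock = threading.RLock()
balance = 0  # hypothetical shared state


def deposit(amount):
    """Add amount to the balance while holding the lock."""
    global balance
    with rlock:
        balance += amount


def deposit_twice(amount):
    """Hold the lock, then call deposit(), which acquires it again."""
    with rlock:           # first acquire: counter goes to 1
        deposit(amount)   # nested acquire by the same thread: no blocking
        deposit(amount)


deposit_twice(10)
print(balance)

# For comparison, a plain Lock is not reentrant: a second acquire by the
# same thread fails (or blocks forever if blocking=True).
plain = threading.Lock()
plain.acquire()
second_try = plain.acquire(blocking=False)
plain.release()
print(second_try)
```

With a plain Lock in place of the RLock, deposit_twice would block forever on the nested acquire, which is the self-deadlock that RLock avoids.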

To resolve the deadlock problem:

    class _RLock:
        """This class implements reentrant lock objects.

        A reentrant lock must be released by the thread that acquired it. Once a
        thread has acquired a reentrant lock, the same thread may acquire it
        again without blocking; the thread must release it once for each time it
        has acquired it.

        """

        def __init__(self):
            self._block = _allocate_lock()
            self._owner = None
            # Initializes a counter
            self._count = 0

        def __repr__(self):
            owner = self._owner
            try:
                owner = _active[owner].name
            except KeyError:
                pass
            return "<%s %s.%s object owner=%r count=%d at %s>" % (
                "locked" if self._block.locked() else "unlocked",
                self.__class__.__module__,
                self.__class__.__qualname__,
                owner,
                self._count,
                hex(id(self))
            )

        def acquire(self, blocking=True, timeout=-1):
            """Acquire a lock, blocking or non-blocking.

            When invoked without arguments: if this thread already owns the lock,
            increment the recursion level by one, and return immediately. Otherwise,
            if another thread owns the lock, block until the lock is unlocked. Once
            the lock is unlocked (not owned by any thread), then grab ownership, set
            the recursion level to one, and return. If more than one thread is
            blocked waiting until the lock is unlocked, only one at a time will be
            able to grab ownership of the lock. There is no return value in this
            case.

            When invoked with the blocking argument set to true, do the same thing
            as when called without arguments, and return true.

            When invoked with the blocking argument set to false, do not block. If a
            call without an argument would block, return false immediately;
            otherwise, do the same thing as when called without arguments, and
            return true.

            When invoked with the floating-point timeout argument set to a positive
            value, block for at most the number of seconds specified by timeout
            and as long as the lock cannot be acquired. Return true if the lock has
            been acquired, false if the timeout has elapsed.

            """
            me = get_ident()
            if self._owner == me:
                # Every call to acquire increments the counter by 1
                self._count += 1
                return 1
            rc = self._block.acquire(blocking, timeout)
            if rc:
                self._owner = me
                self._count = 1
            return rc

        __enter__ = acquire

        def release(self):
            """Release a lock, decrementing the recursion level.

            If after the decrement it is zero, reset the lock to unlocked (not owned
            by any thread), and if any other threads are blocked waiting for the
            lock to become unlocked, allow exactly one of them to proceed. If after
            the decrement the recursion level is still nonzero, the lock remains
            locked and owned by the calling thread.

            Only call this method when the calling thread owns the lock. A
            RuntimeError is raised if this method is called when the lock is
            unlocked.

            There is no return value.

            """
            if self._owner != get_ident():
                raise RuntimeError("cannot release un-acquired lock")
            # Every call to release decrements the counter by 1
            self._count = count = self._count - 1
            if not count:
                self._owner = None
                self._block.release()

        def __exit__(self, t, v, tb):
            self.release()

        # Internal methods used by condition variables

        def _acquire_restore(self, state):
            self._block.acquire()
            self._count, self._owner = state

        def _release_save(self):
            if self._count == 0:
                raise RuntimeError("cannot release un-acquired lock")
            count = self._count
            self._count = 0
            owner = self._owner
            self._owner = None
            self._block.release()
            return (count, owner)

        def _is_owned(self):
            return self._owner == get_ident()

    RLock implementation code in Python 3

Conditions for deadlocks to occur:
    • Mutual exclusion: a thread's access to a resource is exclusive; if one thread is using a resource, other threads must wait until it is released.
    • Hold and wait: thread T1 holds at least one resource R1 and requests another resource R2 that is held by another thread T2, so T1 must wait, yet it does not release R1 while it waits.
    • No preemption: a resource that a thread has obtained cannot be taken away by other threads before the thread has finished with it; only the thread itself can release it after use.
    • Circular wait: when a deadlock occurs there must be a circular chain of processes (or threads) and resources {P0, P1, P2, ..., Pn}, in which P0 waits for a resource held by P1, P1 waits for a resource held by P2, ..., and Pn waits for a resource held by P0. (The simplest case is P0 waiting for a resource held by P1 while P1 waits for a resource held by P0, so the two wait for each other.)
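One common way to break the circular wait condition listed above is to make every thread take its locks in the same global order. The sketch below is an illustration under assumptions, not the article's own method: lock_a, lock_b, locked_pair, worker and run_workers are hypothetical names, and ordering by id() is just one possible fixed order.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()


def locked_pair(first, second):
    """Return the two locks in one fixed global order (here: by id())."""
    return sorted((first, second), key=id)


def worker(first, second, label, log, rounds):
    """Repeatedly take both locks, always in the global order."""
    for _ in range(rounds):
        low, high = locked_pair(first, second)
        with low:
            with high:
                log.append(label)


def run_workers(rounds=1000):
    """Run two threads that name the locks in opposite order."""
    log = []
    t1 = threading.Thread(target=worker, args=(lock_a, lock_b, "t1", log, rounds))
    t2 = threading.Thread(target=worker, args=(lock_b, lock_a, "t2", log, rounds))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return log


print(len(run_workers()))
```

The two threads ask for the locks in opposite orders, which is the classic deadlock setup, but because locked_pair sorts them first, neither thread can hold one lock while waiting for the other in a cycle.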

Livelock:

A livelock is the situation in which thread 1 could use a resource but politely lets other threads go first, while thread 2 could also use the resource but is equally courteous and also lets other threads go first. Each keeps yielding to the other, and in the end neither thread gets to use the resource.

2. Queues

Python has four commonly used kinds of queue: 1. FIFO queue 2. LIFO queue 3. priority queue 4. double-ended queue (deque). The first three are defined in the queue module; deque is defined in collections and is also reachable as queue.deque because the queue module imports it. They are used as follows.

1) FIFO

    import queue

    # Define the maximum queue length
    q = queue.Queue(maxsize=5)
    # q.empty() reports whether the queue is empty
    print('if q is empty: ', q.empty())
    # Put elements into the queue
    q.put(1)
    q.put(2)
    q.put(3)
    q.put(4)
    # q.full() reports whether the queue has reached its maximum length
    print('if q is full: ', q.full())
    print('queue size: ', q.qsize())
    q.put(5)
    print('if q is full: ', q.full())
    # put_nowait() never blocks: if the queue is full it raises queue.Full at once
    try:
        q.put_nowait(6)
    except queue.Full:
        print('put_nowait failed: queue is full')
    # put() blocks by default (block=True); timeout is the maximum wait in
    # seconds, and queue.Full is raised if the put still fails after that time
    try:
        q.put(7, block=True, timeout=2)
    except queue.Full:
        print('put timed out: queue is full')
    print('if q is empty: ', q.empty())
    # qsize() returns the number of elements currently in the queue
    print('queue size: ', q.qsize())
    # Get items from the queue one at a time
    print(q.get())
    print(q.get())
    print(q.get())
    print(q.get())
    print(q.get())
    # get(block, timeout): block defaults to True; timeout is the maximum wait,
    # and queue.Empty is raised if no element arrives within that time
    try:
        print(q.get(block=True, timeout=2))
    except queue.Empty:
        print('get timed out: queue is empty')
    # With block=False, get() raises queue.Empty immediately
    try:
        print(q.get(block=False))
    except queue.Empty:
        print('get failed: queue is empty')
    # get_nowait() is equivalent to get(block=False)
    # print(q.get_nowait())
    print('if q is empty: ', q.empty())

    queue.Queue() usage

2) Last in, first out

    import queue

    q = queue.LifoQueue()
    q.put(1)
    q.put(2)
    q.put(3)
    print('first get: ', q.get())
    print('second get: ', q.get())
    print('third get: ', q.get())

    out:
    first get:  3
    second get:  2
    third get:  1

    queue.LifoQueue LIFO queue

3) Priority queue

    import queue

    q1 = queue.PriorityQueue()
    # Entries are compared as whole tuples: the lowest priority number comes
    # out first, and equal priorities are ordered by the second element
    q1.put((1, 'alex1'))
    q1.put((2, 'alex2'))
    q1.put((1, 'alex3'))
    print(q1.get())
    print(q1.get())
    print(q1.get())

    out:
    (1, 'alex1')
    (1, 'alex3')
    (2, 'alex2')

    queue.PriorityQueue() priority queue
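queue.PriorityQueue compares whole entries, so two tuples with the same priority are ordered by their second element, not by the time they were inserted. If first-in, first-out order among equal priorities is wanted, one option is to add a running sequence number as a tie-breaker. This is a sketch under assumptions; pq, sequence, put_task and get_task are hypothetical helper names.

```python
import itertools
import queue

pq = queue.PriorityQueue()
sequence = itertools.count()  # 0, 1, 2, ... one number per inserted task


def put_task(priority, name):
    """Store (priority, insertion number, payload); ties fall back to insertion order."""
    pq.put((priority, next(sequence), name))


def get_task():
    """Return (priority, payload), hiding the internal sequence number."""
    priority, _, name = pq.get()
    return priority, name


put_task(1, 'zed')
put_task(2, 'bob')
put_task(1, 'amy')

results = [get_task() for _ in range(3)]
print(results)  # [(1, 'zed'), (1, 'amy'), (2, 'bob')]
```

Without the sequence number, (1, 'amy') would come out before (1, 'zed') because 'amy' sorts before 'zed'.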

4) Double-ended queue (deque)

    from collections import deque  # the same class the source reaches as queue.deque

    q2 = deque()
    q2.append(1)
    q2.append(2)          # add data on the right
    q2.append(3)
    q2.appendleft(4)      # add data on the left
    q2.extend([5, 6])     # add the items of an iterable on the right
    q2.extendleft([7])    # add the items of an iterable on the left
    print(q2)
    q2.pop()              # remove data from the right
    q2.popleft()          # remove data from the left
    print(q2)

    out:
    deque([7, 4, 1, 2, 3, 5, 6])
    deque([4, 1, 2, 3, 5])

    deque double-ended queue

9. Producer Consumer Model

One or more producers generate data of some type and place it in a buffer (which can be an array or a data structure such as a queue); a consumer takes data out of the buffer, one item at a time. Operations on the buffer must not overlap, which means that at any time only one party (producer or consumer) may access the buffer. The problem is also to make sure that the buffer neither overflows nor underflows: when the buffer is full the producer must not add more data, and when it is empty the consumer must not remove data from it. The model solves two problems: blocking, and decoupling of program components. Below is an example in which producers make steamed buns and consumers eat them:

    import queue
    import threading
    import time

    # The source omits the queue size; 20 is an assumed bound so that fast
    # producers block in put() instead of growing the queue without limit
    q = queue.Queue(20)


    def productor(arg):
        while True:
            q.put(str(arg) + ' - steamed bun')


    def consumer(arg):
        while True:
            print(arg, q.get())
            time.sleep(2)


    for i in range(3):
        t = threading.Thread(target=productor, args=(i,))
        t.start()
    # The consumer count is missing from the source; 20 is an assumed value
    for j in range(20):
        t = threading.Thread(target=consumer, args=(j,))
        t.start()

    Producer-consumer model example code

The producer's job is to produce a piece of data, put it in the buffer, and repeat. At the same time, the consumer consumes the data (removing it from the buffer), one piece at a time. The key phrase here is "at the same time": producers and consumers run concurrently, so each needs its own thread. Suppose the producer keeps putting data into the queue and the consumer keeps taking data out of it. With the two threads running at the same time, it can happen that the consumer has consumed everything while the producer is suspended (sleeping). The consumer then tries to keep consuming, but the queue is empty and an exception occurs. We treat this implementation as wrong behavior. The test code is as follows:

    from threading import Thread, Lock
    import time
    import random

    queue = []
    lock = Lock()


    class ProducerThread(Thread):
        def run(self):
            nums = range(5)  # the values 0, 1, 2, 3, 4
            global queue
            while True:
                num = random.choice(nums)  # pick a random number from 0..4
                lock.acquire()
                queue.append(num)
                print("Produced", num)
                lock.release()
                time.sleep(random.random())


    class ConsumerThread(Thread):
        def run(self):
            global queue
            while True:
                lock.acquire()
                if not queue:
                    print("Nothing in queue, but consumer will try to consume")
                    # exit() ends only this thread, and the lock is still held;
                    # without it, queue.pop(0) below raises IndexError
                    exit()
                num = queue.pop(0)
                print("Consumed", num)
                lock.release()
                time.sleep(random.random())


    ProducerThread().start()
    ConsumerThread().start()

    out:
    Produced 1
    Consumed 1
    Produced 0
    Produced 4
    Produced 1
    Consumed 0
    Consumed 4
    Produced 2
    Consumed 1
    Consumed 2
    Produced 3
    Consumed 3
    Nothing in queue, but consumer will try to consume

    Code showing the wrong behavior

So what should be the correct behavior?

When the queue is empty, the consumer should stop and wait instead of trying to consume. When the producer adds data to the queue, there should be a channel through which it can notify the consumer. The consumer can then consume from the queue again, and IndexError no longer occurs. A condition (threading.Condition) is used to solve exactly this problem.
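The wait/notify behavior described above can be sketched with threading.Condition. This is a minimal illustration under assumptions, not the original author's code: the names buffer, condition, producer, consumer and run_demo are hypothetical, and the producer sends a fixed number of items so that the program terminates.

```python
import threading

buffer = []                         # shared list used as the queue
condition = threading.Condition()   # lock plus wait/notify channel


def producer(items):
    """Append each item and notify a waiting consumer."""
    for item in items:
        with condition:
            buffer.append(item)
            condition.notify()      # wake one thread blocked in wait()


def consumer(count, consumed):
    """Take `count` items, waiting whenever the buffer is empty."""
    for _ in range(count):
        with condition:
            # wait() releases the lock while sleeping and re-acquires it on
            # wake-up; the loop re-checks the buffer after every wake-up.
            while not buffer:
                condition.wait()
            consumed.append(buffer.pop(0))


def run_demo(n_items=5):
    """Start the consumer first so that it has to wait on an empty buffer."""
    consumed = []
    c = threading.Thread(target=consumer, args=(n_items, consumed))
    p = threading.Thread(target=producer, args=(range(n_items),))
    c.start()
    p.start()
    p.join()
    c.join()
    return consumed


print(run_demo())
```

The consumer never calls pop(0) on an empty list, so no IndexError can occur, and it does not hold the lock while it waits, so the producer is free to add data.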

3. Process

Because of the global interpreter lock (GIL) in CPython, only one thread executes Python bytecode at a time, so Python multithreading is not truly parallel. To make full use of a multicore CPU, most situations in Python require multiple processes. Python provides a very useful multi-process package, multiprocessing: you only need to define a function, and Python takes care of the rest. With this package you can easily move from single-process to concurrent execution. multiprocessing supports child processes, communication and data sharing, and various forms of synchronization, and provides components such as Process, Queue, Pipe, and Lock.

1) Basic use

    from multiprocessing import Process


    def foo(arg):
        print('say hi', arg)


    if __name__ == "__main__":
        for i in range(10):
            p = Process(target=foo, args=(i,))
            # p.daemon = True  # same as threading.Thread.setDaemon for threads
            p.start()
            # p.join()

    out:
    say hi 0
    say hi 2
    say hi 4
    say hi 1
    say hi 3
    say hi 6
    say hi 8
    say hi 7
    say hi 5
    say hi 9

    Basic use of multiple processes

2) Process Lock

When multiple processes need to access a shared resource, a Lock can be used to avoid conflicting access.

As with threads, there are other kinds of process locks: RLock, Semaphore (which limits the number of simultaneous accesses to a shared resource, such as the maximum number of connections in a pool), and Event. They work the same way as for threads; only the module differs: multiprocessing.Lock(), multiprocessing.RLock(), multiprocessing.Semaphore(n), multiprocessing.Event(). For usage examples, see http://www.cnblogs.com/kaituorensheng/p/4445418.html or adapt the thread examples above.

3) Inter-process resource sharing

By default, data is not shared between processes. However, data can be shared in the following three ways: queues, Array, and Manager.dict().

- Queues

    from multiprocessing import Process
    from multiprocessing import queues
    import multiprocessing


    def foo(i, arg):
        arg.put(i)
        print('say hi', i, arg.qsize())


    if __name__ == "__main__":
        # The Queue source code calls ctx.Lock(), so a context must be passed in
        li = queues.Queue(20, ctx=multiprocessing)
        # The loop count is missing from the source; 10 matches the output below
        for i in range(10):
            p = Process(target=foo, args=(i, li,))
            p.start()

    out:
    say hi 0 1
    say hi 2 2
    say hi 1 4
    say hi 3 4
    say hi 6 7
    say hi 5 7
    say hi 4 7
    say hi 7 8
    say hi 9 9
    say hi 8 10

    multiprocessing.queues

- Array (rarely used)

Array features: 1. Its memory is contiguous, unlike a list. 2. The data type of its elements is fixed when the array is created. 3. The number of elements is fixed and must be specified at creation.

    from multiprocessing import Process
    from multiprocessing import Array


    def foo(i, arg):
        # The added constant is garbled in the source; 100 is inferred from
        # the values 101 and 103 in the output
        arg[i] = i + 100
        for item in arg:
            print(item)
        print('================')


    if __name__ == "__main__":
        li = Array('i', 5)  # create an array holding at most 5 integers
        for i in range(5):
            p = Process(target=foo, args=(i, li,))
            p.start()

    out (line breaks lost in the source):
    00================1010================1010================101103================101103================

    multiprocessing.Array

- Manager.dict (commonly used)

    from multiprocessing import Process
    from multiprocessing import Manager


    def foo(i, arg):
        # The added constant is missing from the source; 100 matches the output
        arg[i] = i + 100
        print(arg.values())


    if __name__ == "__main__":
        obj = Manager()
        li = obj.dict()
        for i in range(5):
            p = Process(target=foo, args=(i, li,))
            # p.daemon = True
            p.start()
            p.join()

    out:
    [100]
    [100, 101]
    [100, 101, 102]
    [100, 101, 102, 103]
    [100, 101, 102, 103, 104]

    Manager.dict for data sharing

4. Process Pool

The process pool is already defined in the multiprocessing module and can be used directly. It has two main methods: apply and apply_async.

- Using a process pool:

    from multiprocessing import Pool
    import time


    def f1(arg):
        print(arg, 'b')
        time.sleep(5)
        print(arg, 'a')


    if __name__ == "__main__":
        pool = Pool(5)
        # Define 30 tasks
        for i in range(30):
            # pool.apply(func=f1, args=(i,))
            pool.apply_async(func=f1, args=(i,))
        # pool.close()  # accept no more tasks; the workers exit once all tasks are done
        time.sleep(2)
        pool.terminate()  # terminate all child processes immediately
        pool.join()  # the main process waits here until all child processes have exited

    out (task numbers are not shown in the source output):
    b
    b
    b
    b
    b
    # execution was forcibly terminated halfway through
    Process finished with exit code 0

    Use of process pools

apply(self, func, args=(), kwds={}) # Calls func with the arguments args and keyword arguments kwds and blocks until the result is returned, so tasks submitted this way run one after another instead of concurrently.

apply_async(self, func, args=(), kwds={}, callback=None, error_callback=None) # A variant of apply() that returns a result object instead of blocking. If callback is specified, it must accept a single argument and is called with the result once the result is ready. If the call fails, error_callback is called instead, with the exception instance. Callbacks should complete immediately, otherwise the thread that handles results will be blocked. Unlike apply, this method does not block the caller.

pool.close() # Prevents any more tasks from being submitted to the pool; the worker processes exit once all tasks have been completed.

pool.terminate() # Stops the worker processes immediately, whether or not their tasks are complete. When the pool object is garbage collected, terminate() is called immediately.

pool.join() # Waits for the worker processes to exit. close() or terminate() must be called before join(). The parent process has to wait on (join) its terminated child processes; otherwise they become zombie processes.
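The difference between apply and apply_async, and the role of callback and error_callback, can be tried out in a small sketch. To keep the example self-contained it uses multiprocessing.pool.ThreadPool, which exposes the same methods as Pool but runs the tasks in threads; this is a stand-in chosen for illustration, not what the article's example uses. The names square, fail and run_pool_demo are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


def fail(x):
    raise ValueError("bad input: %r" % (x,))


def run_pool_demo():
    """Submit one blocking task, one async task, and one failing async task."""
    results, errors = [], []
    pool = ThreadPool(3)

    # apply() blocks until the result is available and returns it directly.
    blocking = pool.apply(square, args=(4,))

    # apply_async() returns at once; callback receives the result later.
    async_result = pool.apply_async(square, args=(5,), callback=results.append)

    # When the task raises, error_callback receives the exception instead.
    pool.apply_async(fail, args=(6,), error_callback=errors.append)

    pool.close()   # no more tasks may be submitted
    pool.join()    # wait for the workers to finish

    return blocking, async_result.get(), results, [type(e).__name__ for e in errors]


print(run_pool_demo())
```

Note that close() is called before join(), as required; swapping ThreadPool for Pool would also require the usual if __name__ == "__main__" guard.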

Reference: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

5. Coroutines

1) Concept

Thread and process operations are triggered by the program through system interfaces and are ultimately carried out by the operating system, whereas coroutine operations are controlled by the programmer.

Why coroutines exist: in a multi-threaded application, the CPU switches between threads by time slicing, and each switch takes time (saving state, resuming later). A coroutine approach uses only one thread and specifies, within that thread, the order in which code blocks execute.
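The idea of one thread in which the program itself decides when to switch can be sketched with plain Python generators, without any third-party library. This is only an illustration of the concept under assumptions; task and run_round_robin are hypothetical names, and the scheduler is a deliberately simple round-robin loop.

```python
def task(name, steps, log):
    """A cooperative task: do one step, then hand control back with yield."""
    for step in range(steps):
        log.append("%s step %d" % (name, step))
        yield  # the switch point is chosen by the code, not by the OS


def run_round_robin(tasks):
    """Run every task in turn, in a single thread, until all are finished."""
    pending = list(tasks)
    while pending:
        current = pending.pop(0)
        try:
            next(current)            # resume the task until its next yield
            pending.append(current)  # not finished: back of the line
        except StopIteration:
            pass                     # finished: drop it


log = []
run_round_robin([task("a", 2, log), task("b", 2, log)])
print(log)  # ['a step 0', 'b step 0', 'a step 1', 'b step 1']
```

No thread switch takes place here: the interleaving is fully determined by where each task yields.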

Application scenario: coroutines are a good fit when a program performs a large number of operations that do not need the CPU (I/O).
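The I/O-bound scenario above can be illustrated with the standard library's asyncio module, named here plainly as a different tool from the greenlet and gevent modules this article covers. In this sketch asyncio.sleep stands in for a real network or disk wait, and fetch and main are hypothetical names.

```python
import asyncio
import time


async def fetch(name, delay):
    """Pretend to wait on I/O for `delay` seconds, then return a result."""
    await asyncio.sleep(delay)  # yields control to other coroutines while waiting
    return name


async def main():
    start = time.perf_counter()
    # The three waits overlap, so the total is close to one delay, not three.
    results = await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, round(elapsed, 1))
```

All three coroutines run in one thread; while one is waiting, the others make progress, which is why the approach suits programs dominated by I/O.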

2) Implementation

Coroutines are mainly implemented with two modules, greenlet and gevent (which uses greenlet internally).

- gevent

    import gevent


    def fun1():
        print("www.baidu.com")        # step 1
        gevent.sleep(0)
        print("end the baidu.com")    # step 3


    def fun2():
        print("www.zhihu.com")        # step 2
        gevent.sleep(0)
        print("end the zhihu.com")    # step 4


    gevent.joinall([
        gevent.spawn(fun1),
        gevent.spawn(fun2),
    ])

- greenlet

    import greenlet


    def fun1():
        print("12")     # step 1
        gr2.switch()
        print("56")     # step 3
        gr2.switch()


    def fun2():
        print("34")     # step 2
        gr1.switch()
        print("78")     # step 4


    gr1 = greenlet.greenlet(fun1)
    gr2 = greenlet.greenlet(fun2)
    gr1.switch()
