Python initially impressed me with its advantages: easy to get started, suitable for application development, simple to program in, and backed by a rich set of third-party libraries, all of which drew me to study it in depth. Then I learned multithreaded programming and verified a claim in my own environment: since the Python interpreter introduced the GIL lock, threads no longer run in parallel in multi-CPU scenarios, and can even perform worse than serial code. A decision made at the beginning of a language's design becomes its ceiling, so for systems with demanding parallelism requirements Python may no longer be the first choice, or may be ruled out entirely. But things are not absolutely pessimistic: a large number of people are working to optimize this behavior, newer versions show some improvement over older ones, and for some core workloads we can choose other modules (such as multiprocessing) instead.
1. Python multithreaded programming
threading is the standard library module for multithreaded programming in Python. There are two common ways to create a thread: 1. call the library interface, passing a function and its arguments; 2. define a custom thread class that inherits from threading.Thread and overrides the __init__ and run methods.
1. Call the library interface, passing a function and its arguments
```python
import threading
import queue
import time

"""
Implemented function: define a FIFO queue with 10 elements
and have 3 threads fetch from it simultaneously.
"""

# Initialize a FIFO queue
q = queue.Queue()
for i in range(10):
    q.put(i)
print("%s: init queue, size: %d" % (time.ctime(), q.qsize()))

# Thread function: fetch data from the queue
def run(q, threadid):
    is_empty = False
    while not is_empty:
        if not q.empty():
            data = q.get()
            print("Thread %d get: %d" % (threadid, data))
            time.sleep(1)
        else:
            is_empty = True

# Define a list of thread handles
thread_handler_lists = []

# Initialize the threads
for i in range(3):
    thread = threading.Thread(target=run, args=(q, i))
    thread.start()
    thread_handler_lists.append(thread)

# Wait for the threads to finish executing
for thread_handler in thread_handler_lists:
    thread_handler.join()
print("%s: end of progress" % (time.ctime()))
```
2. Define a custom thread class that inherits from threading.Thread and overrides the __init__ and run methods
As in other languages, to keep data consistent across multiple threads, the threading library comes with a lock facility, involving 3 interfaces:
thread_lock = threading.Lock()  # create a lock object
thread_lock.acquire()  # acquire the lock
thread_lock.release()  # release the lock
Note: because the Python queue module is already thread-safe, real code does not need these lock operations; they appear in the demo below only for illustration.
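To show what the three lock interfaces do on their own, here is a minimal sketch (independent of the queue demo) that protects a shared counter. The names `counter` and `add_many` are illustrative; without the acquire/release pair, concurrent `+=` updates on a shared variable could in principle interleave and lose increments:

```python
import threading

counter = 0
thread_lock = threading.Lock()  # create a lock object

def add_many(n):
    global counter
    for _ in range(n):
        thread_lock.acquire()   # acquire the lock
        counter += 1            # critical section: one thread at a time
        thread_lock.release()   # release the lock

threads = [threading.Thread(target=add_many, args=(100000,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 300000: no increments were lost
```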
```python
import threading
import queue
import time

"""
Implemented function: define a FIFO queue with 10 elements and have
3 threads fetch from it simultaneously. queue.Queue is thread-safe,
so thread_lock.acquire() / thread_lock.release() are not actually needed.
"""

# Custom thread class: inherit threading.Thread, override __init__ and run
class MyThread(threading.Thread):
    def __init__(self, threadid, name, q):
        threading.Thread.__init__(self)
        self.threadid = threadid
        self.name = name
        self.q = q
        print("%s: init %s success." % (time.ctime(), self.name))

    def run(self):
        is_empty = False
        while not is_empty:
            thread_lock.acquire()
            if not self.q.empty():
                data = self.q.get()
                print("Thread %d get: %d" % (self.threadid, data))
                time.sleep(1)
                thread_lock.release()
            else:
                is_empty = True
                thread_lock.release()

# Define a lock
thread_lock = threading.Lock()

# Define a FIFO queue
q = queue.Queue()

# Define the thread names and a list of thread handles
thread_name_list = ["Thread-1", "Thread-2", "Thread-3"]
thread_handler_lists = []

# Initialize the queue
thread_lock.acquire()
for i in range(10):
    q.put(i)
thread_lock.release()
print("%s: init queue, size: %d" % (time.ctime(), q.qsize()))

# Initialize the threads
thread_id = 1
for thread_name in thread_name_list:
    thread = MyThread(thread_id, thread_name, q)
    thread.start()
    thread_handler_lists.append(thread)
    thread_id += 1

# Wait for the threads to finish executing
for thread_handler in thread_handler_lists:
    thread_handler.join()
print("%s: end of progress" % (time.ctime()))
```
2. Python Multithreading mechanism analysis
Before the discussion, let's start with a few concepts:
Parallelism and concurrency
The key to concurrency is the ability to handle multiple tasks, though not necessarily at the same instant. The key to parallelism is the ability to handle multiple tasks at literally the same time. The critical difference is that "at the same time"; put another way, parallelism is a subset of concurrency.
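A quick way to feel this difference in Python: threads still provide useful concurrency for I/O-bound work (CPython releases the GIL while a thread blocks, e.g. in time.sleep or a socket read), even though they do not provide parallelism for CPU-bound work. A minimal sketch, where `io_task` is an illustrative stand-in for blocking I/O:

```python
import threading
import time

def io_task():
    time.sleep(1)  # stands in for blocking I/O; the GIL is released here

start = time.time()
threads = [threading.Thread(target=io_task) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# Three 1-second waits overlap: total is about 1 s, not 3 s
print("3 threads waiting 1s each took about %.1f s" % elapsed)
```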
GIL
GIL: the Global Interpreter Lock, a lock at the level of the Python interpreter. It exists to keep the interpreter itself working properly; for example, Python's automatic garbage collection mechanism runs while our program is running, and the GIL keeps the two from corrupting shared interpreter state.
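As a side note, CPython 3 periodically asks the running thread to release the GIL, and exposes that interval through the sys module (this is a CPython implementation detail, not part of the language specification):

```python
import sys

# How long a thread may hold the GIL before the interpreter asks it
# to release (defaults to 0.005 s in CPython 3)
print(sys.getswitchinterval())

# The interval can be tuned, e.g. to reduce switching overhead
# on CPU-bound workloads
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())
```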
Let's try to simulate the execution of 3 threads in one process:
1. Threads T1, T2, and T3 are all in the ready state, waiting to acquire the GIL from the Python interpreter.
2. Suppose T1 acquires the GIL; it is assigned by the OS to some CPU and enters the running state.
3. Based on some scheduling policy (for example, after a certain number of bytecode instructions), the interpreter makes T1 release the GIL and return to the ready state.
4. Repeat step 1. Suppose T2 now acquires the GIL; it runs as above, assigned to some CPU in the running state, until the interpreter makes T2 release the GIL and return to the ready state.
5. In the end, T1, T2, and T3 run serially, cycling through steps 1–4.
Therefore, although T1, T2, and T3 are three threads that could in theory run in parallel, in practice the GIL in the Python interpreter means that in multi-CPU scenarios they no longer run in parallel, and performance can even be worse than serial execution. Let's do a test:
We write two computationally heavy functions to compare the time overhead of single-threaded and multithreaded execution, with the following code:
```python
import threading
import time

# Define two computationally heavy functions (the results are unused;
# the loops exist only to burn CPU time)
def sum_func():
    s = 0
    for i in range(100000000):
        s += i

def mul_func():
    s = 0
    for i in range(10000000):
        s *= i

# Single-threaded timing test
starttime = time.time()
sum_func()
mul_func()
endtime = time.time()
period = endtime - starttime
print("The single thread cost: %d" % period)

# Multithreaded timing test: run the same two functions, one per thread
starttime = time.time()
l = []
t1 = threading.Thread(target=sum_func)
t2 = threading.Thread(target=mul_func)
l.append(t1)
l.append(t2)
for t in l:
    t.start()
for t in l:
    t.join()
endtime = time.time()
period = endtime - starttime
print("The multiple thread cost: %d" % period)
print("End of program.")
```
The test shows that the time overhead of the multithreaded version is actually larger than the single-threaded one:
This result is a bit hard to accept. Is there any way to optimize? Yes, to an extent: for example, turning the threads into multiple processes. But given the overhead of processes, real code cannot spawn too many of them; combining processes with coroutines may also yield some performance improvement. We will not analyze this further here.
If you are interested, you can continue with the following blog post: http://cenalulu.github.io/python/gil-in-python/