This article gives a detailed introduction to multithreading in Python, with code examples. It should be a useful reference for anyone who needs it.
It records problems encountered while learning Python along with some common usage. Note that the Python version of the development environment here is 2.7.
First, Python file naming
When naming Python files, be careful not to clash with the name of a standard module, or you will get an error.
For example, while learning about threads I named the file threading.py. The script itself looked completely normal, yet running it reported the following error: AttributeError: 'module' object has no attribute 'xxx'.
threading.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_test.py
@time: 18/8/25 09:14
"""
import threading

# Get the number of active threads
print(threading.active_count())
Run:
➜  baselearn python threading/threading.py
Traceback (most recent call last):
  File "threading/threading.py", line 9, in <module>
    import threading
  File "/users/kaiyiwang/code/python/baselearn/threading/threading.py", line <module>
    print(threading.active_count())
AttributeError: 'module' object has no attribute 'active_count'
➜  baselearn
Locating the problem:
Check the source file of the imported library. The source file exists and contains no errors, and a .pyc file for it exists as well.
Solution:
1. When naming a .py script, do not use a Python reserved word, a module name, or similar.
2. Delete the library's .pyc file (a .pyc file is generated when the script runs; if a .pyc file already exists and the code has not changed, the runtime keeps loading the .pyc, so it must be deleted), then rerun the code. Alternatively, find an environment where the code runs and copy its .pyc file over the one on the current machine.
Rename the script to threading_test.py and run it again; it now runs without an error.
➜  baselearn python threading/threading_test.py
1
➜  baselearn
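A quick way to confirm this kind of name clash is to ask Python which file a module was actually loaded from. The following is a minimal sketch, not part of the original script; the helper name module_origin is hypothetical. If the printed path points into your own project directory instead of the standard library, a local file is shadowing the module.

```python
import os
import threading


def module_origin(module):
    """Return the absolute path of the file a module was loaded from.

    module_origin is a hypothetical helper used only for illustration.
    """
    # Built-in modules have no __file__, so fall back to a placeholder.
    path = getattr(module, "__file__", None)
    return os.path.abspath(path) if path else "<built-in>"


origin = module_origin(threading)
print(origin)
```

If the path ends in your own threading.py (or threading.pyc), rename the script and remove the leftover .pyc.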
Second, multithreading with threading
Multithreading is an effective way to speed up a program, and Python's multithreading module, threading, is quick and easy to get started with. This section shows how to use it.
1. Adding threads
threading_test.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_test.py
@time: 18/8/25 09:14
"""
import threading

# Get the number of active threads
# print(threading.active_count())

# List all threads
# print(threading.enumerate())

# Show the thread that is currently running
# print(threading.current_thread())


def thread_job():
    print('This is a thread of %s' % threading.current_thread())


def main():
    thread = threading.Thread(target=thread_job,)  # define the thread
    thread.start()  # let the thread start working


if __name__ == '__main__':
    main()
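The script above starts a thread whose target takes no arguments. As a small illustrative sketch (the function greet, the thread name worker-1, and the results list are assumptions made for this example), arguments can be passed through args and the thread can be given a name:

```python
import threading

results = []


def greet(name, times):
    """Record one greeting per iteration, tagged with the worker thread's name."""
    for _ in range(times):
        results.append("%s says hello to %s" % (threading.current_thread().name, name))


# target is the function itself (no parentheses); args is a tuple of its arguments.
worker = threading.Thread(target=greet, args=("reader", 2), name="worker-1")
worker.start()
worker.join()  # wait for the thread so that results is complete before reading it

print(results)
```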
2. Join function
The result without join()
We make thread T1's job more time-consuming:
threading_join.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""
import threading
import time


def thread_job():
    print('T1 start\n')
    for i in range(10):
        time.sleep(0.1)  # each step of the task takes 0.1 s
    print("T1 finish\n")


def main():
    added_thread = threading.Thread(target=thread_job, name='T1')  # define the thread
    added_thread.start()  # let the thread start working
    print("all done\n")


if __name__ == '__main__':
    main()
If everything ran in sequence, the expected output would be:
T1 start
T1 finish
all done
But the actual result is:
➜  baselearn python threading/threading_join.py
T1 start
all done
T1 finish
➜  baselearn
The result with join()
"all done" was printed before the thread's task had finished. To enforce the order, call join() on the thread after starting it:
added_thread.start()
added_thread.join()
print("all done\n")
Output:
➜  baselearn python threading/threading_join.py
T1 start
T1 finish
all done
Full script file:
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""
import threading
import time


def thread_job():
    print('T1 start\n')
    for i in range(10):
        time.sleep(0.1)  # each step of the task takes 0.1 s
    print("T1 finish\n")


def main():
    added_thread = threading.Thread(target=thread_job, name='T1')  # define the thread
    added_thread.start()  # let the thread start working
    added_thread.join()   # wait for T1 to finish
    print("all done\n")


if __name__ == '__main__':
    main()
A quick exercise
If we add two threads, what does the output look like?
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""
import threading
import time


def T1_job():
    print('T1 start\n')
    for i in range(10):
        time.sleep(0.1)  # each step of the task takes 0.1 s
    print("T1 finish\n")


def T2_job():
    print("T2 start\n")
    print("T2 finish\n")


def main():
    thread_1 = threading.Thread(target=T1_job, name='T1')  # define thread T1
    thread_2 = threading.Thread(target=T2_job, name='T2')  # define thread T2
    thread_1.start()  # start T1
    thread_2.start()  # start T2
    print("all done\n")


if __name__ == '__main__':
    main()
"One possible" output is:
T1 start
T2 start
T2 finish
all done
T1 finish
Neither T1 nor T2 is joined here. It is only "one possible" output because where "all done" appears depends entirely on how fast the two threads run, and it is entirely possible for "T2 finish" to appear after "all done". This unpredictable ordering is unacceptable, so we use join to control it.
Let's try adding thread_1.join() after starting T1 and before starting T2:
thread_1.start()
thread_1.join()  # notice the difference!
thread_2.start()
print("all done\n")
Output:
T1 start
T1 finish
T2 start
all done
T2 finish
As you can see, T2 waits for T1 to finish before it starts to run.
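Joining T1 before starting T2 makes the two threads run one after the other. A common alternative, sketched here under the assumption that the threads may run at the same time and only "all done" has to come last, is to start every thread first and join them all afterwards. The job function, the delays, and the events list are illustrative only.

```python
import threading
import time

events = []  # list.append is atomic in CPython, so the threads can share this list


def job(name, delay):
    """Record start and finish markers around a simulated task."""
    events.append("%s start" % name)
    time.sleep(delay)
    events.append("%s finish" % name)


threads = [
    threading.Thread(target=job, args=("T1", 0.2), name="T1"),
    threading.Thread(target=job, args=("T2", 0.05), name="T2"),
]

for t in threads:
    t.start()  # both threads are now running concurrently
for t in threads:
    t.join()   # block until each thread has finished

events.append("all done")
print(events)
```

The relative order of the T1 and T2 entries can still vary between runs, but "all done" is always the last entry.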
3. Storing results in a Queue
What the code does
The code passes the data in a list to four threads, stores each result in a queue, and reads the stored results back from the queue after the threads have finished.
Inside the multithreading function, define a Queue to hold the return values in place of return, define a list of threads, and initialize a multidimensional data list to process:
threading_queue.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_queue.py
@time: 18/8/25 09:14
"""
import threading
import time
from queue import Queue


def job(l, q):
    for i in range(len(l)):
        l[i] = l[i] ** 2
    q.put(l)  # a function run in a thread cannot hand back a value with return


def multithreading():
    q = Queue()  # q holds the return values in place of return
    threads = []
    data = [[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]
    for i in range(4):  # define four threads
        # Thread starts with a capital letter; job is passed without
        # parentheses (as a reference), and its arguments go in args
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())  # q.get() takes one value at a time from q, in order
    print(results)


if __name__ == '__main__':
    multithreading()
Running the script above produced an error:
➜  baselearn python threading/threading_queue.py
Traceback (most recent call last):
  File "threading/threading_queue.py", line 11, in <module>
    from queue import Queue
ImportError: No module named queue
The cause is the Python version.
Fix for "No module named queue":
On Python 2 the module is named Queue; on Python 3 it was renamed to queue to follow the PEP 8 guidelines (all lowercase for module names). The class remains Queue on all versions (also following PEP 8).
Typically, the way you'd write version-portable imports would be:
try:
    import queue
except ImportError:
    import Queue as queue
On Python 2 we can simply import it like this:
from Queue import Queue
Output:
➜  baselearn python ./threading/threading_queue.py
[[1, 4, 9], [9, 16, 25], [16, 16, 16], [25, 25, 25]]
Full code:
threading_queue.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_queue.py
@time: 18/8/25 09:14
"""
import threading
# import time
from Queue import Queue


def job(l, q):
    for i in range(len(l)):
        l[i] = l[i] ** 2
    q.put(l)  # a function run in a thread cannot hand back a value with return


def multithreading():
    q = Queue()  # q holds the return values in place of return
    threads = []
    data = [[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]
    for i in range(4):  # define four threads
        # Thread starts with a capital letter; job is passed without
        # parentheses (as a reference), and its arguments go in args
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())  # q.get() takes one value at a time from q, in order
    print(results)


if __name__ == '__main__':
    multithreading()
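The queue hands results back in the order the threads put them in, which is not guaranteed to match the order of the input data. The sketch below is one possible variation, not the article's own code: each thread puts an (index, result) pair on the queue so the input order can be restored afterwards. The function name square_all is hypothetical, and the portable import lets the example run on both Python 2 and Python 3.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def square_all(index, values, q):
    """Square every value and put an (index, result) pair on the queue."""
    q.put((index, [v ** 2 for v in values]))


data = [[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]
q = queue.Queue()
threads = [
    threading.Thread(target=square_all, args=(i, chunk, q))
    for i, chunk in enumerate(data)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Results may have been queued in any order; the index restores the input order.
results = [None] * len(data)
for _ in range(len(data)):
    index, squared = q.get()
    results[index] = squared

print(results)  # [[1, 4, 9], [9, 16, 25], [16, 16, 16], [25, 25, 25]]
```

Unlike the job function above, this version builds a new list instead of modifying the input list in place.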
4. GIL efficiency problem
What is the GIL?
This time, let's see why multithreading with Python's threading module is sometimes not ideal. The main reason is a built-in part of Python's design: the Global Interpreter Lock (GIL). It allows only one thread to execute Python code at a time.
An explanation of the GIL:
Although Python fully supports multithreaded programming, the C implementation of the interpreter is not thread-safe under fully parallel execution. In fact, the interpreter is protected by a global interpreter lock, which ensures that only one Python thread executes at any time. The biggest problem with the GIL is that Python's multithreaded programs cannot take advantage of multicore CPUs (for example, a computationally intensive program that uses multiple threads still runs on only a single CPU). One thing to emphasize is that the GIL only affects programs that rely heavily on the CPU (for example, computational ones). If most of your program only involves I/O, such as network interaction, then multithreading is appropriate, because the threads spend most of their time waiting. In fact, you can safely create thousands of Python threads; a modern operating system runs that many threads without any strain, so there is nothing to worry about.
Testing the GIL
We create a job and run it both with threading and in the normal (single-threaded) way, and we create a list holding the data to process. In normal we make the list 4 times as long; in multithreading we create 4 threads. Then we compare the elapsed time.
threading_gil.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_gil.py
@time: 18/8/25 09:14
"""
import threading
from Queue import Queue
import copy
import time


def job(l, q):
    res = sum(l)
    q.put(res)  # a function run in a thread cannot hand back a value with return


def multithreading(l):
    q = Queue()  # q holds the return values in place of return
    threads = []
    for i in range(4):  # define four threads
        # Thread starts with a capital letter; job is passed without
        # parentheses (as a reference), and its arguments go in args
        t = threading.Thread(target=job, args=(copy.copy(l), q), name="T%i" % i)
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for t in threads:
        t.join()
    total = 0
    for _ in range(4):
        total += q.get()  # q.get() takes one value at a time from q, in order
    print(total)


def normal(l):
    total = sum(l)
    print(total)


if __name__ == '__main__':
    l = list(range(1000000))
    s_t = time.time()
    normal(l * 4)
    print('normal: ', time.time() - s_t)
    s_t = time.time()
    multithreading(l)
    print('multithreading: ', time.time() - s_t)
If the whole program runs successfully, the output looks roughly like the following. Both results are correct, so the threading and normal versions performed the same amount of computation. Yet the threaded version is not much faster. With 4 threads one might expect it to be 3 to 4 times faster, but it is not. This is the GIL at work.
1999998000000
normal: 0.10034608840942383
1999998000000
multithreading: 0.08421492576599121
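The test above is CPU-bound, which is exactly where the GIL hurts. For contrast, here is a minimal sketch of an I/O-bound workload, where threads do help because a waiting thread releases the GIL. The function fake_io is a stand-in invented for this example; time.sleep simulates a blocking call such as a network request, and the delay of 0.2 s is arbitrary.

```python
import threading
import time


def fake_io(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)


def run_sequential(n, delay):
    """Run n simulated I/O calls one after another and return the elapsed time."""
    start = time.time()
    for _ in range(n):
        fake_io(delay)
    return time.time() - start


def run_threaded(n, delay):
    """Run n simulated I/O calls in n threads and return the elapsed time."""
    start = time.time()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print("sequential: %.2fs" % sequential)
print("threaded:   %.2fs" % threaded)
```

Here the threaded run should take roughly as long as one call, while the sequential run takes about four times as long; exact timings depend on the machine.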
5. Thread lock: Lock
Without Lock
threading_lock.py
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_lock.py
@time: 18/8/25 09:14
"""
import threading


# Add 1 to the global variable A each time, loop 10 times, and print
def job1():
    global A
    for i in range(10):
        A += 1
        print('job1', A)


# Add 10 to the global variable A each time, loop 10 times, and print
def job2():
    global A
    for i in range(10):
        A += 10
        print('job2', A)


# Define two threads that run function one and function two
if __name__ == '__main__':
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
Output:
➜  baselearn python ./threading/threading_lock.py
('job1', 1)
('job2', 11)
...
('job2', 42)
...
('job2', 102)
('job1', 103)
('job1', 104)
('job1', 105)
('job1', 106)
('job1', 107)
('job1', 108)
('job1', 109)
('job1', 110)
As you can see, the output of the two threads is interleaved and very confusing.
With Lock
Lock is used when different threads work with shared memory, to make sure the threads do not interfere with each other. Before a thread performs an operation that modifies the shared memory, it calls lock.acquire() to lock it, ensuring that while the current thread runs, the memory is not accessed by other threads. After the operation is completed, it calls lock.release() to open the lock so that the shared memory is available to other threads again.
Add the lock to function one and function two:
def job1():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 1
        print('job1', A)
    lock.release()


def job2():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 10
        print('job2', A)
    lock.release()
The main block defines a Lock:
if __name__ == '__main__':
    lock = threading.Lock()
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
Full code:
# -*- coding: utf-8 -*-
"""
@author: Corwien
@file: threading_lock.py
@time: 18/8/25 09:14
"""
import threading


def job1():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 1
        print('job1', A)
    lock.release()


def job2():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 10
        print('job2', A)
    lock.release()


if __name__ == '__main__':
    lock = threading.Lock()
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
Output:
➜  baselearn python ./threading/threading_lock.py
('job1', 1)
('job1', 2)
('job1', 3)
('job1', 4)
('job1', 5)
('job1', 6)
('job1', 7)
('job1', 8)
('job1', 9)
('job1', 10)
('job2', 20)
('job2', 30)
('job2', 40)
('job2', 50)
('job2', 60)
('job2', 70)
('job2', 80)
('job2', 90)
('job2', 100)
('job2', 110)
The output shows that with the lock, one thread runs to completion before the other starts. The printed output with Lock differs from the output without it.
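A Lock can also be used as a context manager, which releases it automatically even if the code inside raises an exception. The sketch below is an illustrative variation and not the article's script: the function add and the variable counter are assumptions, and the lock is held only around each single update instead of around the whole loop, so the two threads may take turns while the final total stays correct.

```python
import threading

counter = 0
lock = threading.Lock()


def add(amount, times):
    """Add `amount` to the shared counter `times` times, one locked update at a time."""
    global counter
    for _ in range(times):
        with lock:  # acquired here, released automatically when the block exits
            counter += amount


t1 = threading.Thread(target=add, args=(1, 10))
t2 = threading.Thread(target=add, args=(10, 10))
t1.start()
t2.start()
t1.join()
t2.join()

print(counter)  # 110
```

Holding the lock for the whole loop, as in the script above, keeps each thread's output together; locking each update separately allows more interleaving but protects the shared value just the same.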