Sharing variables between processes in Python multiprocessing
This article explains how to share variables between processes in Python multiprocessing programs, an important piece of advanced Python knowledge.
1. Problem:
Someone in a discussion group posted the following code and asked: why does the list print out empty at the end?
from multiprocessing import Process, Manager
import os

manager = Manager()
vip_list = []
# vip_list = manager.list()

def testFunc(cc):
    vip_list.append(cc)
    print 'process id:', os.getpid()

if __name__ == '__main__':
    procs = []

    for ll in range(10):
        t = Process(target=testFunc, args=(ll,))
        t.daemon = True
        procs.append(t)

    for i in range(len(procs)):
        procs[i].start()

    for j in range(len(procs)):
        procs[j].join()

    print "------------------------"
    print 'process id:', os.getpid()
    print vip_list
If you understand Python's threading model and the GIL, and how multi-threading differs from multi-processing, the answer is not hard: each child process gets its own copy of vip_list, so the appends happen in the children and never reach the parent's list. Even if you don't, just run the code above and the problem becomes clear.
$ python aa.py
process id: 632
process id: 635
process id: 637
process id: 633
process id: 636
process id: 634
process id: 639
process id: 638
process id: 641
process id: 640
------------------------
process id: 619
[]
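To make the copy-per-process behaviour visible, here is a minimal sketch of my own (not part of the original post; it assumes a fork-based start method, as on Linux) that prints the list inside each child:

from multiprocessing import Process
import os

vip_list = []                     # a plain Python list; each child works on its own copy

def worker(cc):
    vip_list.append(cc)
    # every child sees only its own one-element copy
    print 'child %d sees: %s' % (os.getpid(), vip_list)

if __name__ == '__main__':
    procs = [Process(target=worker, args=(i,)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # the parent's list was never touched by the children
    print 'parent %d sees: %s' % (os.getpid(), vip_list)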
Uncomment the line vip_list = manager.list() and you will see the following result:
process id: 32074
process id: 32073
process id: 32072
process id: 32078
process id: 32076
process id: 32071
process id: 32077
process id: 32079
process id: 32075
process id: 32080
------------------------
process id: 32066
[3, 2, 1, 7, 5, 0, 6, 8, 4, 9]
2. How to share variables between Python processes:
(1) Shared memory:
Data can be stored in a shared memory map using Value or Array, as in the following example from the Python documentation:
http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)          # 'd' = double-precision float
    arr = Array('i', range(10))    # 'i' = signed int

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print num.value
    print arr[:]
Result:
3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
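A related detail (my own note; see also the last link in the references below): a Value or Array created this way carries its own lock, available via get_lock(), which matters as soon as several processes do read-modify-write updates on it:

from multiprocessing import Process, Value

def add_one(counter):
    # get_lock() returns the lock that Value('i', 0) was created with (lock=True by default)
    with counter.get_lock():
        counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)
    procs = [Process(target=add_one, args=(counter,)) for _ in range(50)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print counter.value   # 50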
(2) Server process:
A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.
For the code, see the example at the beginning.
http://docs.python.org/2/library/multiprocessing.html#managers
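As a quick illustration of another proxy type, here is a minimal sketch of my own (not from the original post) that uses a managed dict the same way the managed list was used above:

from multiprocessing import Process, Manager
import os

def worker(d, key):
    d[key] = os.getpid()          # the write goes through a proxy to the manager's server process

if __name__ == '__main__':
    manager = Manager()
    shared_dict = manager.dict()

    procs = [Process(target=worker, args=(shared_dict, i)) for i in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print dict(shared_dict)       # e.g. {0: 12345, 1: 12346, 2: 12347, 3: 12348, 4: 12349}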
3. A multi-process pitfall: data synchronization
Look at this simple counter:
from multiprocessing import Process, Manager
import os

manager = Manager()
sum = manager.Value('tmp', 0)

def testFunc(cc):
    sum.value += cc

if __name__ == '__main__':
    procs = []

    for ll in range(1, 100):
        t = Process(target=testFunc, args=(1,))
        t.daemon = True
        procs.append(t)

    for i in range(len(procs)):
        procs[i].start()

    for j in range(len(procs)):
        procs[j].join()

    print "------------------------"
    print 'process id:', os.getpid()
    print sum.value
Result:
------------------------
process id: 17378
97
You might ask: WTF? The answer is that sum.value += cc is not atomic: each process reads the current value, adds to it, and writes it back, so concurrent updates can overwrite each other and increments get lost. This is the same race condition known from the multi-threading world, and the fix is the same here: a Lock.
from multiprocessing import Process, Manager, Lock
import os

lock = Lock()
manager = Manager()
sum = manager.Value('tmp', 0)

def testFunc(cc, lock):
    with lock:
        sum.value += cc

if __name__ == '__main__':
    procs = []

    for ll in range(1, 100):
        t = Process(target=testFunc, args=(1, lock))
        t.daemon = True
        procs.append(t)

    for i in range(len(procs)):
        procs[i].start()

    for j in range(len(procs)):
        procs[j].join()

    print "------------------------"
    print 'process id:', os.getpid()
    print sum.value
What does this version print? Run it yourself, or try increasing the number of processes.
4. Final suggestions:
Sharing data between processes is usually not the best choice, precisely because of all these synchronization issues; an approach based on actors exchanging messages is generally considered better. The Python documentation says the same: "As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes. However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so."
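For comparison, here is a minimal sketch of my own (not from the original post) of the message-passing style the documentation recommends: the same counter built with multiprocessing.Queue, with no shared state and no explicit lock:

from multiprocessing import Process, Queue

def worker(q, cc):
    q.put(cc)                     # send the increment as a message instead of mutating shared state

if __name__ == '__main__':
    q = Queue()
    procs = [Process(target=worker, args=(q, 1)) for _ in range(99)]
    for p in procs:
        p.start()

    # drain one message per worker before joining, then sum them in the parent
    total = sum(q.get() for _ in procs)
    for p in procs:
        p.join()

    print total                   # 99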
5. References:
http://stackoverflow.com/questions/14124588/python-multiprocessing-shared-memory
http://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing/
http://docs.python.org/2/library/multiprocessing.html#multiprocessing.sharedctypes.synchronized