Python 3.6 multi-threading and multi-processing usage scenarios


Multi-threading and multi-process usage scenarios

IO operations do not consume CPU (reading data from the hard disk, from the network, or from memory all count as IO).
Computation consumes CPU (e.g. a pure compute task such as heavy math).

Threads in Python are effectively pseudo-threads: because of the GIL only one thread executes Python bytecode at a time, and switching between threads is itself expensive, since each switch must save and restore the thread's context.

Python multi-threading is suitable for IO-intensive tasks (such as a socket server handling many concurrent connections).
Python multi-threading is not suitable for CPU-intensive tasks, i.e. tasks dominated by computation such as heavy math.
So if there are CPU-intensive tasks to do, use multiple processes rather than multiple threads.
If the CPU has 8 cores, each core can run one process, and each process can do its computation with a single thread.
Processes do not need the GIL between them, because each process is independent and shares no data.
You can start more processes than that, but an 8-core CPU can only execute 8 of them at any one time.
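
To make the threads-vs-processes point concrete, here is a minimal sketch (an addition, not from the original article) that runs the same CPU-bound countdown first with threads and then with a process pool; on a multi-core machine the pool version should finish several times faster, because the GIL serializes the threaded version. Timings will vary by machine.

import os
import time
from multiprocessing import Pool
from threading import Thread

def count(n):
    # pure CPU work: no IO, so the GIL prevents threads from running it in parallel
    while n > 0:
        n -= 1

if __name__ == '__main__':
    jobs = [10_000_000] * 4

    start = time.time()
    threads = [Thread(target=count, args=(n,)) for n in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print('threads:', time.time() - start)

    start = time.time()
    with Pool(os.cpu_count()) as pool:   # one worker process per core
        pool.map(count, jobs)
    print('processes:', time.time() - start)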

Multi-process
Testing multi-processing:

import multiprocessing
import time

def run(name):
    time.sleep(2)
    print('hello', name)

if __name__ == '__main__':
    for i in range(10):   # start 10 processes
        p = multiprocessing.Process(target=run, args=('bob%s' % i,))
        p.start()

Execution result:

hello bob1
hello bob0
hello bob2
hello bob3
hello bob5
hello bob4
hello bob6
hello bob7
hello bob8
hello bob9

## Everything finishes in about 2 seconds: the CPU can handle as many processes at once as it has cores (of course your machine is also running many other applications, but the CPU computes quickly).
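
One note on the example above: start() does not wait for the children. If the parent should continue only after every child has finished, keep a reference to each Process and join() it, as in this small variation (an addition, not from the original article):

import multiprocessing
import time

def run(name):
    time.sleep(2)
    print('hello', name)

if __name__ == '__main__':
    procs = []
    for i in range(10):
        p = multiprocessing.Process(target=run, args=('bob%s' % i,))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()   # block until every child has finished
    print('all children done')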
You can also start a thread inside each process:

import multiprocessing
import time, threading

def thread_run():
    print(threading.get_ident())   # get_ident returns the current thread's id

def run(name):
    time.sleep(2)
    print('hello', name)
    t = threading.Thread(target=thread_run,)   # start one thread inside each process
    t.start()

if __name__ == '__main__':
    for i in range(10):   # start 10 processes
        p = multiprocessing.Process(target=run, args=('bob%s' % i,))
        p.start()

Execution result:

hello bob0
16684
hello bob1
15052
hello bob2
15260
hello bob3
6192
hello bob4
6748
hello bob7
13980
hello bob5
6628
hello bob6
3904
hello bob9
2328
hello bob8
17072
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())   # get the parent process's id
    print('process id:', os.getpid())        # get this process's own id
    print("\n")

def f(name):
    info('\033[31;1mfunction f\033[0m')
    print('hello', name)

if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    ## call the function directly; the child process is commented out for now
    # p = Process(target=f, args=('bob',))
    # p.start()
    # p.join()

Execution result:

main process line
module name: __main__
parent process: 1136    # the parent process's id; here the parent process is PyCharm
process id: 16724       # this child is the Python program itself

## Every process has a parent process.
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())   # get the parent process's id
    print('process id:', os.getpid())        # get this process's own id
    print("\n\n")

def f(name):
    info('\033[31;1mcalled from child process function f\033[0m')
    print('hello', name)

if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=f, args=('bob',))   # set up the child process
    p.start()                              # start the child process
    # p.join()

Execution result:

main process line
module name: __main__
parent process: 1136    # the main process's parent: PyCharm
process id: 14684       # the main process running this Python code (a child of 1136)

called from child process function f
module name: __mp_main__
parent process: 14684   # the Python main process
process id: 15884       # the child process started by the main process (14684)
hello bob

## Every process has a main (parent) process.
Inter-process communication

By default, processes do not share data. If processes need to exchange data, this can be done through a queue. The queue is used the same way as the queue in threading, but a thread queue can only be shared between threads inside one process.

First, a thread queue:

import queue
import threading

def f():
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = queue.Queue()
    p = threading.Thread(target=f,)
    p.start()
    print(q.get())
    p.join()

Execution result:

[42, None, 'hello']

## The data is put in by a child thread and then taken out in the main thread, showing that data can be shared between threads.
Now try the same thread queue across processes:

import queue
from multiprocessing import Process

def f():
    q.put([42, None, 'hello'])   # this q belongs to the main process

if __name__ == '__main__':
    q = queue.Queue()    # the queue created by the main process
    p = Process(target=f,)
    ## The child process is defined in the main process, but once started, the memory of the
    ## main process and the child process is independent.
    ## Because the memory is independent, the child process p cannot access q inside f().
    p.start()
    print(q.get())
    p.join()

Execution result:

Process Process-1:
Traceback (most recent call last):
  File "D:\python3.6.4\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "D:\python3.6.4\lib\multiprocessing\process.py", line ..., in run
    self._target(*self._args, **self._kwargs)
  File "E:\python\code practice\a3.py", line 7, in f
    q.put([42, None, 'hello'])
NameError: name 'q' is not defined

## The error shows that the child process cannot access the main process's q.
Try passing the thread queue into the child process:

import queue
from multiprocessing import Process

def f(qq):
    qq.put([42, None, 'hello'])

if __name__ == '__main__':
    q = queue.Queue()
    p = Process(target=f, args=(q,))   # pass the parent's q to the child process
    p.start()
    print(q.get())
    p.join()

Execution result:

Traceback (most recent call last):
  File "E:/python/code practice/a3.py", line ..., in <module>
    p.start()
  File "D:\python3.6.4\lib\multiprocessing\process.py", line ..., in start
    self._popen = self._Popen(self)
  File "D:\python3.6.4\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\python3.6.4\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\python3.6.4\lib\multiprocessing\popen_spawn_win32.py", line ..., in __init__
    reduction.dump(process_obj, to_child)
  File "D:\python3.6.4\lib\multiprocessing\reduction.py", line ..., in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects

## This fails because we passed a thread queue to another process, which is not allowed: a thread queue belongs only to the current process and cannot be handed to other processes.
## To pass q to the child process, you must pass in a process queue, not a thread queue.
from multiprocessing import Process, Queue
## Capital-Q Queue is the process queue; lowercase queue.Queue is the thread queue.
## The process Queue must be imported from multiprocessing.

def f(qq):
    qq.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))   # pass the parent's q to the child process
    p.start()
    print(q.get())   # the parent gets the content the child put in
    p.join()

Execution result:

[42, None, 'hello']

## The parent process can now get the content the child put in. On the surface it looks as if the two processes share data, but they do not: the memory spaces of the two processes are still independent. Passing q to the child actually clones a q into the child, so the child has its own process queue. The parent can still get the data the child put in because the child's put pickles (serializes) the data to an intermediate location in memory, and the parent takes the data from that intermediate location rather than directly from the child. So inter-process communication is not shared data but transferred data.
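
The "transfer, not share" point is easy to verify: each process works on its own copy of any object that crossed the queue, so mutating one side never affects the other. A minimal sketch (an addition, not from the original article):

from multiprocessing import Process, Queue

def f(q, data):
    data.append(4)   # mutate the child's own copy of the list
    q.put(data)      # send the child's version back to the parent

if __name__ == '__main__':
    data = [1, 2, 3]
    q = Queue()
    p = Process(target=f, args=(q, data))
    p.start()
    print(q.get())   # [1, 2, 3, 4]  -- the child's copy, transferred back
    p.join()
    print(data)      # [1, 2, 3]     -- the parent's original list is untouched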
Data can also be passed between processes through a pipe:

from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])   # send data to parent_conn
    conn.close()                                 # close the connection after sending

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    ## Pipe() creates the pipe and returns two objects, like the telephones at the two
    ## ends of the line; each object is assigned to its own variable.
    p = Process(target=f, args=(child_conn,))   # child_conn is handed to the other end, used to send data to parent_conn
    p.start()
    print(parent_conn.recv())   # parent_conn stays at this end, used to recv the data
    p.join()

Execution result:

[42, None, 'hello from child1']
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])
    conn.send([42, None, 'hello from child2'])   # send data twice
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    p.join()

Execution result:

[42, None, 'hello from child1']

## This end only received the data once.
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])
    conn.send([42, None, 'hello from child2'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    print(parent_conn.recv())   # receive a second time
    p.join()

Execution result:

[42, None, 'hello from child1']
[42, None, 'hello from child2']

## However many times the other end sends, this end needs to receive that many times.
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])
    conn.send([42, None, 'hello from child2'])   # send data twice
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    print(parent_conn.recv())
    print(parent_conn.recv())   # the other end sends twice, but this end tries to receive three times
    p.join()

Execution result:

[42, None, 'hello from child1']
[42, None, 'hello from child2']

## The program hangs on the third recv() and stays stuck unless the other end sends one more time.
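
If the number of messages is not known in advance, the hang above can be avoided: recv() raises EOFError once every write end of the pipe is closed, so the parent can drain the pipe in a loop. Note the parent must close its own copy of child_conn, otherwise the EOF is never seen. A minimal sketch (an addition, not from the original article):

from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])
    conn.send([42, None, 'hello from child2'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    child_conn.close()   # close the parent's copy of the child end, so EOF can be detected
    try:
        while True:
            print(parent_conn.recv())   # drain messages without knowing the count
    except EOFError:
        pass   # raised once the child has closed its end and the pipe is empty
    p.join()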
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello from child1'])
    conn.send([42, None, 'hello from child2'])   # send data twice
    print(conn.recv())   # receive data
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    print(parent_conn.recv())
    parent_conn.send("data from parent_conn")   # send data
    p.join()

Execution result:

[42, None, 'hello from child1']
[42, None, 'hello from child2']
data from parent_conn

## Both ends can now send and receive through the pipe (data transfer in both directions).
Inter-process data interaction and sharing
from multiprocessing import Process, Manager
import os

def f(d, l):
    d[1] = '1'      # put keys and values into the empty dict
    d['2'] = 2
    d[0.25] = None
    l.append(os.getpid())   # append each process's id to the list; every process has a different id
    print(l)

if __name__ == '__main__':
    with Manager() as manager:   # make an alias, so 'manager' now stands for Manager()
        d = manager.dict()           # create a dict that can be passed between and shared by multiple processes
        l = manager.list(range(5))   # create a list that can be passed between and shared by multiple processes; range(5) pre-fills it with 5 items
        p_list = []
        for i in range(10):          # start 10 processes
            p = Process(target=f, args=(d, l))   # hand the dict and the list to every process; every process can modify them
            p.start()
            p_list.append(p)         # keep every process in the (initially empty) list
        for res in p_list:
            res.join()
        print(d)   # print the dict after all processes have finished
        print(l)   # print the list after all processes have finished

Execution result:

[0, 1, 2, 3, 4, 15788]   # the list starts with the 5 digits 0-4, and each process appends its own pid
[0, 1, 2, 3, 4, 15788, 1568]
[0, 1, 2, 3, 4, 15788, 1568, 7196]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952, 15704]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952, 15704, 14412]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952, 15704, 14412, 5368]
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952, 15704, 14412, 5368, 3092]   # the 10th process prints a list holding all 10 pids
{1: '1', '2': 2, 0.25: None}   # the dict printed at the end
[0, 1, 2, 3, 4, 15788, 1568, 7196, 6544, 9568, 16952, 15704, 14412, 5368, 3092]   # the list printed at the end
from multiprocessing import Process, Manager
import os

def f(d, l):
    d[os.getpid()] = os.getpid()   # small adjustment to the dict: the pid also goes in as key and value
    l.append(os.getpid())
    print(l)

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        l = manager.list(range(5))
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
        print(d)
        print(l)

Execution result:

[0, 1, 2, 3, 4, 2240]
[0, 1, 2, 3, 4, 2240, 10152]
[0, 1, 2, 3, 4, 2240, 10152, 10408]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184, 16168]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184, 16168, 11384]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184, 16168, 11384, 15976]
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184, 16168, 11384, 15976, 16532]
{2240: 2240, 10152: 10152, 10408: 10408, 6312: 6312, 17156: 17156, 6184: 6184, 16168: 16168, 11384: 11384, 15976: 15976, 16532: 16532}
[0, 1, 2, 3, 4, 2240, 10152, 10408, 6312, 17156, 6184, 16168, 11384, 15976, 16532]

## We can now see that data can be shared, modified, and passed between processes.
## Manager() comes with its own locking, which keeps processes from corrupting the data when several of them modify it at the same time.
## The dict and list do not live inside the 10 worker processes: the manager keeps them in a separate server process, and each worker gets a proxy to them, so every modification goes through the manager and each process always sees the latest data.
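
One caveat worth a sketch (an addition, not from the original article): the manager synchronizes each single operation, but a read-modify-write like d['count'] += 1 is two operations (a get and a set), so a shared counter still needs an explicit lock:

from multiprocessing import Process, Manager, Lock

def add(d, lock):
    for _ in range(100):
        with lock:              # guard the read-modify-write as one step
            d['count'] += 1

if __name__ == '__main__':
    lock = Lock()
    with Manager() as manager:
        d = manager.dict(count=0)
        procs = [Process(target=add, args=(d, lock)) for _ in range(10)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(d['count'])       # reliably 1000 with the lock; often less without it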
Process synchronization

Processes also have a lock available (multiprocessing.Lock).

from multiprocessing import Process, Lock   # import the Lock from multiprocessing

def f(l, i):
    l.acquire()     # take the lock before modifying the shared resource
    print('hello world', i)
    l.release()     # release the lock

if __name__ == '__main__':
    lock = Lock()   # instantiate the lock
    for num in range(10):   # start 10 processes
        Process(target=f, args=(lock, num)).start()   # run each child process, passing it the lock and its number

Execution result:

hello world 1
hello world 4
hello world 0
hello world 3
hello world 2
hello world 5
hello world 6
hello world 8
hello world 7
hello world 9

## There are 10 processes in total and the numbers are not consecutive, showing there is no telling which process runs first.
## Data between processes is independent, so why lock here at all? Because all the processes use the same screen for output. While one process is printing "hello world x", it may have output only part of it (say "hello wo") when another process's output cuts in, so you could see something like "hello wohello world0201" on the screen. The lock ensures that only one process at a time outputs to the screen.
Process Pool

When you run multiple processes, each child process copies a complete set of data from the main process. With one or ten processes you may not notice, but with 100, 1000 or even more, the overhead becomes very large and the multi-process program visibly lags.

A process pool sets how many processes may run on the CPU at the same time.

from multiprocessing import Process, Pool   # the Pool is imported from multiprocessing
import time, os

def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())   # print the process id
    return i + 100

def Bar(arg):
    print('-->exec done:', arg)

if __name__ == '__main__':
    ## This guard means the code below only runs when the .py file is executed directly;
    ## if this .py module is imported into another module and run from there, the code
    ## under the guard does not execute. That also makes it a handy place for test code.
    pool = Pool(5)   # only 5 processes may run in the pool at the same time
    for i in range(10):
        ## Create 10 tasks, but because of the pool limit only the 5 placed in the pool run;
        ## the others are suspended, and whenever running ones finish, waiting ones take
        ## their slots.
        # pool.apply_async(func=Foo, args=(i,), callback=Bar)
        pool.apply(func=Foo, args=(i,))   # pool.apply submits the task to the pool
    print('end')    # finished submitting
    pool.close()    # allow the pool's processes to close (close must come before join; think of it as flipping a switch)
    pool.join()     # wait for the processes in the pool to finish before closing; if join is commented out, the program just shuts down

Execution result:

in process 2240
in process 3828
in process 16396
in process 11848
in process 11636
in process 2240
in process 3828
in process 16396
in process 11848
in process 11636
end

## The results are printed one after another, serially: pool.apply executes serially.
from multiprocessing import Process, Pool
import time, os

def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())
    return i + 100

def Bar(arg):
    print('-->exec done:', arg)

if __name__ == '__main__':
    pool = Pool(5)
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,))   # pool.apply_async executes in parallel
    print('end')
    pool.close()
    # pool.join()   # commented out

Execution result:

end

## Only the print('end') code ran. None of the other processes' output appears, because they had not finished when the main process completed pool.close() and exited, which shut them all down. To close only after the other processes have finished, you must use pool.join().
from multiprocessing import Process, Pool
import time, os

def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())
    return i + 100

def Bar(arg):
    print('-->exec done:', arg)

if __name__ == '__main__':
    pool = Pool(5)
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,))
    print('end')
    pool.close()
    pool.join()

Execution result:

end
in process 13272
in process 14472
in process 3724
in process 9072
in process 15068
in process 13272
in process 14472
in process 3724
in process 9072
in process 15068

## From the output, the results come out 5 at a time.
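
Besides printing inside the worker, each task's return value (here i + 100) can be collected from the AsyncResult object that apply_async returns; a minimal sketch (an addition, not from the original article) reusing the same Foo:

from multiprocessing import Pool
import time, os

def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())
    return i + 100

if __name__ == '__main__':
    with Pool(5) as pool:
        results = [pool.apply_async(func=Foo, args=(i,)) for i in range(10)]
        for res in results:
            print(res.get())   # get() blocks until that task's return value is ready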
Callback
from multiprocessing import Process, Pool
import time, os

def Foo(i):
    time.sleep(2)
    print("in process", os.getpid())
    return i + 100

def Bar(arg):
    print('-->exec done:', arg, os.getpid())

if __name__ == '__main__':
    pool = Pool(5)
    print("main process:", os.getpid())   # print the main process id
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,), callback=Bar)
        ## callback means "call back": callback=Bar runs after func=Foo finishes (the
        ## callback fires after each task completes).
        ## Callbacks are good for follow-up work once the main code has run: back something
        ## up after a command finishes, write a log after an operation, and so on.
        ## Backups and logging could also be done inside the child processes, so why use a
        ## callback? With 10 child processes you would connect to the database 10 times, but
        ## the callback runs in the main process, so the database is connected only once,
        ## which greatly improves efficiency.
        ## Connecting through the main process also means the connection is not re-made on
        ## every callback: within one process the existing connection is reused, and the
        ## database caps the maximum connections per process anyway (a limit set on the
        ## database side).
    print('end')
    pool.close()
    pool.join()

Execution result:

main process: 12776   # the main process is 12776
end
in process 7496
-->exec done: 100 12776   # the callback's pid equals the main process's, so callbacks run in the main process
in process 3324
-->exec done: 101 12776
in process 16812
-->exec done: 102 12776
in process 10876
-->exec done: 103 12776
in process 8200
-->exec done: 104 12776
in process 7496
-->exec done: 105 12776
in process 3324
-->exec done: 106 12776
in process 16812
-->exec done: 107 12776
in process 10876
-->exec done: 108 12776
in process 8200
-->exec done: 109 12776
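
In Python 3, apply_async also accepts an error_callback, which runs in the main process when the task raises instead of returning; a minimal sketch (an addition, not from the original article; the failing input is contrived for illustration):

from multiprocessing import Pool

def Foo(i):
    if i == 3:
        raise ValueError('bad input: %s' % i)   # contrived failure
    return i + 100

def Bar(arg):
    print('-->exec done:', arg)

def on_error(exc):
    print('-->task failed:', exc)   # runs in the main process, like a normal callback

if __name__ == '__main__':
    pool = Pool(5)
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,), callback=Bar, error_callback=on_error)
    pool.close()
    pool.join()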
