Two ways to start a thread
# Method 1: pass a target function to Thread
from threading import Thread
import time

def sayhi(name):
    time.sleep(2)
    print('%s say hello' % name)

if __name__ == '__main__':
    t = Thread(target=sayhi, args=('egon',))
    t.start()
    print('main thread')
# Method 2: subclass Thread and override run()
from threading import Thread
import time

class Sayhi(Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        time.sleep(2)
        print('%s say hello' % self.name)

if __name__ == '__main__':
    t = Sayhi('egon')
    t.start()
    print('main thread')
Multiple threads versus multiple child processes under one process
Which starts faster?
from threading import Thread
from multiprocessing import Process

def work():
    print('hello')

if __name__ == '__main__':
    # Start a thread under the main process
    t = Thread(target=work)
    t.start()
    print('main thread/main process')
    # Output:
    #   hello
    #   main thread/main process

    # Start a child process under the main process
    p = Process(target=work)
    p.start()
    print('main thread/main process')
    # Output:
    #   main thread/main process
    #   hello
Comparing PIDs
from threading import Thread
from multiprocessing import Process
import os

def work():
    print('hello', os.getpid())

if __name__ == '__main__':
    # Part 1: start several threads under the main process; each reports the same PID as the main process
    t1 = Thread(target=work)
    t2 = Thread(target=work)
    t1.start()
    t2.start()
    print('main thread/main process pid', os.getpid())

    # Part 2: start several processes; each has a different PID
    p1 = Process(target=work)
    p2 = Process(target=work)
    p1.start()
    p2.start()
    print('main thread/main process pid', os.getpid())
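Because every thread reports the same PID, os.getpid() cannot tell threads apart. Here is a minimal sketch, assuming the standard-library call threading.get_ident() (not used in the original) as the per-thread identifier. The helper name collect_ids and the barrier are illustrative choices.

```python
import os
import threading

def collect_ids(num_threads=3):
    """Return a list of (pid, thread_ident) pairs, one per worker thread."""
    results = []
    results_lock = threading.Lock()
    # The barrier keeps all workers alive at the same time, so one
    # thread's identifier cannot be recycled for another.
    barrier = threading.Barrier(num_threads)

    def work():
        pair = (os.getpid(), threading.get_ident())
        with results_lock:
            results.append(pair)
        barrier.wait()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == '__main__':
    for pid, ident in collect_ids():
        print('pid', pid, 'thread ident', ident)
```

Every pair carries the same PID, while the thread identifiers differ.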
Do threads in the same process share the process's data?
from threading import Thread
from multiprocessing import Process

def work():
    global n
    n = 0

if __name__ == '__main__':
    # n = 100
    # p = Process(target=work)
    # p.start()
    # p.join()
    # print('main', n)  # The child process p set its own global n to 0, but only its own copy; the parent's n is still 100

    n = 1
    t = Thread(target=work)
    t.start()
    t.join()
    print('main', n)  # Prints 0, because threads in the same process share the process's data
Other thread-related methods
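Methods of Thread instances:
# isAlive(): returns whether the thread is alive. Removed in Python 3.9; use is_alive().
# getName(): returns the thread name. Deprecated since Python 3.10 in favor of the name attribute.
# setName(): sets the thread name. Deprecated since Python 3.10 in favor of the name attribute.
Functions provided by the threading module:
# threading.currentThread(): returns the current Thread object. Deprecated alias of threading.current_thread().
# threading.enumerate(): returns a list of the running threads. "Running" means started and not yet finished; threads that have not started or have already terminated are excluded.
# threading.activeCount(): returns the number of running threads, the same value as len(threading.enumerate()). Deprecated alias of threading.active_count().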
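Here is a minimal sketch of the modern spellings of the calls listed above, assuming Python 3 naming (is_alive(), the name attribute, threading.current_thread(), threading.active_count()). The function name inspect_threads, the thread name 'worker-1', and the Event that holds the worker open are illustrative, not from the original.

```python
import threading

def inspect_threads():
    """Start one worker and report what the threading module says about it."""
    release = threading.Event()
    seen = {}

    def work():
        # Inside the worker, current_thread() is the worker's Thread object.
        seen['inside_name'] = threading.current_thread().name
        release.wait()  # Stay alive until the main thread has looked around.

    t = threading.Thread(target=work)
    t.name = 'worker-1'  # Attribute assignment replaces setName()
    seen['before_start'] = t.is_alive()
    t.start()
    seen['while_running'] = t.is_alive()
    seen['listed'] = t in threading.enumerate()
    seen['count_matches'] = (
        threading.active_count() == len(threading.enumerate())
    )
    release.set()
    t.join()
    seen['after_join'] = t.is_alive()
    return seen

if __name__ == '__main__':
    for key, value in inspect_threads().items():
        print(key, value)
```

The thread is listed by threading.enumerate() only between start() and the end of run(), which matches the definition of "running" given above.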
The main thread waits for the child thread to end
from threading import Thread
import time

def sayhi(name):
    time.sleep(2)
    print('%s say hello' % name)

if __name__ == '__main__':
    t = Thread(target=sayhi, args=('egon',))
    t.start()
    t.join()  # Block until t has finished
    print('main thread')
    print(t.is_alive())
    # Output:
    #   egon say hello
    #   main thread
    #   False
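join() also accepts a timeout. Here is a minimal sketch, assuming you only want to wait a bounded time. join(timeout=...) always returns None, so is_alive() is the way to find out whether the thread actually finished. The function name wait_briefly and the durations are illustrative.

```python
import threading
import time

def wait_briefly(work_seconds=0.5, timeout=0.05):
    """Join a worker with a timeout, then join again without one."""
    t = threading.Thread(target=time.sleep, args=(work_seconds,))
    t.start()
    t.join(timeout=timeout)       # Stop waiting after `timeout` seconds.
    still_running = t.is_alive()  # True if the timeout expired first.
    t.join()                      # Now wait for the worker to finish.
    return still_running, t.is_alive()

if __name__ == '__main__':
    print(wait_briefly())
```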
Daemon Threads
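For both processes and threads the rule is the same: a daemon xxx is destroyed once the main xxx has finished running.
Note that "finished running" does not mean "terminated":
#1. For the main process, "finished running" means the main process's own code has finished executing.
#2. For the main thread, "finished running" means every non-daemon thread in the main thread's process has finished; only then does the main thread count as finished.
Detailed explanation:
#1. The main process counts as finished as soon as its code ends (daemon processes are reclaimed at that point). It then keeps waiting until all non-daemon child processes have finished and reclaims their resources (otherwise zombie processes would result) before it actually exits.
#2. The main thread counts as finished only after all other non-daemon threads have finished (daemon threads are reclaimed at that point). The end of the main thread means the end of the process, at which point all of the process's resources are reclaimed, so the process must make sure every non-daemon thread has finished before it ends.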
from threading import Thread
import time

def sayhi(name):
    time.sleep(2)
    print('%s say hello' % name)

if __name__ == '__main__':
    t = Thread(target=sayhi, args=('egon',))
    t.daemon = True  # Must be set before t.start(); replaces the deprecated setDaemon(True)
    t.start()
    print('main thread')
    print(t.is_alive())
    # Output:
    #   main thread
    #   True
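The comment in the example says the daemon flag must be set before start(). Here is a minimal sketch of what happens otherwise: assigning daemon on a thread that is already running raises RuntimeError. The function name set_daemon_late and the Event are illustrative.

```python
import threading

def set_daemon_late():
    """Try to mark a running thread as a daemon; return the error text."""
    release = threading.Event()
    t = threading.Thread(target=release.wait)
    t.start()
    try:
        t.daemon = True  # Too late: the thread is already active.
    except RuntimeError as exc:
        message = str(exc)
    else:
        message = None
    finally:
        release.set()  # Let the worker finish.
        t.join()
    return message

if __name__ == '__main__':
    print(set_daemon_late())
```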
Python GIL (Global Interpreter Lock)
In the CPython interpreter, of the multiple threads started within one process, only one can execute at any given moment, so threads cannot take advantage of multiple cores.
The first thing to make clear is that the GIL is not a feature of the Python language; it is a concept introduced by one implementation of the Python interpreter, CPython. Compare C++: it is a language (syntax) standard, and the same code can be compiled to an executable by different compilers such as GCC, Intel C++, and Visual C++. Python is the same: one piece of code can run on different Python execution environments such as CPython, PyPy, and Psyco, and Jython, for example, has no GIL. But because CPython is the default Python execution environment almost everywhere, many people equate CPython with Python and so take the GIL to be a flaw of the Python language. So to be clear: the GIL is not a feature of Python, and Python does not depend on the GIL at all.
The GIL is essentially a mutex. Like every mutex, it turns concurrent execution into serial execution so that shared data can be modified by only one task at a time, which keeps the data safe.
One thing is certain: to protect different data, you should use different locks.
To understand the GIL, first establish one point: every time you run a Python program, a separate process is created. For example, python test.py, python aaa.py, and python bbb.py produce three different Python processes.
A Python process contains not only the main thread of test.py and any threads that main thread starts, but also interpreter-level threads such as the garbage collection thread started by the interpreter. In short, all of these threads run inside that one process.
If several threads all have target=work, execution proceeds as follows:
Each thread first gains access to the interpreter's code, that is, obtains permission to execute, and then hands the target's code to the interpreter's code to run.
The interpreter's code is shared by all threads, so the garbage collection thread may also gain access to it and run. That leads to a problem: for the same piece of data, say 100, thread 1 may be executing x=100 at the very moment garbage collection is reclaiming 100. There is no clever way around this; the answer is a lock, the GIL, which guarantees that the Python interpreter executes only one task's code at a time.
GIL and Lock
The GIL protects interpreter-level data; protecting your own data is up to you and requires adding a lock of your own.
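Here is a minimal sketch of protecting your own data with your own lock, assuming a shared counter updated by several threads. The GIL does not make the read-modify-write in `state['count'] += 1` atomic, so the update is wrapped in a threading.Lock. The names run_counter, state, and add_many are illustrative.

```python
import threading

def run_counter(num_threads=4, times=10000):
    """Increment a shared counter from several threads and return the total."""
    state = {'count': 0}
    lock = threading.Lock()

    def add_many():
        for _ in range(times):
            # Only one thread at a time may run the read-modify-write below.
            with lock:
                state['count'] += 1

    threads = [threading.Thread(target=add_many) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['count']

if __name__ == '__main__':
    print(run_counter())
```

With the lock in place the total is always num_threads * times. Data that is unrelated to this counter would get a separate lock, so that threads touching one do not block threads touching the other.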
GIL and multithreading
Because of the GIL, only one thread in a given process executes at any moment.
At this point some readers will immediately object: processes can use multiple cores but are expensive, while Python threads are cheap but cannot use multiple cores. So Python is useless, and PHP is the most awesome language?
Not so fast; we are not finished yet.
To answer this, we need to agree on a few points:
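#1. Is the CPU there to do computation, or to do I/O?
#2. Multiple CPUs mean several cores can compute in parallel, so what multiple cores improve is computing performance.
#3. Any CPU that hits blocking I/O still has to wait, so multiple cores do little for I/O operations.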
Think of a worker as a CPU. Computation is the worker working, and blocking I/O is the process of supplying the worker with the raw materials the work needs. If the materials run out while the worker is working, the work has to stop until the materials arrive.
If most of what your factory does is prepare raw materials (I/O-intensive), then having more workers does not help much; you are better off with one worker who does other jobs while waiting for materials.
Conversely, if your factory has all its raw materials on hand, then the more workers it has, the more efficient it is.
Conclusion:
For computation, the more CPUs the better; for I/O, more CPUs do not help.
Of course, running a program on more CPUs will improve its performance to some degree (however small the improvement, there always is one), because almost no program is pure computation or pure I/O. So we can only judge in relative terms whether a program is compute-intensive or I/O-intensive, and from there analyze whether Python multithreading is actually useful.
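# Analysis:
We have four tasks to process and want them handled concurrently. The options are:
Option 1: start four processes
Option 2: start four threads in one process
# On a single core:
If the four tasks are compute-intensive, there are no extra cores for parallel computation, so option 1 only adds the cost of creating processes. Option 2 wins.
If the four tasks are I/O-intensive, option 1 pays a high process-creation cost and processes switch far more slowly than threads. Option 2 wins.
# On multiple cores:
If the four tasks are compute-intensive, multiple cores mean parallel computation, but in Python only one thread per process executes at any moment, so threads cannot use the extra cores. Option 1 wins.
If the four tasks are I/O-intensive, no number of cores solves the I/O wait. Option 2 wins.
# Conclusion: today's computers are almost all multicore. For compute-intensive tasks, Python multithreading brings little performance gain and can even be slower than running serially (which avoids heavy switching). For I/O-intensive tasks, however, it still gives a significant improvement.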
Multithreading performance tests
Compute-intensive: multiprocessing is more efficient
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    res = 0
    for i in range(100000000):
        res *= i

if __name__ == '__main__':
    l = []
    print(os.cpu_count())  # 4 cores on the author's machine
    start = time.time()
    for i in range(4):
        p = Process(target=work)  # Takes a little over 5 s
        # p = Thread(target=work)  # Takes a little over 18 s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop = time.time()
    print('run time is %s' % (stop - start))
I/O-intensive: multithreading is more efficient
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    time.sleep(2)
    print('===>')

if __name__ == '__main__':
    l = []
    print(os.cpu_count())  # 4 cores on the author's machine
    start = time.time()
    for i in range(400):
        # p = Process(target=work)  # Takes a little over 12 s, most of it spent creating processes
        p = Thread(target=work)  # Takes a little over 2 s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop = time.time()
    print('run time is %s' % (stop - start))
Applications:
Multithreading suits I/O-intensive work such as sockets, crawlers, and web services.
Multiprocessing suits compute-intensive work such as financial analysis.
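Here is a minimal sketch of the I/O-intensive case, assuming time.sleep stands in for a blocking network call. It uses concurrent.futures.ThreadPoolExecutor from the standard library, a swapped-in convenience that the original does not use. The names fake_request and fetch_all are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(task_id, delay=0.2):
    """Stand-in for blocking I/O: sleep, then return a result."""
    time.sleep(delay)  # Sleeping releases the GIL, as blocking I/O does.
    return task_id * 2

def fetch_all(task_ids, workers=4):
    """Run fake_request for every task id on a pool of threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as task_ids.
        results = list(pool.map(fake_request, task_ids))
    return results, time.perf_counter() - start

if __name__ == '__main__':
    results, elapsed = fetch_all([1, 2, 3, 4])
    print(results, 'in %.2f s' % elapsed)
```

With four workers the four waits overlap, so the elapsed time is close to a single delay rather than the sum of all four.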
Python Multithreading (Part 2)