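A Brief Look at Python Multithreading
Today I read a few blog posts that walk through threading examples and how to avoid races between threads. I found them very useful, so I am writing the key points down here for my own future reference.
Example 1: requesting three different URLs
1. Single-threaded: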
import time
from urllib.request import urlopen


def get_responses():
    urls = [
        'http://www.baidu.com',
        'http://www.taobao.com',
        'http://www.alibaba.com',
    ]
    start = time.time()
    for url in urls:
        print(url)
        resp = urlopen(url)
        print(resp.getcode())  # get the HTTP status code
    print("spent time:%s" % (time.time()-start))

get_responses()
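Explanation:
The URLs are requested one after another.
The program does not request the next URL until it has received the response for the current one.
Network requests take a relatively long time, so the CPU sits idle while it waits for each request to return.
Output: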
http://www.baidu.com
200
http://www.taobao.com
200
http://www.alibaba.com
200
spent time:1.1927924156188965
2. Multithreaded:
from urllib.request import urlopen
import time
from threading import Thread


class GetUrlThread(Thread):
    def __init__(self, url):
        self.url = url
        super(GetUrlThread, self).__init__()

    def run(self):
        resp = urlopen(self.url)
        print(self.url, resp.getcode())


def get_responses():
    urls = [
        'http://www.baidu.com',
        'http://www.taobao.com',
        'http://www.alibaba.com',
    ]
    start = time.time()
    threads = []
    for url in urls:
        t = GetUrlThread(url)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print("spent time:%s" % (time.time()-start))

get_responses()
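Explanation:
Note the improvement in execution time.
We wrote a multithreaded program to cut down the time the CPU spends waiting: while one thread waits for its network request to return, the CPU can switch to another thread and issue that thread's request.
We want each thread to handle one URL, so we pass a URL when instantiating the thread class.
Running a thread means executing the run() method of the class.
Every thread must execute its run() method.
We create one thread per URL and call start() on it, which tells the CPU that it may now execute the thread's run() method.
We want to compute the elapsed time only after all threads have finished, so we call join().
join() makes the main thread wait until that thread has finished before it executes its next instruction.
We call join() on every thread, so the elapsed time is computed after all threads have completed.
About threads:
The CPU may not execute run() immediately after start() is called.
You cannot predict the order in which run() executes across different threads.
Within a single thread, the statements in run() are guaranteed to execute in order.
That is why, within each thread, the URL is requested first and the result is printed afterwards.
Output: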
http://www.baidu.com 200
http://www.alibaba.com 200
http://www.taobao.com 200
spent time:0.6294200420379639
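The timings above depend on network conditions, so they are hard to reproduce exactly. As a rough, network-free sketch of the same idea, the snippet below uses time.sleep to stand in for a slow request. The names fake_fetch, run_sequential and run_threaded, and the 0.2-second delay, are assumptions made up for illustration; they are not part of the original program.

```python
import time
from threading import Thread

DELAY = 0.2  # assumed stand-in for one network round trip, in seconds


def fake_fetch(url, results):
    """Pretend to fetch a URL: sleep instead of doing real network I/O."""
    time.sleep(DELAY)
    results.append(url)  # list.append is safe to call from several threads in CPython


def run_sequential(urls):
    """Fetch the URLs one after another and return (elapsed seconds, results)."""
    results = []
    start = time.time()
    for url in urls:
        fake_fetch(url, results)
    return time.time() - start, results


def run_threaded(urls):
    """Fetch each URL in its own thread and return (elapsed seconds, results)."""
    results = []
    start = time.time()
    threads = [Thread(target=fake_fetch, args=(url, results)) for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before stopping the clock
    return time.time() - start, results


urls = ['http://www.baidu.com', 'http://www.taobao.com', 'http://www.alibaba.com']
seq_time, seq_results = run_sequential(urls)
thr_time, thr_results = run_threaded(urls)
print("sequential: %.2fs, threaded: %.2fs" % (seq_time, thr_time))
```

With three simulated requests the sequential version should take roughly three delays and the threaded version roughly one, because the waits overlap.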
Example 2: thread safety of a global variable (race condition)
1. Buggy version:
from threading import Thread
import time

# define a global variable
some_var = 0


class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable
        # and then increment it
        global some_var
        read_var = some_var
        print("some_var in %s is %d" % (self.name, read_var))
        time.sleep(0.1)
        some_var = read_var + 1
        print("some_var in %s is %d" % (self.name, some_var))


def use_increment_thread():
    threads = []
    for i in range(50):
        t = IncrementThread()
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print("After 50 modifications, some_var should have become 50")
    print("After 50 modifications, some_var is %d" % some_var)

use_increment_thread()
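Explanation:
There is one global variable, and every thread wants to modify it.
Each thread is supposed to add 1 to this global variable.
With 50 threads the final value should be 50, but it is not.
Why did it not reach 50?
Suppose thread t1 reads some_var when it is 15, and at that moment the CPU hands control to another thread, t2.
t2 also reads some_var as 15.
Both t1 and t2 then set some_var to 16.
What we expected was for t1 and t2 together to add 2, bringing some_var to 17.
This is a race condition.
The same thing can happen between other threads, which is why the final result is less than 50.
Output (in this run the 0.1-second sleep let all 50 threads read 0 before any of them wrote, so the final value is only 1):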
some_var in Thread-1 is 0
some_var in Thread-2 is 0
some_var in Thread-3 is 0
some_var in Thread-4 is 0
some_var in Thread-5 is 0
some_var in Thread-6 is 0
some_var in Thread-7 is 0
some_var in Thread-8 is 0
some_var in Thread-9 is 0
some_var in Thread-10 is 0
some_var in Thread-11 is 0
some_var in Thread-12 is 0
some_var in Thread-13 is 0
some_var in Thread-14 is 0
some_var in Thread-15 is 0
some_var in Thread-16 is 0
some_var in Thread-17 is 0
some_var in Thread-18 is 0
some_var in Thread-19 is 0
some_var in Thread-20 is 0
some_var in Thread-21 is 0
some_var in Thread-22 is 0
some_var in Thread-23 is 0
some_var in Thread-24 is 0
some_var in Thread-25 is 0
some_var in Thread-26 is 0
some_var in Thread-27 is 0
some_var in Thread-28 is 0
some_var in Thread-29 is 0
some_var in Thread-30 is 0
some_var in Thread-31 is 0
some_var in Thread-32 is 0
some_var in Thread-33 is 0
some_var in Thread-34 is 0
some_var in Thread-35 is 0
some_var in Thread-36 is 0
some_var in Thread-37 is 0
some_var in Thread-38 is 0
some_var in Thread-39 is 0
some_var in Thread-40 is 0
some_var in Thread-41 is 0
some_var in Thread-42 is 0
some_var in Thread-43 is 0
some_var in Thread-44 is 0
some_var in Thread-45 is 0
some_var in Thread-46 is 0
some_var in Thread-47 is 0
some_var in Thread-48 is 0
some_var in Thread-49 is 0
some_var in Thread-50 is 0
some_var in Thread-6 is 1
some_var in Thread-5 is 1
some_var in Thread-2 is 1
some_var in Thread-4 is 1
some_var in Thread-1 is 1
some_var in Thread-3 is 1
some_var in Thread-12 is 1
some_var in Thread-13 is 1
some_var in Thread-11 is 1
some_var in Thread-10 is 1
some_var in Thread-9 is 1
some_var in Thread-7 is 1
some_var in Thread-8 is 1
some_var in Thread-21 is 1
some_var in Thread-20 is 1
some_var in Thread-19 is 1
some_var in Thread-18 is 1
some_var in Thread-17 is 1
some_var in Thread-15 is 1
some_var in Thread-14 is 1
some_var in Thread-16 is 1
some_var in Thread-26 is 1
some_var in Thread-25 is 1
some_var in Thread-24 is 1
some_var in Thread-22 is 1
some_var in Thread-23 is 1
some_var in Thread-31 is 1
some_var in Thread-29 is 1
some_var in Thread-28 is 1
some_var in Thread-27 is 1
some_var in Thread-30 is 1
some_var in Thread-38 is 1
some_var in Thread-37 is 1
some_var in Thread-36 is 1
some_var in Thread-35 is 1
some_var in Thread-32 is 1
some_var in Thread-33 is 1
some_var in Thread-34 is 1
some_var in Thread-44 is 1
some_var in Thread-43 is 1
some_var in Thread-42 is 1
some_var in Thread-41 is 1
some_var in Thread-40 is 1
some_var in Thread-39 is 1
some_var in Thread-50 is 1
some_var in Thread-49 is 1
some_var in Thread-48 is 1
some_var in Thread-47 is 1
some_var in Thread-45 is 1
some_var in Thread-46 is 1
After 50 modifications, some_var should have become 50
After 50 modifications, some_var is 1
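The lost update described in the explanation above (several threads read the same old value before any of them writes) normally depends on timing. As a minimal sketch, it can be forced deterministically with threading.Barrier, which holds every thread after its read until all of them have read. The names counter, unsafe_increment and run_unsafe, and the thread count of 5, are assumptions chosen for illustration.

```python
from threading import Barrier, Thread

N_THREADS = 5  # assumed small thread count for illustration
counter = 0
all_have_read = Barrier(N_THREADS)


def unsafe_increment():
    global counter
    read_value = counter       # every thread reads the old value...
    all_have_read.wait()       # ...and waits until all the others have read it too
    counter = read_value + 1   # each thread then writes the same stale result


def run_unsafe():
    """Run N_THREADS unsynchronized increments and return the final counter."""
    global counter
    counter = 0
    threads = [Thread(target=unsafe_increment) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_unsafe())  # prints 1, not 5
```

Because no write can happen until all five reads have completed, every thread computes 0 + 1, which mirrors the final value of 1 in the output above.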
2. Version that resolves the race with a lock:
from threading import Lock, Thread
import time

lock = Lock()
some_var = 0


class IncrementThread(Thread):
    def run(self):
        # we want to read a global variable
        # and then increment it
        global some_var
        lock.acquire()
        read_value = some_var
        print("some_var in %s is %d" % (self.name, read_value))
        time.sleep(0.1)
        some_var = read_value + 1
        print("some_var in %s after increment is %d" % (self.name, some_var))
        lock.release()


def use_increment_thread():
    threads = []
    for i in range(50):
        t = IncrementThread()
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print("After 50 modifications, some_var should have become 50")
    print("After 50 modifications, some_var is %d" % (some_var,))

use_increment_thread()
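Explanation:
The Lock is used to prevent the race condition.
If thread t1 acquires the lock before performing some operations, no other thread can perform those operations until t1 releases the Lock.
What we want to guarantee is that once t1 has read some_var, no other thread can read some_var until t1 has finished modifying it.
This makes reading and modifying some_var a logically atomic operation.
Output: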
some_var in Thread-1 is 0
some_var in Thread-1 after increment is 1
some_var in Thread-2 is 1
some_var in Thread-2 after increment is 2
some_var in Thread-3 is 2
some_var in Thread-3 after increment is 3
some_var in Thread-4 is 3
some_var in Thread-4 after increment is 4
some_var in Thread-5 is 4
some_var in Thread-5 after increment is 5
some_var in Thread-6 is 5
some_var in Thread-6 after increment is 6
some_var in Thread-7 is 6
some_var in Thread-7 after increment is 7
some_var in Thread-8 is 7
some_var in Thread-8 after increment is 8
some_var in Thread-9 is 8
some_var in Thread-9 after increment is 9
some_var in Thread-10 is 9
some_var in Thread-10 after increment is 10
some_var in Thread-11 is 10
some_var in Thread-11 after increment is 11
some_var in Thread-12 is 11
some_var in Thread-12 after increment is 12
some_var in Thread-13 is 12
some_var in Thread-13 after increment is 13
some_var in Thread-14 is 13
some_var in Thread-14 after increment is 14
some_var in Thread-15 is 14
some_var in Thread-15 after increment is 15
some_var in Thread-16 is 15
some_var in Thread-16 after increment is 16
some_var in Thread-17 is 16
some_var in Thread-17 after increment is 17
some_var in Thread-18 is 17
some_var in Thread-18 after increment is 18
some_var in Thread-19 is 18
some_var in Thread-19 after increment is 19
some_var in Thread-20 is 19
some_var in Thread-20 after increment is 20
some_var in Thread-21 is 20
some_var in Thread-21 after increment is 21
some_var in Thread-22 is 21
some_var in Thread-22 after increment is 22
some_var in Thread-23 is 22
some_var in Thread-23 after increment is 23
some_var in Thread-24 is 23
some_var in Thread-24 after increment is 24
some_var in Thread-25 is 24
some_var in Thread-25 after increment is 25
some_var in Thread-26 is 25
some_var in Thread-26 after increment is 26
some_var in Thread-27 is 26
some_var in Thread-27 after increment is 27
some_var in Thread-28 is 27
some_var in Thread-28 after increment is 28
some_var in Thread-29 is 28
some_var in Thread-29 after increment is 29
some_var in Thread-30 is 29
some_var in Thread-30 after increment is 30
some_var in Thread-31 is 30
some_var in Thread-31 after increment is 31
some_var in Thread-32 is 31
some_var in Thread-32 after increment is 32
some_var in Thread-33 is 32
some_var in Thread-33 after increment is 33
some_var in Thread-34 is 33
some_var in Thread-34 after increment is 34
some_var in Thread-35 is 34
some_var in Thread-35 after increment is 35
some_var in Thread-36 is 35
some_var in Thread-36 after increment is 36
some_var in Thread-37 is 36
some_var in Thread-37 after increment is 37
some_var in Thread-38 is 37
some_var in Thread-38 after increment is 38
some_var in Thread-39 is 38
some_var in Thread-39 after increment is 39
some_var in Thread-40 is 39
some_var in Thread-40 after increment is 40
some_var in Thread-41 is 40
some_var in Thread-41 after increment is 41
some_var in Thread-42 is 41
some_var in Thread-42 after increment is 42
some_var in Thread-43 is 42
some_var in Thread-43 after increment is 43
some_var in Thread-44 is 43
some_var in Thread-44 after increment is 44
some_var in Thread-45 is 44
some_var in Thread-45 after increment is 45
some_var in Thread-46 is 45
some_var in Thread-46 after increment is 46
some_var in Thread-47 is 46
some_var in Thread-47 after increment is 47
some_var in Thread-48 is 47
some_var in Thread-48 after increment is 48
some_var in Thread-49 is 48
some_var in Thread-49 after increment is 49
some_var in Thread-50 is 49
some_var in Thread-50 after increment is 50
After 50 modifications, some_var should have become 50
After 50 modifications, some_var is 50
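The same read-then-write protection can also be written with the lock's context-manager form, which releases the lock even if the protected code raises an exception. The following is only a sketch of that variant, not the original author's code; the names counter, safe_increment and run_safe are made up for illustration, and the prints and the sleep are left out to keep it short.

```python
from threading import Lock, Thread

lock = Lock()
counter = 0


def safe_increment():
    global counter
    with lock:                    # acquired here, released when the block exits
        read_value = counter      # read...
        counter = read_value + 1  # ...and write while no other thread can interleave


def run_safe(n_threads=50):
    """Run n_threads locked increments and return the final counter."""
    global counter
    counter = 0
    threads = [Thread(target=safe_increment) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_safe())  # prints 50
```

With explicit acquire() and release() calls, an exception between the two would leave the lock held and block every other thread; the with block avoids that.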
Example 3: atomic operations in a multithreaded environment
1. Buggy version:
from threading import Thread
import time


class CreateListThread(Thread):
    def run(self):
        self.entries = []
        for i in range(10):
            # time.sleep(0.1)
            self.entries.append(i)
        for each in self.entries:
            print(each, end = " ")
            time.sleep(0.1)


def use_create_list_thread():
    for i in range(3):
        t = CreateListThread()
        t.start()

use_create_list_thread()
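Explanation:
While one thread is printing, the CPU switches to another thread, so the output of the three threads is interleaved. We need to make printing self.entries a logically atomic operation so that other threads cannot interrupt it.
Printing is normally too fast for the problem to show, so I deliberately stretched it out by adding a time.sleep(0.1).
Output: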
0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 9
2. Using a lock to make the operation atomic:
from threading import Thread, Lock
import time

lock = Lock()


class CreateListThread(Thread):
    def run(self):
        self.entries = []
        for i in range(10):
            time.sleep(0.1)
            self.entries.append(i)
        lock.acquire()
        for each in self.entries:
            print(each, end = " ")
            time.sleep(0.1)
        lock.release()


def use_create_list_thread():
    for i in range(3):
        t = CreateListThread()
        t.start()

use_create_list_thread()
Output:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
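Holding the lock across the whole printing loop keeps each list together, but it also keeps other threads waiting for the entire loop. One possible alternative, sketched below under assumed names (print_lock and printed_lines are hypothetical additions), is to build the full line first and hold the lock only for a single print call. Unlike the output above, each thread's list then appears on its own line.

```python
from threading import Lock, Thread

print_lock = Lock()
printed_lines = []  # hypothetical record of what was printed, kept for inspection


class CreateListThread(Thread):
    def run(self):
        self.entries = list(range(10))
        # Build the complete line before taking the lock.
        line = " ".join(str(each) for each in self.entries)
        with print_lock:  # held only for one print call
            print(line)
            printed_lines.append(line)


def use_create_list_thread(n_threads=3):
    threads = [CreateListThread() for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait so that all output is complete before returning


use_create_list_thread()
```

Each thread contributes one intact line containing 0 through 9; the order in which the threads print is still not guaranteed.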