This chapter covers:
· Creating and Manipulating Threads
· Synchronizing Threads
· Summary
He draweth out the thread of his verbosity finer than the staple of his argument.
William Shakespeare, Love's Labour's Lost, act 5, scene 1
Threads are sometimes called lightweight processes.
They are nothing more than a way to achieve concurrency without all the
overhead of switching tasks at the operating system level. (Of course,
the computing community isn't in perfect agreement about the definition
of threads; but we won't go into that.)
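Ruby threads are user-level threads and are independent of the operating system. They work on DOS as well as on Unix. There is a definite performance cost, however, and it varies by operating system.
Threads are useful, for example, when separate pieces of code function independently of each other. They are also useful when an application spends much of its time waiting for an event. Often, while one thread is waiting, another can be doing more useful processing.
On the other hand, using threads has potential disadvantages. The decrease in speed has to be weighed against the benefits. Also, where access to a resource is inherently serialized, threads do not help. Finally, the overhead of synchronizing access to global resources can sometimes outweigh the savings from multithreading.
For these and other reasons, some authors claim that threaded programming should be avoided. Truly concurrent code can be complex, error-prone, and hard to debug. But we leave it to you to decide when these techniques are appropriate.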
The difficulties
associated with unsynchronized threads are well-known. Global data can
be corrupted by threads attempting simultaneous access to those data.
Race conditions can occur wherein one thread makes some assumption
about what another has done already; these commonly result in
"non-deterministic" code that might run differently with each
execution. Finally, there is the danger of deadlock, wherein no thread
can continue because it is waiting for a resource held by some other
thread that is also blocked. Code written to avoid these problems is
referred to as thread-safe code.
Not
all of Ruby is thread-safe, but synchronization methods are available
that will enable you to control access to variables and resources,
protect critical sections of code, and avoid deadlock. We will deal
with these techniques in this chapter and give code fragments to
illustrate them.
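I、Creating and Manipulating Threads
The most basic operations on threads include creating a thread, passing information in and out, stopping a thread, and so on. We can also obtain a list of threads, check the state of a thread, and check various other information. We give an overview of these basic operations here.
1、Creating Threads
Creating a thread is easy. We call the new method and attach a block that will serve as the body of the thread.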
thread = Thread.new do
# Statements comprising
# the thread...
end
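The returned value is obviously an object of type Thread, which the main thread uses to control the thread it has created.
What if we want to pass parameters into a thread? We can do so by passing them to Thread.new, which in turn passes them into the block.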
a = 4
b = 5
c = 6
thread2 = Thread.new(a,b,c) do |a, x, y|
# Manipulate a, x, and y as needed.
end
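# Note: in Ruby 1.8 and earlier, the block parameter a was the same
# variable as the outer a, so changing it in the new thread changed
# it suddenly and without warning in the main thread. Since Ruby 1.9,
# block parameters are always local to the block.
As with any other block parameters, in those older versions any parameter whose name matches an existing variable is effectively identical to that variable. The variable a in the preceding fragment is dangerous in this sense, as the comment points out.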
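One situation where passing arguments through Thread.new matters is a loop whose variable keeps changing while the threads run. The following is a minimal sketch, not taken from the original text: the for loop shares a single variable i across iterations, and handing i to Thread.new gives each thread its own copy as the block parameter n. The names threads and values are illustrative.

```ruby
threads = []
for i in 1..3                 # for does not open a new scope, so i is shared
  threads << Thread.new(i) do |n|
    sleep 0.01                # by now i may already have moved on
    n * 10                    # n still holds the value passed at creation
  end
end
values = threads.map(&:value) # value waits for each thread and returns its result
```

Here values ends up as [10, 20, 30]; reading i directly inside the block could instead see whatever value the loop has reached by then.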
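Threads may also access variables from the scope in which they were created. Obviously, without synchronization, this can be a problem. The main thread and one or more other threads might modify such a variable independently of each other, and the result may be unpredictable.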
x = 1
y = 2
thread3 = Thread.new do
# This thread can manipulate x and y from the outer scope,
# but this is not always safe.
sleep(rand(0)) # Sleep a random fraction of a second.
x = 3
end
sleep(rand(0))
puts x
# Running this code repeatedly, x may be 1 or 3 when it
# is printed here!
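If the main thread needs the value that the other thread assigns, one option is to wait for that thread before reading the variable. This sketch uses join, which is described later in this chapter; the names writer and result are illustrative.

```ruby
x = 1
writer = Thread.new do
  sleep(rand / 100)   # sleep a random fraction of a second
  x = 3
end
writer.join           # wait until the thread has finished
result = x            # the assignment inside the thread has completed
```

With the join in place, result is 3 on every run.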
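The method fork behaves essentially like new (strictly speaking, it is an alias for start); the name comes from the well-known Unix system call of the same name.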
2、Accessing Thread-local Variables
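We know that it can be dangerous for a thread to use variables from outside its scope; we also know that a thread can have local data of its own. But how can a thread make some of its own data public?
There is a special mechanism for this purpose. If a thread object is treated as a hash, thread-local data can be accessed from anywhere within the scope of that thread object. We don't mean that actual local variables can be accessed in this way, but only that we have access to named data on a per-thread basis.
There is also a method called key? that tells us whether a given name is in use for this thread.
Within the thread, we must refer to the data in this hash-like way. Using Thread.current makes this a little easier.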
thread = Thread.new do
t = Thread.current
t[:var1] = "This is a string"
t[:var2] = 365
end
# Wait for the thread to finish, then access its thread-local data from outside...
thread.join
x = thread[:var1] # "This is a string"
y = thread[:var2] # 365
has_var2 = thread.key?("var2") # true
has_var3 = thread.key?("var3") # false
Note that these data are accessible from other threads even after the thread that owned them is dead (as in this example).
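To illustrate that the named data is kept per thread, here is a small sketch in which two threads store values under the same names and the current thread stores nothing. The names workers, names, sizes, and main_has_name are illustrative.

```ruby
workers = %w[alpha beta].map do |name|
  Thread.new do
    t = Thread.current
    t[:name] = name            # stored on this thread only
    t[:size] = name.length
  end
end
workers.each(&:join)

names = workers.map { |w| w[:name] }       # read each thread's own data
sizes = workers.map { |w| w["size"] }      # a string key names the same entry
main_has_name = Thread.current.key?(:name) # the current thread never set :name
```

Each thread object reports its own values, while the thread running this code has no entry called :name.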
Besides a symbol (as you just saw), we can also use a string to identify a thread-local variable.
thread = Thread.new do
t = Thread.current
t["var3"] = 25
t[:var4] = "foobar"
end
thread.join
a = thread[:var3] # 25
b = thread["var4"] # "foobar"
Don't confuse these special names with actual local variables.
thread = Thread.new do
t = Thread.current
t["var3"] = 25
t[:var4] = "foobar"
var3 = 99 # True local variables (not
var4 = "zorch" # accessible from outside)
end
thread.join
a = thread[:var3] # 25
b = thread["var4"] # "foobar"
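Finally, note that an object reference (to a true local variable) can be used as a sort of shorthand within the thread. This holds as long as you are careful to preserve the same object reference rather than create a new one.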
thread = Thread.new do
t = Thread.current
x = "nXxeQPdMdxiBAxh"
t[:my_message] = x
x.reverse!
x.delete! "x"
x.gsub!(/[A-Z]/,"")
# On the other hand, assignment would create a new
# object and make this shorthand useless...
end
thread.join
a = thread[:my_message] # "hidden"
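Also, this shorthand obviously won't work when you are dealing with values such as small integers (Fixnum in older Ruby versions), because they are stored as immediate values rather than as object references.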
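The difference between modifying an object in place and rebinding a variable can be sketched as follows. The names count, list, count_seen, and list_seen are illustrative and not part of the original text.

```ruby
thread = Thread.new do
  t = Thread.current
  count = 0
  t[:count] = count
  count += 5        # rebinds count to a new value; t[:count] is unchanged
  list = []
  t[:list] = list
  list << 5         # modifies the same object that t[:list] refers to
end
thread.join
count_seen = thread[:count]
list_seen  = thread[:list]
```

After the thread finishes, count_seen is still 0, whereas list_seen is [5] because the array was changed in place.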
3、Querying and Changing Thread Status
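The Thread class has several class methods that serve various purposes. The list method returns an array of all living threads, the main method returns a reference to the main thread (the one that spawns the others), and the current method allows a thread to find its own identity.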
t1 = Thread.new { sleep 100 }
t2 = Thread.new do
if Thread.current == Thread.main
puts "This is the main thread." # Does NOT print
end
1.upto(1000) do
sleep 0.1
end
end
count = Thread.list.size # 3
if Thread.list.include?(Thread.main)
puts "Main thread is alive." # Always prints!
end
if Thread.current == Thread.main
puts "I'm the main thread." # Prints here...
end
The exit, pass, start, stop, and kill methods are used to control the execution of threads (often from inside or outside).
# In the main thread...
Thread.kill(t1) # Kill this thread now
Thread.pass # Yield control to the scheduler (pass takes no argument)
t3 = Thread.new do
sleep 20
Thread.exit # Exit the thread
puts "Can't happen!" # Never reached
end
Thread.kill(t2) # Now kill t2
# Now exit the main thread (killing any others)
Thread.exit
Note that there is no instance method stop, so a thread can stop itself but cannot stop another thread.
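As a small sketch of controlling a thread from outside, the following kills a sleeping thread with the instance method kill and checks that it is no longer alive. The polling loop is only an illustrative way to wait until the thread is actually asleep, and the variable names are assumptions.

```ruby
sleeper = Thread.new { sleep }             # sleep with no argument waits indefinitely
sleep 0.01 until sleeper.status == "sleep" # poll until the thread is asleep
alive_before = sleeper.alive?              # true
sleeper.kill                               # same effect as Thread.kill(sleeper)
sleeper.join                               # wait for the thread to finish terminating
alive_after = sleeper.alive?               # false
```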
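There are also various methods for checking the state of a thread. The instance method alive? tells whether the thread is still alive (that is, has not exited), and stop? tells whether the thread is in a stopped state.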
count = 0
t1 = Thread.new { loop { count += 1 } }
t2 = Thread.new { Thread.stop }
sleep 1
flags = [t1.alive?, # true
t1.stop?, # false
t2.alive?, # true
t2.stop?] # true
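The status of a thread can be seen with the status method. The value returned is "run" if the thread is currently running; "sleep" if it is stopped, sleeping, or waiting on I/O; false if it terminated normally; and nil if it terminated with an exception.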
t1 = Thread.new { loop {} }
t2 = Thread.new { sleep 5 }
t3 = Thread.new { Thread.stop }
t4 = Thread.new { Thread.exit }
t5 = Thread.new { raise "exception" }
s1 = t1.status # "run"
s2 = t2.status # "sleep"
s3 = t3.status # "sleep"
s4 = t4.status # false
s5 = t5.status # nil
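The status values can also be observed without relying on timing. The sketch below polls until a thread has really stopped, then lets it finish, and separately checks a thread that dies with an exception. Setting report_on_exception to false only suppresses the error report in newer Ruby versions; all variable names are illustrative.

```ruby
t = Thread.new { Thread.stop; :done }
sleep 0.01 until t.status == "sleep"   # wait until it has actually stopped
before = t.status                      # "sleep"
t.run
t.join
after = t.status                       # false: terminated normally

failing = Thread.new do
  Thread.current.report_on_exception = false
  raise "boom"
end
begin
  failing.join                         # join re-raises the thread's exception
rescue RuntimeError
  # ignored here; we only want the final status
end
failed = failing.status                # nil: terminated with an exception
```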
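The global variable $SAFE may be set differently in different threads. In this sense, it is not truly a global variable at all; but we should not complain, because this allows us to run threads at different levels of safety. The safe_level method tells us at what level a thread is running. (This describes older Ruby versions; in Ruby 3.0 and later, $SAFE is an ordinary global variable with no special behavior.)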
t1 = Thread.new { $SAFE = 1; sleep 5 }
t2 = Thread.new { $SAFE = 3; sleep 5 }
sleep 1
lev0 = Thread.main.safe_level # 0
lev1 = t1.safe_level # 1
lev2 = t2.safe_level # 3
The priority of a thread may be examined and changed using the priority accessor.
t1 = Thread.new { loop { sleep 1 } }
t2 = Thread.new { loop { sleep 1 } }
t2.priority = 3 # Set t2 at priority 3
p1 = t1.priority # 0
p2 = t2.priority # 3
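A thread with a higher priority will be scheduled more often.
When a thread wants to yield control to the scheduler, it can use the special method pass. The thread merely gives up its time slice; it does not really stop or go to sleep.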
t1 = Thread.new do
puts "alpha"
Thread.pass
puts "beta"
end
t2 = Thread.new do
puts "gamma"
puts "delta"
end
t1.join
t2.join
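In this contrived example, we get output in the order alpha gamma delta beta when Thread.pass is called as shown. Without it, we get alpha beta gamma delta. (The exact order depends on the scheduler and is not guaranteed.) Of course, this feature should not be used for synchronization, but only for the thrifty allocation of time slices.
A thread that is stopped may be awakened with the run or wakeup method: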
t1 = Thread.new do
Thread.stop
puts "There is an emerald here."
end
t2 = Thread.new do
Thread.stop
puts "You're at Y2."
end
sleep 1
t1.wakeup
t2.run
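The difference between them is subtle. The wakeup call changes the state of the thread so that it is runnable but does not schedule it to run; run, on the other hand, wakes up the thread and schedules it for immediate running.
In this particular case, the result is that t1 wakes up before t2, but t2 gets scheduled first, producing the following output: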
You're at Y2.
There is an emerald here.
Of course, it would be unwise to attempt real synchronization by depending on this subtlety.
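Rather than sleeping for a fixed second and hoping the thread has stopped, a program can check the thread's state before waking it. This is only a sketch; the polling interval and the names napper and result are assumptions.

```ruby
napper = Thread.new do
  Thread.stop
  :awake
end
sleep 0.01 until napper.stop?   # poll until the thread has really stopped
napper.wakeup                   # mark it runnable again
result = napper.value           # wait for it to finish and fetch its result
```

The thread resumes after Thread.stop and result becomes :awake.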
The raise instance method raises an exception in the thread specified as the receiver. (The call does not have to originate within that thread.)
factorial1000 = Thread.new do
begin
prod = 1
1.upto(1000) { |n| prod *= n }
puts "1000! = #{prod}"
rescue
# Do nothing...
end
end
sleep 0.01 # Your mileage may vary.
if factorial1000.alive?
factorial1000.raise("Stop!")
puts "Calculation was interrupted!"
else
puts "Calculation was successful."
end
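The thread spawned previously tries to calculate the factorial of 1000; if it has not finished within a hundredth of a second, the main thread interrupts it by raising an exception in it. Thus, on a relatively slow machine, the message printed by this code fragment will be Calculation was interrupted!. As for the rescue clause inside the thread, we could obviously have put any code there that we wanted, as with any other such clause.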
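To show a rescue clause that does something with the exception, here is a minimal sketch in which the interrupted thread reports what happened through its return value. The thread simply sleeps instead of computing, and the names worker and outcome are illustrative.

```ruby
worker = Thread.new do
  begin
    sleep                         # wait until someone interrupts us
    :finished
  rescue RuntimeError => e
    "interrupted: #{e.message}"   # becomes the thread's value
  end
end
sleep 0.01 until worker.status == "sleep"
worker.raise("Stop!")             # raises RuntimeError inside worker
outcome = worker.value
```

Because raise with a string produces a RuntimeError, the rescue clause runs and outcome is "interrupted: Stop!".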
4、Achieving a Rendezvous (and Capturing a Return Value)
Sometimes the main thread wants to wait for another thread to finish. The instance method join accomplishes this.
t1 = Thread.new { do_something_long() }
do_something_brief()
t1.join # Wait for t1
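Note that a join is necessary if the main thread is to wait on another thread. Otherwise, the thread is killed when the main thread exits. For example, this fragment would never give us its final answer without the join at the end: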
meaning_of_life = Thread.new do
puts "The answer is..."
sleep 10
puts 42
end
sleep 9
meaning_of_life.join
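The join method also accepts an optional time limit in seconds, which is useful when the main thread should not wait forever. The sketch below is illustrative only; the durations and variable names are assumptions.

```ruby
slow = Thread.new { sleep 0.2; :done }
first  = slow.join(0.01)   # gives up after 0.01 seconds and returns nil
second = slow.join         # waits for completion and returns the thread
```

A nil result from join with a limit means the thread was still running when the limit expired.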
Here is a useful little idiom. It calls join on every living thread except the main one. (It is an error for any thread, even the main thread, to call join on itself.)
Thread.list.each { |t| t.join if t != Thread.main }
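It is, of course, possible for one thread to do a join on another when neither is the main thread. If the main thread and another thread attempt to join each other, a deadlock results; the interpreter detects this case and exits the program.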
thr = Thread.new { sleep 1; Thread.main.join }
thr.join # Deadlock results!
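A thread has an associated block, and a block can have a return value. This implies that a thread can return a value. The value method implicitly does a join operation and waits for the thread to complete; it then returns the value of the last expression evaluated in the thread.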
max = 10000
thr = Thread.new do
sum = 0
1.upto(max) { |i| sum += i }
sum
end
guess = (max*(max+1))/2
print "Formula is "
if guess == thr.value
puts "right."
else
puts "wrong."
end
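The same idea extends to several threads at once: each thread computes a partial result, and the main thread collects the results with value. This is a sketch under the assumption that the work splits cleanly into ranges; the names ranges, threads, total, and formula are illustrative.

```ruby
ranges = [1..100, 101..200, 201..300]
threads = ranges.map do |range|
  Thread.new(range) { |r| r.sum }   # each thread sums its own range
end
total = threads.sum(&:value)        # value joins each thread, then returns its sum
formula = (300 * 301) / 2           # closed form for 1 + 2 + ... + 300
```

Both total and formula come out to 45150.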
5、Dealing with Exceptions
What happens if an exception occurs within a thread? As it turns out, the behavior is configurable.