Test code:
```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

const COUNT = 1000000

// bench1: write and read COUNT ints through a buffered channel.
func bench1(ch chan int) time.Duration {
	t := time.Now()
	for i := 0; i < COUNT; i++ {
		ch <- i
	}
	var v int
	for i := 0; i < COUNT; i++ {
		v = <-ch
	}
	_ = v
	return time.Now().Sub(t)
}

// bench2: write and read COUNT ints directly through a slice.
func bench2(s []int) time.Duration {
	t := time.Now()
	for i := 0; i < COUNT; i++ {
		s[i] = i
	}
	var v int
	for i := 0; i < COUNT; i++ {
		v = s[i]
	}
	_ = v
	return time.Now().Sub(t)
}

// bench3: same as bench2, but every access is guarded by a mutex.
func bench3(s []int, mutex *sync.Mutex) time.Duration {
	t := time.Now()
	for i := 0; i < COUNT; i++ {
		mutex.Lock()
		s[i] = i
		mutex.Unlock()
	}
	var v int
	for i := 0; i < COUNT; i++ {
		mutex.Lock()
		v = s[i]
		mutex.Unlock()
	}
	_ = v
	return time.Now().Sub(t)
}

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU())
	ch := make(chan int, COUNT)
	s := make([]int, COUNT)
	var mutex sync.Mutex
	fmt.Println("channel\tslice\tmutex_slice")
	for i := 0; i < 10; i++ {
		fmt.Printf("%v\t%v\t%v\n", bench1(ch), bench2(s), bench3(s, &mutex))
	}
}
```
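As a side note, the hand-rolled timing loop above can also be expressed with the standard `testing` package, which picks the iteration count automatically and reports nanoseconds per operation. This is a sketch of my own, not part of the original post; it uses a smaller `count` so a single run finishes quickly.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

const count = 100000 // smaller than the post's 1000000 to keep runs short

// benchChannel mirrors bench1: push then drain a buffered channel.
func benchChannel(b *testing.B) {
	ch := make(chan int, count)
	for n := 0; n < b.N; n++ {
		for i := 0; i < count; i++ {
			ch <- i
		}
		for i := 0; i < count; i++ {
			<-ch
		}
	}
}

// benchSlice mirrors bench2: direct slice writes and reads.
func benchSlice(b *testing.B) {
	s := make([]int, count)
	var v int
	for n := 0; n < b.N; n++ {
		for i := 0; i < count; i++ {
			s[i] = i
		}
		for i := 0; i < count; i++ {
			v = s[i]
		}
	}
	_ = v
}

// benchMutexSlice mirrors bench3: every slice access holds a mutex.
func benchMutexSlice(b *testing.B) {
	s := make([]int, count)
	var mu sync.Mutex
	var v int
	for n := 0; n < b.N; n++ {
		for i := 0; i < count; i++ {
			mu.Lock()
			s[i] = i
			mu.Unlock()
		}
		for i := 0; i < count; i++ {
			mu.Lock()
			v = s[i]
			mu.Unlock()
		}
	}
	_ = v
}

func main() {
	// testing.Benchmark runs each function standalone and returns a result
	// whose String() includes iterations and ns/op.
	fmt.Println("channel:    ", testing.Benchmark(benchChannel))
	fmt.Println("slice:      ", testing.Benchmark(benchSlice))
	fmt.Println("mutex_slice:", testing.Benchmark(benchMutexSlice))
}
```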
Test environment:
CPU: i7-3770
MEMORY: 32G
OS: Ubuntu 12.04 x86_64
GO VERSION: 1.0.3
Output:
channel	slice	mutex_slice
53.774ms	1.735ms	37.103ms
52.978ms	1.058ms	36.928ms
52.864ms	1.058ms	36.928ms
53.337ms	1.069ms	38.073ms
53.695ms	1.055ms	37.801ms
53.45ms	1.063ms	37.683ms
53.678ms	1.161ms	37.767ms
3.568ms	1.052ms	37.792ms
53.47ms	1.06ms	37.185ms
52.78ms	1.062ms	36.899ms
Conclusion: with no contention, the buffered channel is somewhat slower than the mutex, and its running time is 40–50x that of direct array reads and writes. Implication: for the memory Clerk mentioned earlier, or any other frequently called resource dispatcher, it is better to keep the memory Clerk confined within a goroutine, and to avoid handing out resources across goroutines where possible. Of course, real performance tuning should be driven by profiling to locate the actual bottleneck, not by pure imagination.
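The implication above can be sketched as a two-level pool: each goroutine owns a lock-free local free list (the fast "slice" case from the benchmark) and only touches a shared, mutex-guarded pool when the local list is empty. The `localPool`/`sharedPool` names and the 4 KiB buffer size are my illustration, not from the original post.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedPool is the slow-path store of buffers, guarded by a mutex.
type sharedPool struct {
	mu   sync.Mutex
	free [][]byte
}

func (p *sharedPool) get() []byte {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.free); n > 0 {
		b := p.free[n-1]
		p.free = p.free[:n-1]
		return b
	}
	return make([]byte, 4096) // pool empty: allocate fresh
}

func (p *sharedPool) put(b []byte) {
	p.mu.Lock()
	p.free = append(p.free, b)
	p.mu.Unlock()
}

// localPool is owned by exactly one goroutine, so its get/put need
// no locks and no channel sends on the hot path.
type localPool struct {
	shared *sharedPool
	free   [][]byte
}

func (p *localPool) get() []byte {
	if n := len(p.free); n > 0 {
		b := p.free[n-1]
		p.free = p.free[:n-1]
		return b
	}
	return p.shared.get() // touch the lock only on a local miss
}

func (p *localPool) put(b []byte) {
	p.free = append(p.free, b)
}

func main() {
	shared := &sharedPool{}
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			pool := &localPool{shared: shared} // one pool per goroutine
			for i := 0; i < 1000; i++ {
				b := pool.get()
				pool.put(b) // reused from the local list next time
			}
		}()
	}
	wg.Wait()
	fmt.Println("done")
}
```

Newer Go releases (1.3+) ship `sync.Pool`, which applies the same per-processor caching idea in the standard library.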