A high-frequency Go service was about to go live, and load testing showed that GC overhead was unusually high, around 10%-15% of CPU time. This article records the process and reasoning behind optimizing its GC. The production environment runs Go 1.10.
First, check whether GC is behaving abnormally. We can use gctrace to watch GC in real time; running the command below prints live GC information.
GODEBUG=gctrace=1 go run cmd/agent_bin.go
The output looks like this:
gc 45 @37.801s 11%: 0.19+627+0.29 ms clock, 0.38+424/621/0+0.59 ms cpu, 356->415->225 MB, 453 MB goal, 4 P
gc 46 @39.126s 11%: 2.9+927+0.16 ms clock, 5.8+342/925/0+0.33 ms cpu, 361->460->275 MB, 450 MB goal, 4 P
gc 47 @40.847s 12%: 0.24+1096+0.12 ms clock, 0.49+291/1007/0+0.24 ms cpu, 427->559->319 MB, 551 MB goal, 4 P
gc 48 @42.771s 12%: 0.26+841+0.12 ms clock, 0.52+377/830/0+0.24 ms cpu, 486->561->271 MB, 638 MB goal, 4 P
gc 49 @44.429s 12%: 3.1+890+0.40 ms clock, 6.2+492/833/0+0.81 ms cpu, 440->528->294 MB, 543 MB goal, 4 P
gc 50 @46.188s 12%: 0.23+1165+0.13 ms clock, 0.47+624/1158/0+0.27 ms cpu, 471->579->323 MB, 589 MB goal, 4 P
gc 51 @48.252s 13%: 0.26+1410+0.14 ms clock, 0.52+358/1336/9.9+0.28 ms cpu, 506->620->343 MB, 646 MB goal, 4 P
gc 52 @50.942s 13%: 0.27+806+0.51 ms clock, 0.55+403/805/200+1.0 ms cpu, 549->657->340 MB, 687 MB goal, 4 P
gc 53 @53.014s 13%: 0.10+857+0.36 ms clock, 0.21+467/851/94+0.73 ms cpu, 546->666->351 MB, 681 MB goal, 4 P
gc 45: the 45th GC since the program started; the trailing 4 P means four logical processors (P) took part.
11%: the fraction of available CPU time spent on GC since the program started.
0.19+627+0.29 ms clock: 0.19 ms of stop-the-world (STW) sweep termination, 627 ms of concurrent mark and scan, and 0.29 ms of STW mark termination.
0.38+424/621/0+0.59 ms cpu: the CPU time consumed by those same GC phases.
356->415->225 MB: heap size when the GC started, heap size when it finished, and the live heap afterwards.
453 MB goal: the target heap size for this cycle is 453 MB.
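If you would rather confirm these numbers from inside the process than parse gctrace output, the standard runtime/debug package exposes the same pause data. A minimal sketch (the polling loop and its interval are ours, not from the article):

package main

import (
    "fmt"
    "runtime/debug"
    "time"
)

func main() {
    var s debug.GCStats
    for {
        debug.ReadGCStats(&s)
        last := time.Duration(0)
        if len(s.Pause) > 0 {
            last = s.Pause[0] // most recent STW pause
        }
        fmt.Printf("NumGC=%d lastPause=%v pauseTotal=%v\n", s.NumGC, last, s.PauseTotal)
        time.Sleep(5 * time.Second)
    }
}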
According to the official description, GC on Go 1.10 should come in well under 100 ms, yet every cycle here takes more than a second. That is clearly unacceptable and points to a serious GC anomaly.
The troubleshooting approach: first use pprof to produce a cumulative heap-allocation graph of the entire call tree and identify which modules allocate abnormally. We embed pprof in the code to expose an HTTP endpoint, then generate an SVG from the terminal.
import _ "net/http/pprof"go func() { http.ListenAndServe("0.0.0.0:8080", nil)}()
Then run in a terminal:
go tool pprof -alloc_space -cum -svg http://127.0.0.1:8080/debug/pprof/heap > heap.svg
Then open the SVG in a browser and locate the most expensive call stack:
We pass -cum so that heap allocation is accumulated along call stacks, and the graph draws the heaviest paths with emphasis. It is immediately clear that copyBuffer in the io package, reached from our own Fetch function, creates the most GC pressure. That roughly identifies the source; but exactly which object in our function is the problem?
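As an aside, newer Go toolchains bundle a pprof with a built-in web UI, which can replace the static-SVG workflow. Assuming graphviz is installed, something like:

go tool pprof -http=:8081 http://127.0.0.1:8080/debug/pprof/heap

lets you browse the same graph interactively and switch the sample type to alloc_space inside the UI.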
Next we run escape analysis on the file, simply by passing it to go build --gcflags=-m:
Our source code looks like this:
package httpclient

import (
    "io"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
    "sync"
    "time"

    "refresh_agent/utils"
)

var client *http.Client
var buffPool sync.Pool

func init() {
    client = &http.Client{}
    http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = 2000
    http.DefaultTransport.(*http.Transport).MaxIdleConns = 20000
}

type HttpClient struct{}

func NewHttpClient() *HttpClient {
    httpClient := HttpClient{}
    return &httpClient
}

func (this *HttpClient) replaceUrl(srcUrl string, ip string) string {
    httpPrefix := "http://"
    parsedUrl, err := url.Parse(srcUrl)
    if err != nil {
        return ""
    }
    return httpPrefix + ip + parsedUrl.Path
}

func (this *HttpClient) downLoadFile(resp *http.Response) error {
    out, err := os.OpenFile("/dev/null", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666)
    defer out.Close()
    _, err = io.Copy(out, resp.Body)
    return err
}

func (this *HttpClient) Fetch(dstUrl string, method string, proxyHost string, header map[string]string, preload bool, timeout int64) (*http.Response, error) {
    // swap the host in the request URL for proxyHost
    newUrl := this.replaceUrl(dstUrl, proxyHost)
    req, _ := http.NewRequest(method, newUrl, nil)
    for k, v := range header {
        req.Header.Add(k, v)
    }
    req.Host = utils.GetUrlHost(dstUrl)
    client.Timeout = time.Duration(timeout) * time.Second
    resp, err := client.Do(req)
    if resp == nil || err != nil {
        return resp, err
    }
    if preload {
        err := this.downLoadFile(resp)
        if err != nil {
            return nil, err
        }
    }
    io.Copy(ioutil.Discard, resp.Body)
    resp.Body.Close()
    return resp, nil
}
The escape analysis output is as follows:
go build --gcflags=-m libs/httpclient/httpclient.go
# refresh_agent/vendor/github.com/garyburd/redigo/redis
gc 1 @0.078s 5%: 0.075+24+0.040 ms clock, 0.15+0.15/11/17+0.081 ms cpu, 16->16->11 MB, 42 MB goal, 2 P
# refresh_agent/vendor/github.com/imroc/req
gc 1 @0.071s 3%: 0.017+26+0.050 ms clock, 0.034+0.22/6.0/22+0.10 ms cpu, 16->17->11 MB, 42 MB goal, 2 P
# refresh_agent/vendor/github.com/BurntSushi/toml
gc 1 @0.054s 6%: 0.086+43+0.062 ms clock, 0.17+0.14/12/14+0.12 ms cpu, 16->17->12 MB, 42 MB goal, 2 P
# refresh_agent/vendor/github.com/modern-go/reflect2
gc 1 @0.054s 7%: 0.070+28+0.051 ms clock, 0.14+0.15/12/14+0.10 ms cpu, 16->17->10 MB, 42 MB goal, 2 P
gc 2 @0.334s 3%: 0.016+33+0.055 ms clock, 0.032+0.54/12/20+0.11 ms cpu, 37->38->14 MB, 54 MB goal, 2 P
# refresh_agent/vendor/github.com/json-iterator/go
gc 1 @0.045s 5%: 0.068+27+0.040 ms clock, 0.13+2.5/4.7/16+0.081 ms cpu, 16->16->11 MB, 42 MB goal, 2 P
gc 2 @0.211s 9%: 0.029+107+0.12 ms clock, 0.058+0.96/52/0+0.24 ms cpu, 39->44->26 MB, 58 MB goal, 2 P
# command-line-arguments
libs/httpclient/httpclient.go:18:6: can inline init.0
libs/httpclient/httpclient.go:26:6: can inline NewHttpClient
libs/httpclient/httpclient.go:19:24: &http.Client literal escapes to heap
libs/httpclient/httpclient.go:28:9: &httpClient escapes to heap
libs/httpclient/httpclient.go:27:2: moved to heap: httpClient
libs/httpclient/httpclient.go:31:62: leaking param: srcUrl
libs/httpclient/httpclient.go:37:25: httpPrefix + ip + parsedUrl.Path escapes to heap
libs/httpclient/httpclient.go:31:62: (*HttpClient).replaceUrl this does not escape
libs/httpclient/httpclient.go:31:62: (*HttpClient).replaceUrl ip does not escape
libs/httpclient/httpclient.go:44:18: out escapes to heap
libs/httpclient/httpclient.go:44:28: resp.Body escapes to heap
libs/httpclient/httpclient.go:41:59: leaking param content: resp
libs/httpclient/httpclient.go:41:59: (*HttpClient).downLoadFile this does not escape
libs/httpclient/httpclient.go:48:151: leaking param: dstUrl
libs/httpclient/httpclient.go:48:151: leaking param: method
libs/httpclient/httpclient.go:48:151: leaking param content: header
libs/httpclient/httpclient.go:69:30: resp.Body escapes to heap
libs/httpclient/httpclient.go:48:151: (*HttpClient).Fetch this does not escape
libs/httpclient/httpclient.go:48:151: (*HttpClient).Fetch proxyHost does not escape
<autogenerated>:1:0: leaking param: .anon0
<autogenerated>:1:0: leaking param: .this
<autogenerated>:1:0: leaking param: io.p
<autogenerated>:1:0: leaking param: .this
<autogenerated>:1:0: os.(*File).close .this does not escape
<autogenerated>:1:0: leaking param: .this
<autogenerated>:1:0: leaking param: .this
<autogenerated>:1:0: leaking param: .this
<autogenerated>:1:0: leaking param: context.key
<autogenerated>:1:0: leaking param: .this
Here we find that resp.Body escapes to the heap, which is not what we hoped: the intent is merely to redirect resp.Body into the null device, so ideally these objects would be allocated on the stack. But from this output alone we still cannot tell which step is repeatedly creating objects.
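A toy example helps with reading this kind of output. The file and function names below are illustrative, not from the project; compiling with go build -gcflags=-m reports "moved to heap: t" for newT, because the returned pointer outlives the stack frame, while sum's local never escapes:

package main

type T struct{ a, b int }

// newT: t is moved to the heap, since the returned pointer outlives this frame.
func newT() *T {
    t := T{}
    return &t
}

// sum: t never leaves the function, so it stays on the stack.
func sum() int {
    t := T{a: 1, b: 2}
    return t.a + t.b
}

func main() {
    _ = newT()
    _ = sum()
}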
We keep following the trail into io.CopyBuffer and read the source of io.Copy:
func Copy(dst Writer, src Reader) (written int64, err error) {
    return copyBuffer(dst, src, nil)
}

// CopyBuffer is identical to Copy except that it stages through the
// provided buffer (if one is required) rather than allocating a
// temporary one. If buf is nil, one is allocated; otherwise if it has
// zero length, CopyBuffer panics.
func CopyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
    if buf != nil && len(buf) == 0 {
        panic("empty buffer in io.CopyBuffer")
    }
    return copyBuffer(dst, src, buf)
}

func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
    // If the reader has a WriteTo method, use it to do the copy.
    // Avoids an allocation and a copy.
    if wt, ok := src.(WriterTo); ok {
        return wt.WriteTo(dst)
    }
    // Similarly, if the writer has a ReadFrom method, use it to do the copy.
    if rt, ok := dst.(ReaderFrom); ok {
        return rt.ReadFrom(src)
    }
    size := 32 * 1024
    if l, ok := src.(*LimitedReader); ok && int64(size) > l.N {
        if l.N < 1 {
            size = 1
        } else {
            size = int(l.N)
        }
    }
    if buf == nil {
        buf = make([]byte, size) // a fresh 32 KB slice is allocated here on every call
    }
Looking at the function flagged in the call stack, we can see that when Copy is called without a buffer, copyBuffer allocates a fresh buf internally on every call. That essentially pinpoints the problem.
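A quick micro-benchmark makes the cost visible; this sketch is ours, not from the article. The plainReader and plainWriter wrappers hide any WriteTo/ReadFrom fast paths, forcing io.Copy down the buffer-allocating branch, just as the copy from resp.Body to /dev/null was. Run with go test -bench=. -benchmem; the io.Copy variant should report roughly 32 KB allocated per operation:

package main

import (
    "bytes"
    "io"
    "io/ioutil"
    "testing"
)

// plainReader and plainWriter hide any WriteTo/ReadFrom methods on the
// underlying types, so io.Copy must take the branch that allocates a buffer.
type plainReader struct{ r io.Reader }

func (p plainReader) Read(b []byte) (int, error) { return p.r.Read(b) }

type plainWriter struct{ w io.Writer }

func (p plainWriter) Write(b []byte) (int, error) { return p.w.Write(b) }

var payload = bytes.Repeat([]byte("x"), 64*1024)

// Allocates a fresh 32 KB buffer inside copyBuffer on every iteration.
func BenchmarkCopy(b *testing.B) {
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        io.Copy(plainWriter{ioutil.Discard}, plainReader{bytes.NewReader(payload)})
    }
}

// Reuses one caller-owned buffer across all iterations.
func BenchmarkCopyBuffer(b *testing.B) {
    b.ReportAllocs()
    buf := make([]byte, 32*1024)
    for i := 0; i < b.N; i++ {
        io.CopyBuffer(plainWriter{ioutil.Discard}, plainReader{bytes.NewReader(payload)}, buf)
    }
}

With the reused buffer, the per-call 32 KB allocation disappears.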
So we have an object that is allocated and garbage-collected over and over; what do we do about it? The usual remedy is an object pool.
After the rework, we pass a buffer into the copy, and that buffer is taken from a sync.Pool:
var buffPool sync.Pool

func (this *HttpClient) downLoadFile(resp *http.Response) error {
    // err: write /dev/null: bad file descriptor
    out, err := os.OpenFile("/dev/null", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0666)
    defer out.Close()
    buf := buffPool.Get()
    if buf == nil {
        buf = make([]byte, 32*1024)
    }
    _, err = io.CopyBuffer(out, resp.Body, buf.([]byte))
    buffPool.Put(buf)
    return err
}
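One refinement worth noting, our suggestion rather than part of the original fix: sync.Pool has a New field, so Get never has to return nil, and pooling a *[]byte instead of a bare []byte avoids the extra allocation from converting a slice to interface{} on every Put. A minimal sketch, with copyPooled as a hypothetical helper name:

var buffPool = sync.Pool{
    New: func() interface{} {
        b := make([]byte, 32*1024)
        return &b
    },
}

func copyPooled(dst io.Writer, src io.Reader) (int64, error) {
    bp := buffPool.Get().(*[]byte) // never nil, thanks to New
    defer buffPool.Put(bp)
    return io.CopyBuffer(dst, src, *bp)
}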
With this change in place, the heap usage at this site dropped immediately:
Following the same approach, we tracked down all of the sites with frequent cumulative heap allocation, and GC time fell below 100 ms.
To close, a few GC optimization points worth recording; pay attention to these on high-frequency request paths:
1. Try not to return map or slice objects from functions that are called frequently; those returns put pressure on the GC.
2. Merge small objects.
3. For simple objects that a function creates frequently, returning the object by value performs better than returning a pointer.
4. When allocation cannot be avoided, use sync.Pool. Some argue sync.Pool is no longer recommended after Go 1.10, but under load testing it clearly still helps: cumulative heap allocation dropped by more than half.
5. Be careful with type conversions: the official idiom is particularly expensive, and 雨痕 (Yuhan)'s approach is recommended instead (see the sketch after this list).
6. Avoid creating slices over and over.
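For point 5, the conversion in question is presumably string/[]byte. Below is a sketch of the zero-copy trick popularized in 雨痕's Go study notes; this is an assumption on our part about which technique the author means. It relies on the runtime's internal string and slice header layouts of that era, and the resulting []byte must be treated as strictly read-only:

package main

import (
    "fmt"
    "unsafe"
)

// str2bytes reinterprets the string's backing array as a []byte without copying.
// Never write through the result: string data is immutable.
func str2bytes(s string) []byte {
    x := (*[2]uintptr)(unsafe.Pointer(&s)) // {data, len}
    h := [3]uintptr{x[0], x[1], x[1]}      // {data, len, cap}
    return *(*[]byte)(unsafe.Pointer(&h))
}

// bytes2str reinterprets a []byte as a string without copying.
func bytes2str(b []byte) string {
    return *(*string)(unsafe.Pointer(&b))
}

func main() {
    fmt.Println(bytes2str(str2bytes("no copy either way")))
}

The standard conversions copy the underlying bytes each time, which is exactly the kind of repeated small allocation the GC has to chase on a hot path.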