Go is a very pragmatic programming language: from the outset it has provided detailed information about a program's running state. Once a product is live, tuning and troubleshooting can rely on this information. Here is a summary of the status-monitoring methods we use in our project.
pprof
Go comes with a pprof tool that can do CPU and memory profiling. The official blog has an article about its usage: "Profiling Go Programs".
The official article explains how to analyze the data with the pprof tool, but it shows only one way of collecting that data. For a long time I mistakenly believed that the CPU and memory profiles had to be enabled when the program starts; in fact, both can be turned on and off while the program is running.
The pprof package is also not limited to CPU and memory profiling: it provides a Lookup function for retrieving heap state, thread-creation state, goroutine state, and so on.
Here are the pprof features I use in my project (code snippet):
    // cpuProfile is the *os.File of the CPU profile in progress, or nil.
    case "lookup heap":
        p := pprof.Lookup("heap")
        p.WriteTo(os.Stdout, 2)
    case "lookup threadcreate":
        p := pprof.Lookup("threadcreate")
        p.WriteTo(os.Stdout, 2)
    case "lookup block":
        p := pprof.Lookup("block")
        p.WriteTo(os.Stdout, 2)
    case "start cpuprof":
        // Start CPU profiling only if it is not already running.
        if cpuProfile == nil {
            if f, err := os.Create("game_server.cpuprof"); err != nil {
                log.Printf("start cpu profile failed: %v", err)
            } else {
                log.Print("start cpu profile")
                pprof.StartCPUProfile(f)
                cpuProfile = f
            }
        }
    case "stop cpuprof":
        // Stop profiling and flush the data to the file.
        if cpuProfile != nil {
            pprof.StopCPUProfile()
            cpuProfile.Close()
            cpuProfile = nil
            log.Print("stop cpu profile")
        }
    case "get memprof":
        if f, err := os.Create("game_server.memprof"); err != nil {
            log.Printf("record memory profile failed: %v", err)
        } else {
            // Run a GC first so the heap profile reflects live objects.
            runtime.GC()
            pprof.WriteHeapProfile(f)
            f.Close()
            log.Print("record memory profile")
        }
Both "lookup goroutine" and "lookup heap" have helped me solve real problems. Once, an intranet test server deadlocked because of a bug in some feature logic. Using "lookup goroutine" to get the current call stacks of all running goroutines, I could quickly find which goroutine's call had deadlocked.
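The snippet above does not show the "lookup goroutine" case. A minimal sketch of what such a dump could look like follows; the helper name dumpGoroutines and the blocked goroutine in main are made up for illustration and are not taken from the project's code.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// dumpGoroutines (hypothetical helper) returns the stack traces of all
// current goroutines. With debug=2, WriteTo prints one full stack per
// goroutine, in the same form as an unrecovered panic.
func dumpGoroutines() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A goroutine that waits forever, standing in for a stuck handler.
	block := make(chan struct{})
	started := make(chan struct{})
	go func() {
		close(started)
		<-block
	}()
	<-started

	dump, err := dumpGoroutines()
	if err != nil {
		fmt.Println("dump failed:", err)
		return
	}
	fmt.Print(dump)
}
```

In the dump, a goroutine stuck on a channel or lock shows the call chain that led to the blocking operation, which is what makes a deadlock easy to locate.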
"lookup heap" shows how the heap is distributed, so a memory leak can be located quickly. It also reports the number of objects and the pause time of each GC run, which is useful for GC tuning of Go programs.
"start cpuprof" and "get memprof" enable CPU and memory profiling on a live process. One detail to note: my program originally implemented its own daemon mode to run in the background. After I added "start cpuprof", enabling CPU profiling online made the process stop responding, which did not happen when the program was started without daemon mode. In the end I had to remove my own daemon-mode implementation and use nohup to run the program in the background instead.
If you have also implemented your own daemon mode, watch out for this.
There is one more detail to note for CPU and memory profiles: the binary that produced the profile must match the binary given to pprof for analysis. At the very least, the source code and source paths used at compile time must be the same, because the analysis depends on the debugging information in the binary. Otherwise the generated profile report will be inaccurate.
GOGCTRACE
Go provides some useful environment variables that let you adjust runtime settings without modifying code. GOMAXPROCS, for example, can be set either through the environment or in code; I prefer the environment variable because it is more flexible.
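For comparison, here is a small sketch of the in-code alternative using runtime.GOMAXPROCS. The helper name setProcs is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// setProcs (hypothetical helper) changes GOMAXPROCS in code and reports
// the previous and the new value.
func setProcs(n int) (prev, now int) {
	prev = runtime.GOMAXPROCS(n) // returns the previous setting
	now = runtime.GOMAXPROCS(0)  // 0 only queries, it changes nothing
	return prev, now
}

func main() {
	// The initial value comes from the environment variable if it is set.
	fmt.Println("GOMAXPROCS at start:", runtime.GOMAXPROCS(0))

	prev, now := setProcs(2)
	fmt.Println("previous:", prev, "now:", now)
}
```

A value set in code overrides the one from the environment, which is one reason to prefer the environment variable when the value should be tunable per deployment.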
Among them, the GOGCTRACE environment variable plays a key role in GC tuning: with GOGCTRACE set to 1, the Go program prints GC-related information every time a collection runs. (Later Go releases replaced this variable with GODEBUG=gctrace=1.)
The usage is similar to this:
GOGCTRACE=1 ./my_go_program 2>log_file
This is syntax supported by the Linux shell; set this way, the environment variable applies only to the process being started.
The information is written to standard error, so the output is redirected to a file with 2>.
The content of the output looks like this:
gc16(8): 34+6+5 ms, 367 -> 365 MB 817253 -> 782045 (18216892-17434847) objects, (2182) handoff, (22022) steal, 553/244/51 yields
Here gc16 means the 16th GC, and the (8) that follows means it was executed by 8 threads, a number that corresponds to the GOMAXPROCS environment variable. 34+6+5 ms is the time consumed by the successive GC phases; the three add up to 45 ms, which is how long the program was paused during this GC.
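When a log file holds many such lines, the pause arithmetic can be automated. The following is only a sketch that assumes the "A+B+C ms" layout shown above; the names pauseRe and totalPauseMs are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// pauseRe matches the "A+B+C ms" phase timings in one trace line.
var pauseRe = regexp.MustCompile(`(\d+)\+(\d+)\+(\d+) ms`)

// totalPauseMs (hypothetical helper) sums the three phase times of a
// GC trace line. It returns false if the line has no such timings.
func totalPauseMs(line string) (int, bool) {
	m := pauseRe.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	total := 0
	for _, s := range m[1:] {
		n, err := strconv.Atoi(s)
		if err != nil {
			return 0, false
		}
		total += n
	}
	return total, true
}

func main() {
	line := "gc16(8): 34+6+5 ms, 367 -> 365 MB 817253 -> 782045 (18216892-17434847) objects"
	if ms, ok := totalPauseMs(line); ok {
		fmt.Println("pause:", ms, "ms") // pause: 45 ms
	}
}
```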
apiprof
apiprof is not a Go feature but something I built myself in our project. It lets me observe in real time how every communication interface of the program is performing.
apiprof monitors the execution time of all communication interfaces: each time a request is processed, a message containing the request type and its execution time is sent to the apiprof process.
The apiprof process aggregates the data from all requests and computes further statistics, such as the average and maximum execution time per request type, then outputs a table sorted by execution time, which makes it easy to spot the program's performance bottlenecks.
The average execution time across all requests in our game is currently around 30 microseconds. Looked at by individual request type, the more expensive requests take about 200 microseconds, and most of the rest are in the tens of microseconds. These figures may serve as a reference for others who are using Go to develop games.
From personal experience, I recommend keeping request times at the microsecond level. Once a request reaches the millisecond level, be on the alert and look for ways to optimize; at tens of milliseconds there should be a lot of room for optimization. Of course, these numbers depend on the type of project and its real-time requirements: in a distributed system, communication between nodes alone takes a few milliseconds, so requiring request processing times at the microsecond level is not realistic.