Translator's note: the original article dates from 2011, so the numbers are dated and should be treated as reference only, but they still illustrate the strengths of Go.
The original is here: http://en.munknex.net/2011/12/golang-goroutines-performance.html
———————— Translated article begins ————————
Overview
In this article I will try to evaluate the performance of goroutines. A goroutine is something like a lightweight thread; together with channels, goroutines are built into Go to provide native multitasking.
The documentation tells us:
It is practical to create hundreds of thousands of goroutines in the same address space.
So the focus of this article is to measure what performance penalty such massive concurrency actually carries.
Memory
The documentation does not state how much space a new goroutine needs, only that it is a few kilobytes. Measurements under different conditions consistently put the figure at 4-4.5 KB per goroutine, so a million goroutines need roughly 4-4.5 GB of memory, and 5 GB of RAM is just about enough to run them.
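As a hedged illustration (not the author's original measurement code), the per-goroutine cost can be estimated by starting many idle goroutines and comparing runtime.MemStats before and after; the count of 100,000 and the blocking channel are assumptions made for this sketch:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	const n = 100000             // number of idle goroutines to spawn (arbitrary for this sketch)
	block := make(chan struct{}) // keeps the goroutines alive and idle

	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	for i := 0; i < n; i++ {
		go func() { <-block }()
	}

	runtime.GC()
	runtime.ReadMemStats(&after)
	fmt.Printf("~%d bytes of OS memory per goroutine\n", (after.Sys-before.Sys)/n)
	close(block)
}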
Performance
Let's figure out how much it costs to run a function in a goroutine. As you may already know, this is very simple: just add the go keyword before the function call:
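For instance, a minimal runnable sketch; doWork is a hypothetical function used only for illustration:

package main

import (
	"fmt"
	"time"
)

// doWork is a hypothetical placeholder for any ordinary function.
func doWork(n int) {
	fmt.Println("working on", n)
}

func main() {
	doWork(42)    // a normal, synchronous call
	go doWork(42) // the same call, started as a goroutine; main continues immediately

	// Crude wait so the goroutine gets a chance to run before main exits.
	time.Sleep(10 * time.Millisecond)
}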
Goroutines are multiplexed onto threads. By default, if the GOMAXPROCS environment variable is not set, the program uses only one thread. To take advantage of all CPU cores, you must set its value, for example GOMAXPROCS=2 for the two cores used later in this article.
The value is read at run time, so the program does not have to be recompiled each time it changes.
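For example, a program started as GOMAXPROCS=2 ./program will use two threads. A small sketch for checking the effective value from inside the program; runtime.GOMAXPROCS(0) only reports the current setting without changing it:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Start as e.g. `GOMAXPROCS=2 ./program`; no recompilation is needed
	// when the value changes, only a restart.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0)) // 0 reports the current value
}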
My assumption is that most of the overhead goes into creating goroutines, switching between them, migrating goroutines from one thread to another, and communication between goroutines on different threads. To keep things simple, let's start with the single-threaded case.
All the tests were run on my nettop, a low-cost Intel-based small desktop machine.
Method
This is the test function generator:
func genTest(n int) func(res chan<- interface{}) {
	return func(res chan<- interface{}) {
		for i := 0; i < n; i++ {
			math.Sqrt(13)
		}
		res <- true
	}
}
Next is a set of test functions that compute sqrt(13) 1, 10, 100, 1000, and 5000 times, respectively:
testFuncs := []func(chan<- interface{}){genTest(1), genTest(10), genTest(100), genTest(1000), genTest(5000)}
Each function is executed X times in a plain loop and then X times in goroutines, and the results are compared. Garbage collection has to be taken into account: to reduce its impact I explicitly called runtime.GC() after the goroutines finished and only then recorded the end time. For accuracy, every test was executed many times; the entire run took about 16 hours.
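A minimal sketch of what such a measurement loop could look like, assuming an iteration count x and reusing the genTest generator from above; the author's exact harness and timing code are not reproduced here:

package main

import (
	"fmt"
	"math"
	"runtime"
	"time"
)

// genTest as defined earlier in the article.
func genTest(n int) func(res chan<- interface{}) {
	return func(res chan<- interface{}) {
		for i := 0; i < n; i++ {
			math.Sqrt(13)
		}
		res <- true
	}
}

func main() {
	const x = 100000 // iteration count, chosen arbitrarily for this sketch
	test := genTest(1000)
	res := make(chan interface{}, x)

	// x plain function calls in a loop.
	start := time.Now()
	for i := 0; i < x; i++ {
		test(res)
		<-res
	}
	plain := time.Since(start)

	// The same work, each call in its own goroutine.
	start = time.Now()
	for i := 0; i < x; i++ {
		go test(res)
	}
	for i := 0; i < x; i++ {
		<-res
	}
	runtime.GC() // explicit GC before taking the end time, as described above
	concurrent := time.Since(start)

	fmt.Println("plain:", plain, "goroutines:", concurrent)
}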
One thread
The chart shows that a single sqrt() calculation run in a goroutine is about four times slower than the same call run as a plain function.
Take a look at the remaining four functions:
You will notice that even 700,000 concurrently executing goroutines keep performance above 80%. Now for the most impressive part: starting with sqrt() x1000, the total overhead drops below 2%, and at 5000 repetitions it is only 1%. This overhead seems to be independent of the number of goroutines, so the only limiting factor is memory.
In short:
If a piece of independent code runs at least ten times longer than a sqrt() calculation and you want it to run concurrently, don't hesitate to put it in a goroutine. Besides, it is often easy to combine 10 or 100 such pieces into a single goroutine, losing only about 20% and 2% of performance, respectively.
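A hedged sketch of that grouping idea; processItem and the batch size of 100 are illustrative placeholders, not taken from the original:

package main

import "math"

// processItem stands in for one small piece of independent work.
func processItem(v float64) {
	math.Sqrt(v)
}

func main() {
	items := make([]float64, 1000)
	done := make(chan bool)

	const batch = 100 // group 100 small pieces per goroutine to amortize the overhead
	for start := 0; start < len(items); start += batch {
		go func(chunk []float64) {
			for _, v := range chunk {
				processItem(v)
			}
			done <- true
		}(items[start : start+batch])
	}

	// Wait for all batches to finish.
	for i := 0; i < len(items)/batch; i++ {
		<-done
	}
}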
Multithreading
Now let's see what happens when we use several processor cores; in my case there are two:
Execute our test program again:
Here you can see that even though the number of cores has doubled, the first two functions became slower! This is likely because moving goroutines between threads costs far more than actually executing them. :) The current scheduler does not handle this case yet, but the Go developers promise to address it in the future.
As you can see, the last two functions make full use of both cores. On my nettop their execution times are ~45µs and ~230µs, respectively.
Summary
Even though Go is a young language with a provisional scheduler implementation, the performance of goroutines is exciting, especially combined with how easy they are to use. That impressed me. Thanks to the Go development team!
If a task runs for less than 1µs, I would think twice before putting it in a goroutine; if it runs for more than 1ms, never hesitate to use one. :)