2011 Goroutine Performance Test


Note: the forwarded article is from 2011, so the data is dated and should be treated as a reference only; it still illustrates Go's strengths.

The original is here: http://en.munknex.net/2011/12/golang-goroutines-performance.html

———————— Translation begins ————————

Overview

In this article, I will try to evaluate the performance of goroutines. A goroutine is something like a lightweight thread; together with channels, it is built into Go to provide native multitasking.

The documentation tells us:

It is practical to create hundreds of thousands of goroutines in the same address space.

So the focus of this article is to measure how much such massive concurrency actually costs, and where its limits lie.

Memory

The documentation does not record exactly how much space a new goroutine needs; it only says that a few kilobytes are required. A few measurements of my own suggest the value is about 4-4.5 KB per goroutine. That means roughly 5 GB is enough to run one million goroutines.
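To get a feel for how such a number can be measured, here is a minimal sketch in present-day Go (my own illustration, not the author's code; the goroutine count and the use of MemStats.Sys are assumptions, and the exact per-goroutine figure will differ from the 2011 runtime):

package main

import (
        "fmt"
        "runtime"
)

func main() {
        const n = 100000 // number of idle goroutines to start (illustrative)

        block := make(chan struct{}) // never written to, so the goroutines stay alive

        var before, after runtime.MemStats
        runtime.GC()
        runtime.ReadMemStats(&before)

        for i := 0; i < n; i++ {
                go func() { <-block }()
        }

        runtime.GC()
        runtime.ReadMemStats(&after)

        // Sys is the total memory obtained from the OS; dividing the growth by n
        // gives only a rough per-goroutine footprint.
        fmt.Printf("~%d bytes per goroutine\n", (after.Sys-before.Sys)/n)
}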

Performance

Let's figure out how much it costs to run a function in a goroutine. As you may already know, this is very simple: just add the go keyword before the function call:

go testFunc()

Goroutines are multiplexed onto OS threads. By default, if the GOMAXPROCS environment variable is not set, the program uses only one thread. To take advantage of all CPU cores, you must set its value, for example:

export GOMAXPROCS=2

This value is read at run time, so there is no need to recompile the program each time it changes.
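For completeness, the value can also be read or changed from inside the program with runtime.GOMAXPROCS; a tiny sketch (my addition, not part of the original article):

package main

import (
        "fmt"
        "runtime"
)

func main() {
        // GOMAXPROCS(0) only reports the current setting without changing it.
        fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

        // Setting it from code overrides whatever the environment variable says.
        runtime.GOMAXPROCS(2)
        fmt.Println("GOMAXPROCS now:", runtime.GOMAXPROCS(0))
}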

My assumption is that most of the overhead goes into creating goroutines, switching between them, migrating goroutines from one thread to another, and communication between goroutines on different threads. To keep the discussion manageable, let's start with the single-thread case.

All the tests were done on my nettop, a low-cost Intel desktop machine:

    • Atom D525 Dual Core 1.8 GHz

    • 4 GB DDR3

    • Go r60.3

    • Arch Linux x86_64

Method

This is the test function generator:

func genTest(n int) func(res chan<- interface{}) {
        return func(res chan<- interface{}) {
                for i := 0; i < n; i++ {
                        math.Sqrt(13)
                }
                res <- true
        }
}

Then a set of test functions is built that computes sqrt(13) 1, 10, 100, 1000, and 5000 times, respectively:

testFuncs := []func(chan<- interface{}){genTest(1), genTest(10), genTest(100), genTest(1000), genTest(5000)}

For each function I run it X times in a plain loop and then X times in goroutines, and compare the results. Garbage collection also has to be taken into account: to reduce its influence I explicitly call runtime.GC() after the goroutines finish and only then record the end time. For accuracy, every test is executed many times; the whole run took about 16 hours.
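The original benchmark code is not reproduced in the article; the sketch below is my reconstruction of the described procedure (run X times in a loop, then X times in goroutines, wait for all of them, call runtime.GC(), and only then take the end time). The iteration counts are illustrative:

package main

import (
        "fmt"
        "math"
        "runtime"
        "time"
)

func genTest(n int) func(res chan<- interface{}) {
        return func(res chan<- interface{}) {
                for i := 0; i < n; i++ {
                        math.Sqrt(13)
                }
                res <- true
        }
}

func main() {
        const x = 100000 // how many times each test function is executed (illustrative)
        test := genTest(1000)

        // Plain loop: call the function x times sequentially.
        res := make(chan interface{}, x)
        start := time.Now()
        for i := 0; i < x; i++ {
                test(res)
        }
        plain := time.Since(start)

        // Goroutines: start x goroutines, wait for all results,
        // force a garbage collection, and only then record the end time.
        res = make(chan interface{}, x) // fresh channel for the concurrent run
        start = time.Now()
        for i := 0; i < x; i++ {
                go test(res)
        }
        for i := 0; i < x; i++ {
                <-res
        }
        runtime.GC()
        conc := time.Since(start)

        fmt.Printf("plain loop: %v, goroutines: %v\n", plain, conc)
}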

One thread

export GOMAXPROCS=1

The chart shows that a single sqrt() computation run in a goroutine is about four times slower than the same computation run as a plain function call.

Take a look at the remaining four functions:

Notice that even 700,000 goroutines running concurrently do not drop performance below 80% of the plain version. And here comes the most impressive part: starting from sqrt() x1000, the total overhead is less than 2%, and at 5,000 iterations it is only about 1%. This overhead seems to be independent of the number of goroutines, so the only limiting factor left is memory.

In short:

If a standalone piece of work takes at least as long as roughly 1000 sqrt() computations and you want it to run concurrently, don't hesitate to put it in a goroutine. Smaller pieces can easily be grouped in batches of 10 or 100, in which case the performance loss is only about 20% and 2%, respectively (a minimal sketch of the grouping idea follows).
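Grouping here just means letting one goroutine handle a batch of small work items instead of a single one, so the goroutine overhead is amortized over the whole batch. A minimal sketch of that idea (my illustration; the task count and batch size of 100 are arbitrary):

package main

import (
        "math"
        "sync"
)

// processBatch runs one contiguous batch of tiny tasks inside a single
// goroutine, so the goroutine overhead is shared by the whole batch.
func processBatch(lo, hi int) {
        for i := lo; i < hi; i++ {
                math.Sqrt(float64(i)) // stand-in for a very small unit of work
        }
}

func main() {
        const tasks = 1000000
        const batch = 100 // illustrative batch size

        var wg sync.WaitGroup
        for lo := 0; lo < tasks; lo += batch {
                hi := lo + batch
                if hi > tasks {
                        hi = tasks
                }
                wg.Add(1)
                go func(lo, hi int) {
                        defer wg.Done()
                        processBatch(lo, hi)
                }(lo, hi)
        }
        wg.Wait()
}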

Multithreading

Now let's look at what happens when we want to use several processor cores. In my case there are two:

export GOMAXPROCS=2

Execute our test program again:

Here you can see that even though the number of cores has doubled, the execution time of the first two functions actually went up! Moving such tiny tasks between threads apparently costs far more than executing them. :) The current scheduler does not handle this case yet, but the Go developers promise to address it in the future.

As you can see, the last two functions make full use of both cores. On my nettop, their execution times are ~45µs and ~230µs, respectively.

Summary

Even though Go is a young language with a provisional scheduler implementation, goroutine performance is exciting, especially combined with how easy goroutines are to use. That impressed me. Thanks to the Go development team!

My rule of thumb: if the work takes less than 1µs I think twice before running it in a goroutine, and if it takes more than 1ms I never hesitate to use one. :)

