Tell me about Golang's runtime.

The runtime contains Go's interface to the operating system and the machinery for controlling goroutines. It also provides the debug and pprof facilities for troubleshooting and runtime performance analysis; the execution tracer, which captures events such as goroutine creation, lock and unlock operations, syscall entry and exit, GC-related events, stack size changes, and process start and exit; the race detector; and the CGO implementation. At its core, though, the runtime is the scheduler and the GC, and those are the main subjects of this article.


Scheduler


First, recall from operating systems that CPU time-slice scheduling is a resource-allocation policy: after task A finishes, the system chooses which task to run next so that some metric (total process execution time, disk seek time, and so on) is minimized. That is the concern of scheduling. So what is the Go runtime's scheduler, and why do we need one when the OS kernel already has a thread (process) scheduler?

Why does Go build its own? Think about how often we say that Go is concurrent at the language level. The reason we can say that is precisely that Go has its own scheduler.

So why, exactly? Threads carry their own signal masks, context, and assorted control information, much of which is irrelevant to a Go program, and context switches between threads are expensive in both time and resources. More important is the GC, explained in the next part of this article: Go's garbage collector needs to stop the world, pausing all goroutines so that memory is in a consistent state. The timing of a collection depends on memory usage and is therefore unpredictable, and if we handed scheduling over to the OS we would lose that control and have to wait for a large number of threads to stop working. So Go needs its own scheduler: it can manage goroutines itself and knows when memory is in a consistent state, which means that at collection time it only has to wait for the threads currently running Go code on CPU cores, rather than for every thread.
Every Go program ships with a runtime. The runtime is responsible for interacting with the underlying operating system and contains the scheduler that dispatches goroutines. The scheduler is built on three central concepts: G, M, and P.

Looking at the source in /src/runtime/proc.go, we can see the comments:
// Goroutine scheduler
// The scheduler's job is to distribute ready-to-run goroutines over worker threads.
//
// The main concepts are:
// G - goroutine.
// M - worker thread, or machine.
// P - processor, a resource that is required to execute Go code.
//     M must have an associated P to execute Go code, however it can be
//     blocked or in a syscall w/o an associated P.
//
// Design doc at https://golang.org/s/go11sched.

Let's also look at the startup sequence of a Go program:
// The bootstrap sequence is:
//
//	call osinit
//	call schedinit
//	make & queue new G
//	call runtime·mstart
//
// The new G calls runtime·main.

For a detailed walkthrough of this process, see: Golang internals (genius0101's blog).
So what exactly does the scheduler solve, and how does it manage goroutines?

Any self-managed scheduler must solve the problem of stack management: each goroutine has its own stack, so creating a goroutine means creating a stack for it at the same time, and that stack keeps growing as the goroutine executes. Stacks usually grow contiguously, and all threads in a process share one virtual address space, so with multiple threads each thread's stack needs a different start address, which forces you to estimate each stack's size before allocating it. Split stacks were one answer to this problem: when a stack is created, only a small piece of memory is allocated, and if a function call finds the stack running low, a new stack segment is allocated elsewhere. The new segment need not be contiguous with the old one; the call's arguments are copied into the new segment, and execution of the next function continues there. Go's runtime stack management is similar, but for higher efficiency it uses contiguous stacks (Golang continuous stack): a fixed-size stack is allocated first, and when the stack space runs out, a larger stack is allocated and the entire old stack is copied into it. This avoids the frequent memory allocations and releases that the split-stacks approach can cause.

A scheduler must also have a scheduling policy. Go uses preemptive scheduling: a goroutine's execution can be preempted. If a goroutine occupies the CPU for a long time without yielding, the runtime preempts it and gives the CPU time to other goroutines (see: Go preemptive Scheduler Design Doc). When the program starts, the runtime automatically creates a system thread that runs the sysmon() function. sysmon() runs for the entire life of the program, monitoring the state of each goroutine, deciding whether to run garbage collection, and so on. sysmon() calls the retake() function, and retake() iterates over all Ps; if a P is in the executing state and has been executing for too long, it is preempted.

retake() then calls preemptone() to set the running G's stackguard0 to stackPreempt, which makes the stack-space check fail on the next function call in that P and triggers morestack(). From there, in the goschedImpl() function, the G is unbound from its M by calling dropg(), then added to the global runnable queue by globrunqput(), and finally schedule() is called to pick a new runnable G for the current P.

For example, the go keyword starts a goroutine, so every time a go statement is executed, a goroutine is appended to the end of a run queue, and at the next scheduling point a goroutine is taken off the run queue and executed. Meanwhile, each P can move to another OS thread, which ensures there are enough threads to run all the Ps. This means a goroutine may switch between several OS threads when appropriate, or stay on one thread the whole time; the scheduler decides.

GC


The GC has always been a focus of the Go team's optimization work, and its performance keeps getting better:
(Figure: GC pause times, Go 1.5 vs 1.6)
(Figure: GC pause times, Go 1.7)
The graphs may not make the change obvious, so in numbers: in Go 1.4 a GC pause took around 300 milliseconds, but 1.5 optimized the GC heavily, compressing pauses to about 40 milliseconds; 1.6 brought them to 15-20 milliseconds, 1.6.3 to about 5 milliseconds, and 1.7 to under 3 milliseconds. The newly released 1.8 delivered a big surprise in low-latency GC: the "stop-the-world stack re-scanning" was eliminated, so GC stop-the-world (STW) time is usually under 100 microseconds, and often even under 10 microseconds; GC pauses are no longer Go's problem. As pauses came down, CPU usage went up: in 1.7.3 and 1.8 the GC uses a bit more CPU, trading some throughput for the big latency win. So the GC improvements will continue in Go 1.9, aiming for a good balance between throughput and low latency. After the 1.8 release, the 1.9 cycle has also introduced the idea of goroutine-level GC, so 1.9 may bring an even bigger boost.

GC Optimization Path:
    1. Before 1.3, Go used a rather crude, traditional mark-sweep algorithm.
    2. Version 1.3 improved this by making the sweep phase parallel.
    3. Version 1.5 brought a major improvement: an improved tri-color marking algorithm, described as a "non-generational, non-moving, concurrent, tri-color mark-and-sweep garbage collector". Besides standard tri-color collection, Go adds an auxiliary collection mechanism to prevent garbage from being produced faster than it can be collected. There are two main phases. In the mark phase, the GC marks objects and memory that are no longer in use, preparing for the sweep; this is itself split into two sub-phases: in the first, the application is paused to finish the previous sweep, then the concurrent mark phase begins, finding the memory still in use; in the second, mark termination, the application is paused again. Finally, in the sweep phase, unused memory is gradually reclaimed; this phase is asynchronous and does not STW.
    4. In 1.6, the finalizer scan was moved to the concurrent phase, noticeably improving GC performance for applications with many connections.
    5. 1.7 was one of the most improved releases in Go's history, and its GC improvements were also significant: concurrent stack shrinking achieves low latency while avoiding manual runtime tuning; the standard runtime is enough.
    6. 1.8 eliminated the GC's "stop-the-world stack re-scanning".

The Go GC is now doing very well, and the upcoming 1.9 will push GC optimization further toward balancing throughput and latency. We look forward to it!


Reference documents:

The evolution of the Go language GC: from 20 seconds to 100 microseconds
Goroutine Scheduler in Golang
https://www.zhihu.com/question/20862617