[Translation] The Go Scheduler


The Go Scheduler

Translator's Note

This article is a translation of Daniel Morsing's blog post, The Go scheduler. I personally feel that it makes goroutines and the scheduler easy to understand, and that it works very well as an introductory article.

Introduction

One of the biggest features of Go 1.1 is the new scheduler, contributed by Dmitry Vyukov. The new scheduler brings an exciting performance boost to parallel Go programs, so I figured I should write something about it.

Most of the content of this blog post has already been described in the original design document, which is a fairly comprehensible article, but somewhat technical.

Although the design document contains everything you need to know about the new scheduler, this post has pictures, so it is clearly superior.

Why the Go runtime needs a scheduler

Before we look at the new scheduler, we need to understand why it is needed: why create a user-space scheduler at all, when the operating system can already schedule threads for you?

The POSIX thread API is largely a logical extension of the existing UNIX process model, so threads get many of the same controls as processes: threads have their own signal masks, can be assigned CPU affinity, can be put into cgroups, and can be queried for the resources they use. All of these controls add overhead for features that Go programs simply do not need for their goroutines, and the overhead adds up quickly when your program has 100,000 threads.
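To make the contrast concrete, here is a minimal sketch (the function name `spawn` and the counts are my own, for illustration) that launches 100,000 goroutines. Each goroutine starts with only a few kilobytes of stack, so this completes in a fraction of a second, while creating the same number of OS threads would exhaust most systems:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// spawn launches n goroutines and waits for all of them to finish,
// returning how many actually ran.
func spawn(n int) int64 {
	var wg sync.WaitGroup
	var ran int64
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			atomic.AddInt64(&ran, 1)
		}()
	}
	wg.Wait()
	return ran
}

func main() {
	// 100,000 goroutines are cheap; the equivalent number of POSIX
	// threads (each with signal masks, kernel stacks, etc.) is not.
	fmt.Println(spawn(100000)) // prints 100000
}
```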

Another problem is that the operating system cannot make informed scheduling decisions based on the Go model. For example, the Go garbage collector requires that all threads be stopped at collection time, and that memory be in a consistent state. This involves waiting for running threads to reach a point where we know memory is consistent.

When you have many threads being scheduled at random moments, odds are that you will have to wait for many of them to reach a consistent state. The Go scheduler can decide to schedule only at points where it knows memory is consistent. This means that when we stop for garbage collection, we only have to wait for the threads that are actively running on a CPU core.

Our cast of characters

There are generally three threading models. One is the N:1 model, where multiple user-space threads run on a single kernel thread; the advantage of this model is very fast context switches, but it cannot take advantage of multiple cores. Another is the 1:1 model, where each thread of execution corresponds to one system thread; it takes advantage of all the cores on the machine, but context switches are slow because they have to trap into the kernel.

Go uses the M:N model and tries to get the best of both worlds. It schedules an arbitrary number of goroutines onto an arbitrary number of system threads, so you get both fast context switches and full use of your system's cores. The main disadvantage of this approach is the complexity it adds to the scheduler.

To do its scheduling work, the Go scheduler uses three entities:

A triangle represents a system thread. It is managed by the operating system and behaves much like a POSIX thread. In the runtime code, it is called M (machine).
A circle represents a goroutine. It includes the stack, the instruction pointer, and other information important for scheduling the goroutine, such as any channel it might be blocked on. In the runtime code, it is called G.
A rectangle represents a scheduling context. You can think of it as a localized version of the scheduler that runs Go code on a single thread. It is the important part that lets us go from an N:1 scheduler to an M:N scheduler. In the runtime code, it is called P (processor).

In the figure we see two threads (M), each holding a context (P), and each context running a goroutine (G). In order to run goroutines, a thread must hold a context.

The number of contexts is set at startup to the value of the GOMAXPROCS environment variable, or by calling the runtime function GOMAXPROCS(). Normally this value does not change while the program is running. A fixed number of contexts means that only GOMAXPROCS threads are running Go code at any given time. We can use this to tune the invocation of a Go process for different machines, such as running Go code with 4 threads on a 4-core CPU.

The grayed-out goroutines are not running, but are waiting to be scheduled. They are arranged in lists called runqueues; a goroutine is added to the end of a runqueue whenever a go statement is executed. Once a context has run a goroutine up to a scheduling point, it pops a goroutine off its runqueue, sets up the stack and instruction pointer, and begins running that goroutine.
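The enqueue/dequeue behavior above can be sketched as a toy model. This is purely illustrative: the real runtime stores a P's local runqueue in a fixed-size ring buffer, not a slice, and the type definitions here are simplified stand-ins:

```go
package main

import "fmt"

// G is a toy stand-in for a goroutine descriptor.
type G struct{ id int }

// P is a toy stand-in for a context with a local runqueue.
type P struct{ runq []*G }

// enqueue appends a goroutine to the tail of the runqueue,
// as happens when a `go` statement is executed.
func (p *P) enqueue(g *G) { p.runq = append(p.runq, g) }

// dequeue pops the goroutine at the head, which the context
// would set up (stack, instruction pointer) and run next.
// It returns nil when the runqueue is empty.
func (p *P) dequeue() *G {
	if len(p.runq) == 0 {
		return nil
	}
	g := p.runq[0]
	p.runq = p.runq[1:]
	return g
}

func main() {
	p := &P{}
	for i := 1; i <= 3; i++ {
		p.enqueue(&G{id: i}) // three `go` statements
	}
	for g := p.dequeue(); g != nil; g = p.dequeue() {
		fmt.Println("running G", g.id) // runs G 1, 2, 3 in order
	}
}
```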

To avoid mutex contention, each context has its own local runqueue. A previous version of the Go scheduler had only a single global runqueue with a mutex protecting it. Threads were often blocked waiting for the mutex to unlock. This gets really bad when you have a 32-core machine.

As long as all contexts have goroutines to run, the Go scheduler keeps scheduling in this steady state. However, there are a couple of scenarios that can change that.

Who you gonna (sys)call?

Now you might wonder: why have contexts at all? Can't we just put the runqueues on the threads and get rid of the contexts? Not really. The reason we have contexts is so that we can hand them off to other threads when the currently running thread needs to block.
An example of when we need to block is when we make a system call. Since a thread cannot both execute code and be blocked in a system call, we need to hand off the context so it can keep scheduling.

Here we see a thread giving up its context so that another thread can run it. The scheduler makes sure there are enough threads to run all the contexts. M1 in the illustration might have been created just to handle this system call, or it might come from a thread cache. The syscalling thread keeps holding onto the goroutine that made the system call, since it is technically still executing, albeit blocked in the OS.

When the system call returns, the thread must try to get a context in order to keep running the returning goroutine. The normal mode of operation is to steal a context from one of the other threads. If it cannot steal one, it puts the goroutine on the global runqueue, puts itself in the thread cache, and goes to sleep.

Contexts pull from the global runqueue when their local runqueue is empty. Contexts also periodically check the global runqueue; otherwise, the goroutines on it could end up never running and starve to death.
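The fairness check above can be sketched as a pick-next-goroutine function. Note the assumptions: later versions of the runtime poll the global runqueue roughly once every 61 scheduling ticks, which is where the constant below comes from, but everything else here (the function `pickG`, representing runqueues as int slices) is a simplification of mine, not runtime code:

```go
package main

import "fmt"

// pickG sketches how a context chooses its next goroutine: usually
// from its local runqueue, but periodically from the global runqueue
// so that globally queued goroutines never starve. It returns the
// chosen goroutine id, the updated queues, and whether one was found.
func pickG(tick int, local, global []int) (int, []int, []int, bool) {
	// Periodic fairness check: occasionally prefer the global queue.
	if tick%61 == 0 && len(global) > 0 {
		return global[0], local, global[1:], true
	}
	if len(local) > 0 {
		return local[0], local[1:], global, true
	}
	// Local runqueue empty: fall back to the global runqueue.
	if len(global) > 0 {
		return global[0], local, global[1:], true
	}
	return 0, local, global, false
}

func main() {
	local := []int{1, 2}
	global := []int{100}

	g, local, global, _ := pickG(5, local, global)
	fmt.Println("tick 5 runs G", g) // local goroutine 1

	g, _, _, _ = pickG(61, local, global)
	fmt.Println("tick 61 runs G", g) // global goroutine 100
}
```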

Stealing work

The other situation that changes the steady state of the system is when one context's runqueue is empty and no goroutine can be scheduled. This can happen when the runqueues of the contexts are unbalanced, and it can cause a context to run out of its runqueue while the system still has work to do. To keep running Go code, a context can take goroutines from the global runqueue, but if there are no goroutines there either, the context will have to get them from somewhere else.

That somewhere else is the other contexts. When a context runs out of goroutines, it steals half of the goroutines from another context. This makes sure there is always work to do on every context, which in turn makes sure that all threads are working at their maximum capacity.
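The steal-half rule can be sketched in a few lines. This is a simplification: in the real runtime the thief grabs goroutines from the victim's ring buffer with atomic operations (see `runqsteal`), while here plain slices stand in for runqueues:

```go
package main

import "fmt"

// G is a toy stand-in for a goroutine descriptor.
type G struct{ id int }

// stealHalf takes half of the victim context's runqueue for a
// context that has run out of work, and returns both portions.
func stealHalf(victim []*G) (stolen, remaining []*G) {
	n := len(victim) / 2
	return victim[:n], victim[n:]
}

func main() {
	// A busy context with six queued goroutines...
	victim := []*G{{1}, {2}, {3}, {4}, {5}, {6}}

	// ...and an idle context that steals half of them.
	stolen, remaining := stealHalf(victim)
	fmt.Println("stolen:", len(stolen), "remaining:", len(remaining))
	// stolen: 3 remaining: 3
}
```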
