Understanding coroutine concepts through Goroutines


Translated from http://wangzhezhe.github.io/blog/2016/02/17/golang-scheduler/

This post is mostly a digest of related articles found online. The original motivation was to understand what a goroutine in Golang actually is, along with its origins and the surrounding concepts. It turned out that this is essentially a matter of understanding the Golang scheduler, since goroutines are a central part of the scheduler's implementation. This is only a primer; a basic understanding is enough here, and if you want to dig into the details you should read the source code, using the better links in the references as a guide.

Supplement: synchronous vs. asynchronous, blocking vs. non-blocking

The difference between synchronous and asynchronous concerns the message-communication mechanism.

A synchronous call means the caller actively waits for the result: the call does not return until the result is available, and once it returns, the return value is in hand.

An asynchronous call returns immediately after it is issued, without the result. The caller does not get the result right away; instead, after the call is made, the callee notifies the caller through state, a notification, or a callback function.

Blocking versus non-blocking concerns the state of the program while it waits for the result (the return value) of a call.

A blocking call means the current thread is suspended until the result is returned; the calling thread does not resume until it has the result.

A non-blocking call means the call returns immediately even if the result is not yet available, and the current thread continues executing.

Note: blocking vs. non-blocking is orthogonal to synchronous vs. asynchronous.

Processes, threads, and coroutines

Basic understanding

The commonly accepted distinctions are:

Process: independent stack space and independent heap space; scheduling between processes is done by the OS.

Thread: independent stack space but shared heap space; scheduling between kernel threads is done by the OS.

Coroutine: independent stack space and shared heap space, with scheduling controlled by the user's own program. Coroutines are essentially similar to user-level threads, whose scheduling is likewise implemented in user space.

The top-voted answer in this post gives a fairly accessible and easy-to-understand introduction, summarized as follows:

First, the origin of concurrency. The original motivation was to let multiple programs appear, macroscopically, to execute at the same time: CPU time is sliced up, a program contains multiple independent logical flows, and at the macro level several logical flows run together. With multiple CPUs they can also run truly in parallel.

The next question is how to switch between multiple logical flows: if logic A has computed halfway and logic B cuts in, how is A's intermediate result saved? Running multiple logical flows concurrently on the same CPU naturally requires context switching. Hence the concept of a process, which manages the running and switching of programs through virtual memory, the process table, and so on.

As hardware developed further, one computer gained multiple CPUs, so each CPU can run one process. This is parallelism: genuinely simultaneous execution in the time sense.

Concurrency and parallelism naturally raise scheduling problems: how should we schedule so that CPU utilization is higher? This is what the kernel must consider. In essence it is a trade-off, because scheduling itself has a cost, so it depends on whether a given switch is worth doing.

The textbooks are clear on this: to satisfy the concurrency requirements above, the process exists as the basic unit that owns resources and is independently scheduled and dispatched. But creating, destroying, and switching processes carries a lot of overhead; if processes switch too often, system resources are eaten up by that overhead. So the granularity of control was refined further, separating the two attributes of "owning resources" and "being independently scheduled". A thread owns only a small subset of resources and shares the rest with the other threads of its process, so switching threads is significantly cheaper than switching processes. Operating-systems textbooks cover this in detail, so I won't repeat it.

Moving part of the scheduling function out of the kernel and implementing logical-flow scheduling inside the process keeps the advantages of concurrency while avoiding repeated system calls and reducing the cost of thread switching. This is the user-level thread: the scheduling function implemented at an even finer granularity.

User-level threads have two problems to consider: (1) a blocking I/O call suspends the entire process; (2) lacking clock interrupts (the clock interrupt is what lets the CPU preempt for process switching), an implementation may require each thread to voluntarily surrender control by invoking a method. A user-level thread scheduled cooperatively in this way is called a coroutine.

From this article it should be clear how coroutines were first proposed, why they were not adopted at the time, and how they re-emerged later. The key idea is that the concept of the coroutine is essentially about the active yield and resume mechanisms of control flows, together with the approximate implementations of coroutines in different languages.

For the advantages of using coroutines, refer here; the original text also sketches a model implemented in Python.

    • Switching between coroutines is controlled by the program itself, without the overhead of thread switching; compared with multithreading, the more threads there are, the more significant the coroutine's performance advantage becomes.
    • No locking mechanism is needed as in multithreading (because there is only one thread). How, then, to take advantage of multi-core CPUs? The simplest way is multiple processes plus coroutines. For example, a Golang program may work out its degree of parallelism at startup, like this:
```go
if *maxProcs < 1 {
	numProcs = runtime.NumCPU()
} else {
	numProcs = *maxProcs
}
runtime.GOMAXPROCS(numProcs)
```

Golang's implementation of the model

The core question is how the scheduler is implemented, which is indeed a fairly complex problem. The main reference is here. The content is based on Golang 1.1; later versions may have improved some of the details.

What does the scheduler need to do in the Golang runtime?

The first thing to clarify is why we need a scheduler at all. Since the OS can already schedule threads, why implement another scheduler in user space?

The POSIX thread API is really a logical extension of the existing UNIX process model, so controlling threads looks much like controlling processes. Threads can have their own signal masks and CPU affinity, can be placed under cgroup control, and can be queried for the resources they use. All of these control features add overhead, and none of them are needed for the way Golang uses goroutines.

From Golang's own perspective, if the OS does the scheduling, the granularity is too coarse and the timing is not optimal, because the OS cannot know some of the further information available to the Golang runtime. For example, when the Golang GC starts, it needs to guarantee two things:

1. All the threads are stopped.

2. Memory must be in a consistent state. (What exactly does memory consistency mean here?) This requires the runtime, when starting the GC, to wait until all running threads have reached memory consistency.

With many threads being scheduled at more or less "random" points in time (clock interrupts under OS scheduling?), you would frequently have to wait for all of them to reach a consistent state. If the scheduler is implemented by Golang itself, it can decide to schedule only at points where all threads have reached memory consistency, so it can choose scheduling points more efficiently: when we are ready to garbage-collect, we only need to wait for the goroutines currently running to stop.

(From Dave Cheney's Gopher China slides.) Each goroutine consumes at least 2 KB of memory, and 2048 bytes × 1,000,000 goroutines ≈ 2 GB; that is, a machine with 2 GB of memory can hold up to about a million goroutines. So every time you use the go keyword, be clear about how that goroutine will exit; if you cannot answer that question explicitly, you may have a potential memory leak. Of course, some goroutines are meant to run until the main function finishes, and knowing which is also a trick of GC optimization.

So:

Never start a goroutine without knowing how it'll stop

A basic introduction to the scheduler model in Golang

Typically there are three threading models:

    • N:1 — N user-level threads on one kernel-level thread. Switching between user-level threads is fast, but the benefits of multicore cannot be exploited.
    • 1:1 — one user-level thread per kernel-level thread. Multicore can be exploited, but thread switching is slow, since it requires trapping into the kernel via system calls.

The scheduler in Golang uses the third model, M:N: it exploits multicore while keeping context switches fast. The downside is that this complicates the scheduler's implementation.

The Golang scheduler contains the following basic elements (shown as shapes in the original post's figure):

The triangle M represents an OS thread, managed by the OS and working just like a normal POSIX thread.

The circle G represents a goroutine. It has its own stack, instruction pointer (program counter), a reference to its M, and the other information needed to schedule it, such as the channel it is blocked on. This is the state that must be saved when the goroutine is taken off the CPU, to be reloaded into the registers the next time it is scheduled.

The rectangle P represents a context for scheduling. It can be understood as a scheduler running in a separate thread, or as a local processor. This component is the key step from an N:1 scheduler to an M:N scheduler.

The figure shows the general situation: there are two kernel threads (M), each holding a context (P), and each running a goroutine (G). To run goroutines, a kernel thread must hold a context.

The number of contexts is set by the GOMAXPROCS environment variable at startup, and can also be set with the runtime's GOMAXPROCS() function. Typically this value is constant during a program's run. The Ps are the components actually responsible for running Golang code, and the number of Ps can be adjusted to the number of processors actually available. (So GOMAXPROCS controls the number of Ps in the picture; a P feels like a container: the code running inside it can be swapped, but nothing runs without that environment.)

The goroutines marked gray are not running; they are runnable, ready to be scheduled (not-running-but-to-be-scheduled). They sit in lists called runqueues, and a new goroutine is added to the tail of a runqueue. At a scheduling point, when the context needs to run a goroutine, it pops one from the list, sets up the corresponding stack and instruction pointer, and the goroutine starts running.

To reduce contention, each context has its own runqueue (older versions apparently had only a single global runqueue). This is, of course, the most general case; reality is more complicated than this.

What happens on a system call (syscall)

Why have a context (the P in the diagram) at all? Why not just hang the runqueues off the threads directly? Because the P can be handed off to a new thread when the thread currently running it gets stuck.

For example, suppose a goroutine makes a system call. Since a thread cannot execute code while it is blocked in a syscall, the P can take the remaining goroutines to another OS thread, as in the figure.

Here the original kernel thread M0 gives up its context: M0 enters the blocked state, and the context is bound to a new kernel thread M1. The scheduler guarantees there are enough threads to run all the contexts. The original M0 is still the owner of the goroutine that made the syscall, because essentially that goroutine is still executing, albeit blocked in the OS.

When the syscall returns, M0 must obtain a context to keep running its goroutine, because, as analyzed above, a goroutine can only run with a context backing it. The usual way is to try to steal a context from another thread; if none is available, the goroutine is placed on the global runqueue, and the thread puts itself into the thread cache and goes to sleep.

When a context's local runqueue is empty, it can take a goroutine from the global runqueue. Contexts also periodically check the global runqueue; otherwise goroutines there might never run and would "starve" while each context keeps running the goroutines in its local runqueue.

This handling of syscalls means Golang is inherently multithreaded at runtime, even with GOMAXPROCS set to 1, because when a syscall occurs the P moves on to a newly started thread. Notably, Golang does not let users create OS-level threads directly; the runtime alone decides when to create threads based on the actual situation. Users can only create goroutines: the fewer resources users must manage, the simpler things are, and the happier users are.

Work stealing

Work stealing is a scheduling strategy: if the number of goroutines per context becomes unbalanced, the Golang scheduler rebalances by "stealing". One case is taking goroutines from the global runqueue to keep running; the other is stealing from another context, e.g. taking half of the goroutines from its local runqueue, as in the figure. This ensures every context has some work to do, which in turn keeps all threads working at full capacity.

Choosing the scheduling points

As the earlier analysis shows, an important benefit of implementing your own scheduler is that the runtime itself decides when to schedule. So what are the concrete situations? (Reference here.)

Roughly, the basic cases are:

    • The runtime.park function is called, which puts the goroutine into the waiting state and gives up the CPU. runtime.park may be called on channel reads and writes, by the network poller, and by timers.
    • The runtime.Gosched function also makes the current goroutine give up the CPU, but unlike park, Gosched sets the goroutine to the runnable state and puts it into the scheduler's global wait queue (runqueue).
    • Some system calls trigger rescheduling. As in the syscall case above, the runtime has a monitoring goroutine that scans the others; if it finds one stuck in a syscall, then, as analyzed earlier, a new M is created, that P is taken over, and the P starts running other goroutines. When the system call finishes, the original goroutine finds it has no P and cannot execute, so it is placed on the global runqueue and its original thread goes to sleep.

Summary

Threads are too coarse-grained and carry a lot of extra overhead → goroutines do not need that overhead and allow finer-grained control → the Golang scheduler model (the threading models; the meaning of M, P, G) → (the advantage of M, P, G) when a G blocks, its P can move to another M and run other Gs; when the original G's system call completes, it steals a P back from elsewhere, improving resource utilization.

The essence, seen through the basic model of the Golang scheduler, is that the ultimate goal is to make the fullest use of all resources, moving them wherever they can be used. Compare, for example, k8s's scheduling strategy.

The actual implementation is of course far more complex; for example, this more detailed analysis of the Golang scheduler is also worth reading.

If you want to go deeper, these are the essentials to get straight first. I recommend following an expert's line of thought piece by piece through the relevant material.

There is also a deeper, language-level way in: many people post about how a language works and what its flaws are, and following their paths to see exactly how the details work can improve your understanding a great deal.

For example, refer to this

Resources

Discussion of concurrency model in Golang and JVM

http://www.nyankosama.com/2015/04/03/java-goroutine/

Some introductions to coroutines

http://www.cnblogs.com/wonderKK/p/4062591.html

http://blog.youxu.info/2014/12/04/coroutine/

Zhihu Related Posts

http://www.zhihu.com/question/20511233

https://www.zhihu.com/question/20862617

http://www.zhihu.com/question/32218874

The past, present, and future of coroutines (a classic tracing their evolution from COBOL onward): http://www.tuicool.com/articles/BNvUfeb

http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/0013868328689835ecd883d910145dfa8227b539725e5ed000

Daniel's blog contains many articles about Golang, such as on the GC:

http://morsmachine.dk/

An expert's Golang study notes (from a source-code perspective)

https://github.com/qyuhen/book

About the Golang scheduler (also quite accessible)

http://skoo.me/go/2013/11/29/golang-schedule/

Posted by Wangzhe at 3:53 PM · golang
