- Video Info
- Concurrency features of Go
- A simple transaction-processing example
- Characteristics of channels
- Analysis
- Constructing a channel
- Send, receive
- Blocking and recovery
- The sender is blocked
- Goroutine run-time scheduling
- How a goroutine blocks
- How a goroutine resumes execution
- What if the receiver blocks first?
- Summary
- Other channel operations
- Unbuffered channels
- select
- Why is Go designed this way?
Video Info #
Understanding Channels
by Kavya Joshi
at GopherCon 2017
https://www.youtube.com/watch?v=KBZlN0izeiY
Slides: https://github.com/gophercon/2017-talks/blob/master/KavyaJoshi-UnderstandingChannels/Kavya%20Joshi%20-%20understanding%20channels.pdf
Blog: https://about.sourcegraph.com/go/understanding-channels-kavya-joshi
Concurrency features of Go #
- goroutines: execute tasks independently, possibly in parallel
- channels: used for communication and synchronization between goroutines
A simple transaction-processing example #
Consider a non-concurrent program like the following:
```go
func main() {
	tasks := getTasks()

	// process each task
	for _, task := range tasks {
		process(task)
	}
}
```
It is easy to convert it to Go's concurrent style using the typical task-queue pattern:
```go
func main() {
	// create a buffered channel
	ch := make(chan Task, 3)

	// run a fixed number of workers
	for i := 0; i < numWorkers; i++ {
		go worker(ch)
	}

	// send tasks to the workers
	hellaTasks := getTasks()
	for _, task := range hellaTasks {
		ch <- task
	}
	...
}

func worker(ch chan Task) {
	for {
		// receive a task
		task := <-ch
		process(task)
	}
}
```
Characteristics of channels #
- Goroutine-safe: multiple goroutines can access a channel at the same time without race conditions
- Can be used to store and pass values between goroutines
- First-in, first-out (FIFO) semantics
- Can cause goroutines to block and unblock
Analysis #
Constructing a channel #
```go
// buffered channel
ch := make(chan Task, 3)

// unbuffered channel
ch := make(chan Task)
```
Recall the characteristics of channels listed above, especially the first two. If you set the built-in channel aside and had to design something goroutine-safe that can store and pass values, what would you do? Many people would reach for a queue protected by a lock, and in fact that is exactly what a channel is internally: a locked queue.
https://golang.org/src/runtime/chan.go
```go
type hchan struct {
	...
	buf      unsafe.Pointer // points to a ring buffer
	...
	sendx    uint           // send index
	recvx    uint           // receive index
	...
	lock     mutex          // mutex
}
```
buf is a simple ring-buffer implementation; sendx and recvx record the send and receive positions, respectively. The lock mutex ensures there are no race conditions.
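To make this concrete, here is a minimal sketch of such a locked ring queue in plain Go. It is illustrative only: the RingQueue name, the Task placeholder, and the boolean full/empty results are assumptions for the example, mirroring the roles of buf, sendx, recvx, and lock (the real channel blocks instead of returning false):

```go
import "sync"

// Task stands in for the element type from the earlier examples.
type Task struct{ ID int }

// RingQueue is a toy goroutine-safe ring queue mirroring hchan's
// buf/sendx/recvx/lock fields. Illustrative only.
type RingQueue struct {
	mu    sync.Mutex
	buf   []Task
	sendx int // index of the next slot to write
	recvx int // index of the next slot to read
	count int // number of elements currently stored
}

func NewRingQueue(n int) *RingQueue {
	return &RingQueue{buf: make([]Task, n)}
}

// Enqueue copies t into the ring; it reports false when the ring is
// full (a real channel send would block here instead).
func (q *RingQueue) Enqueue(t Task) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == len(q.buf) {
		return false
	}
	q.buf[q.sendx] = t
	q.sendx = (q.sendx + 1) % len(q.buf)
	q.count++
	return true
}

// Dequeue copies the oldest element out; it reports false when empty
// (a real channel receive would block here instead).
func (q *RingQueue) Dequeue() (Task, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == 0 {
		return Task{}, false
	}
	t := q.buf[q.recvx]
	q.recvx = (q.recvx + 1) % len(q.buf)
	q.count--
	return t, true
}
```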
Each ch := make(chan Task, 3) allocates space on the heap, creates and initializes an hchan struct there, and ch is a pointer to that hchan structure.
Because ch is itself a pointer, we can pass ch directly in goroutine function calls instead of taking &ch, so all goroutines using the same ch point to the same actual memory.
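A quick way to see this (reusing the Task type and worker from the example above): the channel value can be copied freely, and every copy still refers to the same underlying hchan:

```go
ch := make(chan Task, 3) // ch points to a heap-allocated hchan

go worker(ch) // pass ch by value; the worker's copy points to the same hchan

ch2 := ch     // copying a channel value copies only the pointer
ch2 <- Task{} // sends into the very same buffer the worker reads from
```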
Send, receive #
To keep the description simple, let G1 denote the goroutine running main() and G2 the worker's goroutine.
```go
// G1
func main() {
	...
	for _, task := range tasks {
		ch <- task
	}
	...
}
```
```go
// G2
func worker(ch chan Task) {
	for {
		task := <-ch
		process(task)
	}
}
```
Simple send and receive #
So what exactly happens when G1 executes ch <- task0?
- acquire the lock
- enqueue(task0) (this is a memory copy of task0)
- release the lock
That step is simple. Next, look at what G2 does to read the data with t := <-ch (a sketch of both paths follows the list below):
- acquire the lock
- t = dequeue() (again, this is a memory copy)
- release the lock
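Putting the two together, here is a hedged pseudocode sketch of the non-blocking fast path. The function names, the indexable buf, and lock/unlock are illustrative, not the actual runtime code (see chansend/chanrecv in runtime/chan.go for the real thing):

```go
// Illustrative pseudocode for the buffered fast path only.
func send(c *hchan, task Task) {
	lock(&c.lock)
	c.buf[c.sendx] = task // memory copy into the ring buffer
	c.sendx = (c.sendx + 1) % len(c.buf)
	unlock(&c.lock)
}

func recv(c *hchan) (t Task) {
	lock(&c.lock)
	t = c.buf[c.recvx] // memory copy out of the ring buffer
	c.recvx = (c.recvx + 1) % len(c.buf)
	unlock(&c.lock)
	return t
}
```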
This step is also very simple. Notice what these operations tell us: the only thing the goroutines share is the hchan struct, and all data in transit is memory-copied. This follows one of the core principles of Go's concurrency design:
"Do not communicate by sharing memory;
instead, share memory by communicating."
Blocking and recovery #
The sender is blocked #
Suppose G2 takes a long time to process each task, and during that time G1 keeps sending:
```go
ch <- task1
ch <- task2
ch <- task3
```
But on the next send, ch <- task4, there is nowhere to put the value, because the buffer of ch holds only 3 elements, so G1 blocks. When someone takes a task out of the queue, G1 is resumed. We all know that much; what concerns us today is not what happens, but how it happens.
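A small runnable illustration of this blocking behavior (the sleeps and prints are just for demonstration):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan int, 3)

	go func() {
		for i := 1; i <= 4; i++ {
			ch <- i // the 4th send blocks: the buffer holds only 3
			fmt.Println("sent", i)
		}
	}()

	time.Sleep(100 * time.Millisecond)
	fmt.Println("received", <-ch) // frees a slot; the blocked sender resumes
	time.Sleep(100 * time.Millisecond)
}
```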
Goroutine run-time scheduling #
First, goroutines are not operating-system threads but user-space threads. Goroutines are created and managed by the Go runtime, not the OS, which makes them much lighter than OS threads.
Of course, a goroutine ultimately runs on some thread, and what controls how goroutines run on threads is the scheduler in the Go runtime.
Go's runtime scheduler uses an M:N scheduling model: N goroutines run on M OS threads. In other words, multiple goroutines may run on a single OS thread.
The Go scheduler's M:N model uses three structures:
- M: an OS thread
- G: a goroutine
- P: a scheduling context; P owns a run queue holding all the runnable goroutines and their contexts
To run a goroutine G, a thread M must first hold a context P for it.
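The number of Ps is exactly what GOMAXPROCS controls. A quick way to inspect it (the prints are just for demonstration):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// GOMAXPROCS(0) queries the current value without changing it;
	// it is the number of Ps, i.e. how many goroutines can run in parallel.
	fmt.Println("Ps (GOMAXPROCS):", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines currently alive:", runtime.NumGoroutine())
}
```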
How a goroutine blocks #
So when ch <- task4 executes, the channel is full and G1 must pause. At that point:
- G1 calls into the runtime: gopark
- The Go runtime scheduler takes over
- It sets G1's state to waiting
- It severs the relationship between G1 and M (switches G1 out), so M is freed up; in other words, M is idle and can be given other work
- It takes a runnable goroutine G from P's run queue
- It establishes a new relationship between G and M (switches G in), so G is ready to run
- When the scheduler returns, the new G starts running, while G1 does not run; that is, G1 is blocked
As you can see from this process, from the goroutines' point of view G1 is blocked and the new G starts running, but the operating-system thread M is not blocked at all.
Since OS threads are much heavier than goroutines, avoiding OS-thread blocking here improves performance.
How a goroutine resumes execution #
Having understood blocking, let's look at how execution resumes. Before that, though, we need to understand the hchan struct a little more. Why? When the channel is no longer full, how does the scheduler know which goroutine to run next? And how does that goroutine know where to pick up its data?
Besides the fields mentioned earlier, hchan also defines two queues, sendq and recvq, holding the goroutines waiting to send and to receive, along with related information.
```go
type hchan struct {
	...
	buf   unsafe.Pointer // points to a ring buffer
	...
	sendq waitq          // goroutines waiting to send
	recvq waitq          // goroutines waiting to receive
	...
	lock  mutex          // mutex
}
```
waitq is a linked-list queue whose elements are sudog structs, defined roughly as:
```go
type sudog struct {
	g    *g             // the waiting goroutine
	elem unsafe.Pointer // points to the element to be received or sent
	...
}
```
https://golang.org/src/runtime/runtime2.go?h=sudog#L270
So during G1's blocking described above, what actually happens is:
- G1 creates a sudog variable for itself
- It appends that sudog to the sendq waiting queue, so that a future receiver can use this information to resume G1
All of this happens before the scheduler is invoked.
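A hedged pseudocode sketch of this blocking path; the names and the gopark call shape are illustrative, not the actual runtime source (the real code is in runtime/chan.go):

```go
// Illustrative pseudocode: what a send on a full channel roughly does.
func sendBlocking(c *hchan, g1 *g, task Task) {
	mysg := acquireSudog() // G1 creates a sudog for itself
	mysg.g = g1
	mysg.elem = &task      // points at the value waiting to be sent
	c.sendq.enqueue(mysg)  // join the channel's waiting-to-send queue
	gopark(...)            // hand off to the scheduler; G1 -> waiting
	// execution resumes here only after a receiver calls goready(G1)
}
```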
Now let's look at how G1 is resumed.
When G2 calls t := <-ch, the channel's state is: the buffer is full, and G1 is waiting in the send queue. G2 then does the following:
- G2 first executes dequeue(), taking task1 out of the buffer into t
- G2 pops a waiting sudog from sendq
- It enqueue()s the value that the popped sudog's elem points to into buf
- The goroutine recorded in that sudog, namely G1, must change from waiting to runnable
- So G2 notifies the scheduler that G1 is ready to be scheduled by calling goready(G1)
- The scheduler changes G1's state to runnable
- The scheduler pushes G1 onto P's run queue, so G1 resumes running when it is scheduled at some point in the future
- Control returns to G2
Note that it is G2 that pushes G1's elem into buf; this is an optimization. That way, when G1 eventually resumes, it does not have to acquire the lock again and enqueue(), which avoids the overhead of extra lock operations.
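In pseudocode, G2's receive with a waiting sender looks roughly like this (names are illustrative, not the actual runtime code):

```go
// Illustrative pseudocode: receive on a full channel with a waiting sender.
func recvWithWaitingSender(c *hchan) Task {
	lock(&c.lock)
	t := c.buf.dequeue()    // take the oldest buffered task into t
	sg := c.sendq.dequeue() // pop the blocked sender's sudog (G1)
	c.buf.enqueue(*sg.elem) // G2, not G1, moves G1's value into the buffer
	goready(sg.g)           // mark G1 runnable; the scheduler queues it on P
	unlock(&c.lock)
	return t // G1 never needs to re-acquire the lock when it resumes
}
```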
What if the receiver blocks first? #
The even cooler case is when the receiver blocks first.
If G2 executes t := <-ch first, buf is empty at that point, so G2 blocks. Its process goes like this:
- G2 creates a sudog struct for itself, with g pointing to itself (that is, G2) and elem pointing to t
- It pushes this sudog onto the recvq waiting-to-receive queue
- G2 needs to tell the scheduler that it should be paused, so it calls gopark(G2)
- As before, the scheduler changes G2's state to waiting
- It severs the relationship between G2 and M
- It takes a goroutine from P's run queue
- It establishes a new relationship between that goroutine and M
- It returns, and the new goroutine starts running
None of this should be unfamiliar by now. So when G1 comes along to send data, what does the process look like?
G1 could simply enqueue(task) and then call goready(G2). However, we can be smarter than that.
From the state of the hchan struct, we already know that after the task enters buf, G2 will resume, read its value, and copy it into t. So G1 can skip buf entirely and hand the data to G2 directly.
Goroutines normally have their own stacks and never access each other's stack data; channels are the exception. Here, because we already know the address of t (through the elem pointer), and because G2 is not running, we can safely write to it directly. When G2 resumes, it neither needs to acquire the lock again nor touch buf. This saves a memory copy and the overhead of lock operations.
Summary #
Other channel operations #
Unbuffered channels #
An unbuffered channel always behaves like the direct-send case just described:
- receiver blocks first → the sender writes directly to the receiver's stack variable
- sender blocks first → the receiver reads directly from the sender's sudog
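This direct hand-off shows up as the synchronous behavior of unbuffered channels: a send does not complete until a receiver is ready. A small runnable example:

```go
package main

import "fmt"

func main() {
	ch := make(chan string) // unbuffered
	done := make(chan struct{})

	go func() {
		fmt.Println("got:", <-ch) // the value is copied directly from the sender
		close(done)
	}()

	ch <- "hello" // blocks until the receiver above is ready
	fmt.Println("send completed")
	<-done
}
```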
select #
https://golang.org/src/runtime/select.go
- First lock all the channels involved
- Create a sudog for itself and add it to each channel's sendq or recvq (depending on whether the case sends or receives)
- Unlock all the channels, then park the goroutine that called select (gopark())
- Then, when any of the channels becomes available, the select's goroutine is scheduled to run again
- Resuming mirrors the pause sequence
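For reference, a typical select: the calling goroutine enqueues itself on every channel in the cases, parks, and is woken by whichever becomes ready first (channel names and timings here are just for demonstration):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	a := make(chan int)
	b := make(chan int)

	go func() { time.Sleep(50 * time.Millisecond); a <- 1 }()
	go func() { time.Sleep(80 * time.Millisecond); b <- 2 }()

	// The goroutine parks on both channels and resumes when one is ready.
	select {
	case v := <-a:
		fmt.Println("from a:", v)
	case v := <-b:
		fmt.Println("from b:", v)
	}
}
```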
Why is Go designed this way? #
Simplicity #
Prefer a lock-based queue over a lock-free implementation.
"Performance gains are not from thin air, but increase with complexity. "-Dvyokov
While the latter might perform better, that advantage does not necessarily outweigh the cost of the added code complexity.
Performance #
- Calling into the Go runtime scheduler keeps OS threads from blocking
- Reading and writing across goroutine stacks:
  - lets a goroutine be woken without acquiring the lock again
  - avoids some memory copies
Of course, every advantage has its price. The price here is implementation complexity: the runtime needs more sophisticated memory management, garbage collection, and stack-shrinking machinery.
Here, the performance gains outweigh the cost of the added complexity.
So throughout the code that implements channels, we can see the results of this trade-off between simplicity and performance.