- Video Info
- Concurrency features of Go
- A simple example of task processing
- Characteristics of channels
- Analysis
- Constructing a channel
- Send, receive
- Blocking and resuming
- The sender is blocked
- Goroutine run-time scheduling
- How a goroutine is blocked
- How a goroutine resumes execution
- What if the receiver blocks first?
- Summary
- Other channel operations
- Unbuffered channels
- select
- Why is Go designed this way?
Video Info #
Understanding Channels
by Kavya Joshi
at GopherCon 2017
https://www.youtube.com/watch?v=KBZlN0izeiY
Slides: https://github.com/gophercon/2017-talks/blob/master/KavyaJoshi-UnderstandingChannels/Kavya%20Joshi%20-%20understanding%20channels.pdf
Blog: https://about.sourcegraph.com/go/understanding-channels-kavya-joshi
Concurrency features of Go #
- goroutines: execute each task independently, potentially in parallel
- channels: used for communication and synchronization between goroutines
A simple example of task processing #
Consider the following non-concurrent program:
```go
func main() {
    tasks := getTasks()

    // handle each task
    for _, task := range tasks {
        process(task)
    }
}
```
It is easy to convert this into Go's concurrent style using the typical task queue pattern:
```go
func main() {
    // create a buffered channel
    ch := make(chan Task, 3)

    // run a fixed number of workers
    for i := 0; i < numWorkers; i++ {
        go worker(ch)
    }

    // send tasks to the workers
    hellaTasks := getTasks()
    for _, task := range hellaTasks {
        ch <- task
    }
    ...
}

func worker(ch chan Task) {
    for {
        // receive a task
        task := <-ch
        process(task)
    }
}
```
Characteristics of channels #
- goroutine-safe: multiple goroutines can access the same channel simultaneously without data races
- can store and pass values between goroutines
- provides FIFO (first in, first out) semantics
- can cause goroutines to block and unblock
Analysis #
Constructing a channel #
```go
// buffered channel
ch := make(chan Task, 3)

// unbuffered channel
ch := make(chan Task)
```
Recall the channel characteristics listed above, especially the first two. Suppose the built-in channel did not exist and you had to design something goroutine-safe that can store and pass values: what would you do? Many people would reach for a queue protected by a lock, and in fact that is exactly what a channel is internally: a locked queue.
https://golang.org/src/runtime/chan.go
```go
type hchan struct {
    ...
    buf   unsafe.Pointer // points to a ring buffer
    ...
    sendx uint           // send index
    recvx uint           // receive index
    ...
    lock  mutex          // mutex
}
```
`buf` is a simple ring-buffer implementation; `sendx` and `recvx` record the positions of the next send and receive, respectively. The `lock` mutex ensures there are no race hazards.
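To make the idea concrete, here is a toy sketch of such a locked ring queue (illustrative only, not the runtime's actual code; the real `hchan` also parks and wakes goroutines when the buffer is full or empty):

```go
package main

import (
	"fmt"
	"sync"
)

// lockedQueue is a toy goroutine-safe ring queue, sketching the idea
// behind hchan's buf/sendx/recvx/lock fields.
type lockedQueue struct {
	mu    sync.Mutex
	buf   []interface{} // ring buffer
	sendx int           // next slot to write
	recvx int           // next slot to read
	count int           // number of elements currently stored
}

func newLockedQueue(size int) *lockedQueue {
	return &lockedQueue{buf: make([]interface{}, size)}
}

// enqueue stores v, reporting false if the buffer is full.
func (q *lockedQueue) enqueue(v interface{}) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == len(q.buf) {
		return false // a real channel would block the sender here
	}
	q.buf[q.sendx] = v
	q.sendx = (q.sendx + 1) % len(q.buf)
	q.count++
	return true
}

// dequeue removes the oldest value, reporting false if the buffer is empty.
func (q *lockedQueue) dequeue() (interface{}, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == 0 {
		return nil, false // a real channel would block the receiver here
	}
	v := q.buf[q.recvx]
	q.recvx = (q.recvx + 1) % len(q.buf)
	q.count--
	return v, true
}

func main() {
	q := newLockedQueue(3)
	q.enqueue("task0")
	v, _ := q.dequeue()
	fmt.Println(v) // task0
}
```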
Each call like `ch := make(chan Task, 3)` allocates an `hchan` struct on the heap, initializes it, and returns `ch` as a pointer to that `hchan`.
Because `ch` is itself a pointer, we can pass it directly in goroutine function calls instead of taking `&ch`, and all goroutines holding the same `ch` point to the same underlying memory.
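A minimal illustration of this point: the channel handle is passed by value, yet both goroutines operate on the same underlying channel.

```go
package main

import "fmt"

func main() {
	ch := make(chan string, 1)

	// ch is passed by value, but it is itself a pointer to the
	// underlying hchan, so the goroutine uses the same channel.
	go func(c chan string) {
		c <- "hello"
	}(ch)

	fmt.Println(<-ch) // hello
}
```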
Send, receive #
To simplify the description, let `G1` denote the goroutine running the `main()` function and `G2` the worker goroutine.
```go
// G1
func main() {
    ...
    for _, task := range tasks {
        ch <- task
    }
    ...
}
```
```go
// G2
func worker(ch chan Task) {
    for {
        task := <-ch
        process(task)
    }
}
```
Simple send and receive #
So what exactly happens when `G1` executes `ch <- task0`?
- acquire the lock
- `enqueue(task0)` (this makes a memory copy of `task0`)
- release the lock
That step is simple. Next, look at what `G2` does when it reads data with `t := <-ch`:
- acquire the lock
- `t = dequeue()` (again, this is a memory copy)
- release the lock
This step is also very simple. But notice something: the only thing the goroutines share is the `hchan` struct, while all the data flowing through the channel is copied through memory. This follows one of the core tenets of Go's concurrency design:
"Do not communicate by sharing memory; instead, share memory by communicating."
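A small runnable illustration of the memory-copy semantics: the value is copied into the channel at send time, so mutating the original afterwards does not change what the receiver sees.

```go
package main

import "fmt"

type Task struct{ ID int }

func main() {
	ch := make(chan Task, 1)

	task := Task{ID: 1}
	ch <- task   // the value is copied into the channel's buffer
	task.ID = 99 // mutating the original after the send...

	got := <-ch         // ...does not affect the copy the receiver gets
	fmt.Println(got.ID) // prints 1, not 99
}
```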
Blocking and resuming #
The sender is blocked #
Suppose `G2` takes a long time to process each task, and in the meantime `G1` keeps sending tasks:

```go
ch <- task1
ch <- task2
ch <- task3
```

But on the next send, `ch <- task4`, there is nowhere to put the task, because `ch` has a buffer of only 3. So `G1` blocks, and when someone takes a task out of the queue, `G1` is resumed. That much everyone knows; what concerns us today is not what happens, but how it happens.
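Here is a runnable sketch of that behavior (the `Sleep` is only there to make the ordering visible):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan string, 3)

	go func() {
		for _, task := range []string{"task1", "task2", "task3", "task4"} {
			ch <- task // the 4th send blocks: the buffer is full
			fmt.Println("sent", task)
		}
	}()

	time.Sleep(100 * time.Millisecond) // let the sender fill the buffer and block
	fmt.Println("received", <-ch)      // frees a slot, so the blocked send resumes
	time.Sleep(100 * time.Millisecond) // give the sender time to print "sent task4"
}
```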
Goroutine run-time scheduling #
First, a goroutine is not an operating system thread but a user-space thread. Goroutines are created and managed by the Go runtime rather than the OS, so they are much lighter than OS threads.
Of course, goroutines ultimately run on threads, and the scheduler in the Go runtime controls how goroutines are mapped onto them.
Go's runtime scheduler uses an M:N scheduling model: N goroutines run on M OS threads. In other words, several goroutines may run on a single OS thread.
The Go scheduler uses three structures:
- `M`: OS thread
- `G`: goroutine
- `P`: scheduling context

`P` owns a run queue holding all the runnable goroutines and their contexts. To run a goroutine `G`, a thread `M` must hold a context `P` for that goroutine.
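As a quick runnable illustration of how light goroutines are, the program below multiplexes far more goroutines than contexts; `runtime.GOMAXPROCS` sets the number of `P`s:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	// Two Ps: at most two goroutines execute Go code simultaneously,
	// yet we can still launch thousands of goroutines (M:N scheduling).
	runtime.GOMAXPROCS(2)

	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println("10000 goroutines multiplexed onto 2 Ps")
}
```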
How a goroutine is blocked #
So when `ch <- task4` executes, the channel is full and `G1` must pause. At that point:
- `G1` calls into the runtime: `gopark`
- the Go runtime scheduler then takes over
- it sets `G1`'s state to `waiting`
- it disconnects `G1` from `M` (switches `G1` out), so `M` is now free and can be given other work
- it takes a runnable goroutine `G` from `P`'s run queue
- it connects that `G` to `M` (switches `G` in), so the new `G` is ready to run
- when the scheduler returns, the new `G` starts running while `G1` does not, i.e. `G1` is blocked
As you can see from this process, from the goroutines' point of view `G1` is blocked and the new `G` is running, but the OS thread `M` is never blocked at all.
We know OS threads are much heavier than goroutines, so the runtime tries hard to avoid blocking OS threads, which improves performance.
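A runnable demonstration of the point: even with a single `P` (`GOMAXPROCS(1)`), a goroutine parked on a channel does not prevent other goroutines from running, because the underlying thread is never blocked.

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	runtime.GOMAXPROCS(1) // a single P: goroutines share one context

	ch := make(chan int) // unbuffered
	done := make(chan struct{})

	go func() {
		v := <-ch // this goroutine parks (gopark); the OS thread stays free
		fmt.Println("received", v)
		close(done)
	}()

	// main keeps running on the same P while the receiver is parked
	fmt.Println("main still running")
	ch <- 42
	<-done
}
```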
How a goroutine resumes execution #
Having seen how blocking works, let's look at how execution resumes. Before that, though, we need to understand a bit more of the `hchan` structure. When the channel is no longer full, how does the scheduler know which goroutine should continue running? And how does that goroutine know where to pick up its data?
In addition to the fields mentioned earlier, `hchan` defines two queues, `sendq` and `recvq`, holding the goroutines (and related information) waiting to send and waiting to receive, respectively.
```go
type hchan struct {
    ...
    buf   unsafe.Pointer // points to a ring buffer
    ...
    sendq waitq          // goroutines waiting to send
    recvq waitq          // goroutines waiting to receive
    ...
    lock  mutex          // mutex
}
```
`waitq` is a linked-list queue whose elements are `sudog` structures, defined roughly as:

```go
type sudog struct {
    g    *g             // the waiting goroutine
    elem unsafe.Pointer // points to the element to be received or sent
    ...
}
```
https://golang.org/src/runtime/runtime2.go?h=sudog#L270
So in the blocking process for `G1` described earlier, what actually happened was:
- `G1` creates a `sudog` variable for itself
- it then appends that `sudog` to the `sendq` waiting queue, so that a future receiver can use this information to resume `G1`

All of this happens before the scheduler is called. Now let's look at how `G1` is resumed.
When `G2` calls `t := <-ch`, the channel's state is: the buffer is full, and there is a `G1` waiting in the send queue. `G2` then does the following:
- `G2` first executes `dequeue()`, taking `task1` from the buffer into `t`
- `G2` pops a waiting-to-send `sudog` off `sendq`
- it `enqueue()`s the value pointed to by that `sudog`'s `elem` into `buf`
- the goroutine in the popped `sudog`, namely `G1`, must have its state changed from `waiting` to `runnable`
- so `G2` informs the scheduler that `G1` is ready to be scheduled by calling `goready(G1)`
- the scheduler sets `G1`'s state to `runnable`
- the scheduler pushes `G1` onto `P`'s run queue, so at some point in the future, when it is scheduled, `G1` resumes running
- control returns to `G2`
Note that it is `G2` that is responsible for `enqueue()`ing `G1`'s `elem` into `buf`, which is an optimization: when `G1` eventually runs again, it does not have to acquire the lock, `enqueue()`, and release the lock all over again. This avoids the overhead of taking the lock multiple times.
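This mechanism is observable from outside: with the buffer full and a sender parked on `sendq`, a single receive both yields the oldest buffered value and unblocks the sender, preserving FIFO order. A runnable sketch (the `Sleep` only ensures the sender has parked):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan int, 1)
	ch <- 1 // fills the 1-slot buffer

	go func() {
		ch <- 2 // buffer full: this goroutine parks on sendq
		fmt.Println("sender resumed")
	}()

	time.Sleep(100 * time.Millisecond) // let the sender block

	// One receive: we get the buffered 1, and the runtime moves the
	// waiting sender's 2 into buf and marks the sender runnable.
	fmt.Println(<-ch) // 1
	fmt.Println(<-ch) // 2
	time.Sleep(100 * time.Millisecond)
}
```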
What if the receiver blocks first? #
The even cooler part is the process when the receiver blocks first.
If `G2` executes `t := <-ch` first, `buf` is empty at that point, so `G2` blocks. Its process is this:
- `G2` creates a `sudog` structure variable for itself, with `g` pointing to itself (`G2`) and `elem` pointing to `t`
- it pushes this `sudog` onto the `recvq` waiting-to-receive queue
- `G2` calls `gopark(G2)` to tell the scheduler it needs to pause
- as before, the scheduler sets `G2`'s state to `waiting`
- it disconnects `G2` from `M`
- it takes a goroutine from `P`'s run queue
- it establishes the new goroutine's relationship with `M`
- it returns, and the new goroutine starts running
None of this should look unfamiliar by now. So when `G1` later sends data, what does the process look like?
`G1` could simply `enqueue(task)` and then call `goready(G2)`. However, we can be smarter than that.
From the state of the `hchan` structure, we already know that after `task` enters `buf`, `G2` will resume, read it, and copy it into `t`. So `G1` does not have to go through `buf` at all: `G1` can give the data to `G2` directly.
Goroutines normally each have their own stack and never access each other's stack data; channels are the one exception. Here, because we already know the address of `t` (through the `elem` pointer), and because `G2` is not running, we can safely write to it directly. When `G2` resumes, it does not need to acquire the lock again, and it does not need to touch `buf` at all. This saves a memory copy and the overhead of the lock operations.
Summary #
Other channel operations #
Unbuffered channels #
An unbuffered channel behaves like the direct send described above, as the runnable sketch after this list demonstrates:
- receiver blocks first → the sender writes the value directly onto the receiver's stack
- sender blocks first → the receiver reads the value directly from the sender's `sudog`
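A minimal runnable example of the rendezvous: with no buffer, a send cannot complete until a receiver is ready, and the value is handed over directly.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan string) // unbuffered: every send is a direct handoff
	done := make(chan struct{})

	go func() {
		time.Sleep(100 * time.Millisecond)
		fmt.Println("received:", <-ch) // the receiver arrives "late"
		close(done)
	}()

	start := time.Now()
	ch <- "hello" // blocks until the receiver is ready (~100ms)
	fmt.Println("send completed after", time.Since(start))
	<-done
}
```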
select #
https://golang.org/src/runtime/select.go
- first, lock all the channels involved in the operation
- create a `sudog` for the current goroutine and add it to each channel's `sendq` or `recvq` (depending on whether the case sends or receives)
- unlock all the channels, then pause the goroutine that called `select` (`gopark()`)
- then, when any of the channels becomes available, the `select`ing goroutine is scheduled to run again
- resuming mirrors the pausing sequence
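A small runnable `select` example: the goroutine blocks in `select` until one of the channels becomes ready, then runs the corresponding case.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch1 := make(chan string)
	ch2 := make(chan string)

	go func() {
		time.Sleep(50 * time.Millisecond)
		ch2 <- "from ch2"
	}()

	// select parks this goroutine on both channels' wait queues
	// (conceptually: a sudog in each recvq) until one is ready.
	select {
	case msg := <-ch1:
		fmt.Println(msg)
	case msg := <-ch2:
		fmt.Println(msg) // this case wins: ch2 becomes ready first
	}
}
```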
Why is Go designed this way? #
Simplicity #
Go prefers a locked queue over a lock-free implementation.
"Performance gains do not come out of thin air; they come with increased complexity." - dvyukov
While a lock-free implementation might perform better, that advantage does not necessarily outweigh the drawbacks of the resulting code complexity.
Performance #
- calling into the Go runtime scheduler keeps OS threads from being blocked
- reading and writing directly across goroutine stacks:
  - lets a goroutine be woken without acquiring the lock again
  - avoids some memory copies

Of course, every advantage has its price. The price here is implementation complexity: the runtime needs more sophisticated memory management, garbage collection, and stack-shrinking machinery.
Here, the performance gains outweigh the cost of the added complexity.
So throughout the code that implements channels, we can see the results of this tradeoff between simplicity and performance.