- Original address: How to write High-performance code in Golang using Go-routines
- Original Author: Vignesh Sk
- From: Nuggets translation program
- Permanent link to this article: Github.com/xitu/gold-m ...
- Translator: Tmpbook
- Reviewer: altairlu
How to write high-performance code using Go-routines in Golang
To write fast code in Golang, you should watch Rob Pike's video on go-routines.
He is one of the authors of Golang. If you haven't seen the video, read on: this article is my personal take on its content. I feel the video is not quite complete; I guess Rob left out some ideas he didn't think were worth covering in the time he had. I, on the other hand, spent a lot of time writing a comprehensive article about go-routines. I don't cover every topic from the video, and I introduce some of my own projects that I use to solve common Golang problems.
To write a fast Golang program, there are three concepts you need to fully understand: go-routines, closures, and channels.
Go-routines
Let's assume your task is to move 100 boxes from one room to another. You can only move one box at a time, and each move takes one minute. So you will spend 100 minutes moving these 100 boxes.
To speed this up, you can either find a way to move each box faster (similar to finding a better algorithm for the problem) or hire someone to help you move boxes (similar to increasing the number of CPU cores used to execute the algorithm).
This article focuses on the second approach: writing go-routines and using more than one CPU core to speed up application execution.
By default, any block of code uses only one CPU core unless go-routines are declared in it. So a 70-line program that contains no go-routines will be executed by a single core. Just like the mover in our example, a core can only execute one instruction at a time. Therefore, if you want to speed up your application, you must take advantage of all CPU cores.
So, what is a go-routine, and how do you declare one in Golang?
Let's look at a simple program and introduce the go-routine.
Sample Program 1
Suppose moving a box is equivalent to printing one line to standard output. Our sample program has 10 print statements (we are moving only 10 boxes, without using a for loop).
package main

import "fmt"

func main() {
    fmt.Println("Box 1")
    fmt.Println("Box 2")
    fmt.Println("Box 3")
    fmt.Println("Box 4")
    fmt.Println("Box 5")
    fmt.Println("Box 6")
    fmt.Println("Box 7")
    fmt.Println("Box 8")
    fmt.Println("Box 9")
    fmt.Println("Box 10")
}
Because no go-routine is declared, the above code produces the following output.
Output
Box 1
Box 2
Box 3
Box 4
Box 5
Box 6
Box 7
Box 8
Box 9
Box 10
So, if we want to use an extra CPU core while moving the boxes, we need to declare a go-routine.
Sample Program 2, containing a go-routine
package main

import "fmt"

func main() {
    go func() {
        fmt.Println("Box 1")
        fmt.Println("Box 2")
        fmt.Println("Box 3")
    }()
    fmt.Println("Box 4")
    fmt.Println("Box 5")
    fmt.Println("Box 6")
    fmt.Println("Box 7")
    fmt.Println("Box 8")
    fmt.Println("Box 9")
    fmt.Println("Box 10")
}
Here, a go-routine is declared that contains the first three print statements. This means that the core running the main function executes only the statements for boxes 4-10, while a different core is assigned to execute the block of statements for boxes 1-3.
Output
Box 4
Box 5
Box 6
Box 1
Box 7
Box 8
Box 2
Box 9
Box 3
Box 10
Analyzing the output
In this code, two CPU cores run at the same time, each trying to perform its task, and both depend on standard output to do so (because we used print statements in this example).
Standard output (which runs on a core of its own) can only accept one task at a time. So what you see here is a random ordering that depends on which task, core 1's or core 2's, standard output decides to accept next.
How do I declare a go-routine?
To declare our own go-routine, we need to do three things.
- We create an anonymous function
- We call this anonymous function
- We use the "go" keyword to make the call
The first step uses the ordinary syntax for defining a function, but leaves out the function name (making it anonymous).
func() {
    fmt.Println("Box 1")
    fmt.Println("Box 2")
    fmt.Println("Box 3")
}
The second step is done by adding empty parentheses after the anonymous function. This is the same way a named function is called.
func() {
    fmt.Println("Box 1")
    fmt.Println("Box 2")
    fmt.Println("Box 3")
}()
The third step is done with the go keyword. The go keyword declares the function call as a block of code that can run independently. In this case, it allows the block to be executed by another idle core on the system.
Detail #1: What happens when there are more go-routines than cores?
A single core runs multiple go-routines concurrently by context switching between them, creating the illusion of multiple cores.
Try it yourself #1: Try removing the go keyword from sample program 2. What is the output?
Answer: The output of sample program 2 becomes identical to that of sample program 1.
Try it yourself #2: Increase the number of statements in the anonymous function from 3 to 8. Did the results change?
Answer: Yes. The main function is the parent go-routine (all the other go-routines are declared and created within it). So when the parent go-routine finishes executing, all other go-routines are killed, even if they are only halfway through, and the program returns.
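The second exercise shows that main does not wait for the go-routines it starts. Below is a minimal sketch of one way to make it wait. It assumes the standard library's sync.WaitGroup, which this article does not use (it relies on channels, introduced later), and moveBoxes is a hypothetical helper name chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// moveBoxes is a hypothetical helper: it "moves" boxes first..last in a
// separate go-routine and returns how many boxes were moved once that
// go-routine has finished.
func moveBoxes(first, last int) int {
	var wg sync.WaitGroup // assumption: WaitGroup instead of the article's channels
	moved := 0

	wg.Add(1) // one go-routine to wait for
	go func() {
		defer wg.Done() // signal completion when the function returns
		for i := first; i <= last; i++ {
			fmt.Println("Box", i)
			moved++
		}
	}()

	wg.Wait() // block until the go-routine has called Done
	return moved
}

func main() {
	fmt.Println("moved:", moveBoxes(1, 8)) // last line printed: "moved: 8"
}
```

Because the caller blocks in wg.Wait(), all eight boxes are printed before main returns, however many statements the go-routine contains.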
We now know what go-routines are. Next, let's look at closures.
If you have not previously learned about closures in Python or JavaScript, you can learn about them now in Golang. Those who already know them can skip this section to save time, because closures in Golang work the same way as in Python or JavaScript.
Before we dive into closures, let's look at languages that do not support the closure property, such as C, C++ and Java. In these languages:
- A function can access only two kinds of variables: global variables and local variables (variables inside the function).
- No function can access variables declared in another function.
- Once a function finishes executing, all the variables declared in it disappear.
None of this holds for languages that support the closure property, such as Golang, Python or JavaScript, because these languages offer the following flexibilities.
- Functions can be declared within a function.
- A function can return a function.
Inference #1: Because functions can be declared inside functions, chains of nested functions, each declared within another, are a common by-product of this flexibility.
To understand why these two flexibilities completely change the way we work, let's look at what closures are.
So what is a closure?
In addition to its own local variables and global variables, a function can access all local variables of the function in which it is declared, as long as they are declared before it (including all parameters passed to the enclosing function at run time). This is called a closure. With nesting, a function can access the variables of all enclosing functions, regardless of the nesting depth.
To understand this better, let's consider a simple case with two functions, one containing the other.
package main

import "fmt"

var zero int = 0

func main() {
    var one int = 1
    child := func() {
        var two int = 3
        fmt.Println(zero)
        fmt.Println(one)
        fmt.Println(two)
        fmt.Println(three) // causes a compilation error
    }
    child()
    var three int = 2
}
There are two functions, main and child, where child is defined inside main. The child function can access:
- The zero variable: it is a global variable
- The one variable: through the closure property, since one belongs to main and is defined in main before child.
- The two variable: it is a local variable of child
Note: Although it is defined in the enclosing function main, child cannot access the three variable, because three is declared after the definition of child.
The same applies to nesting.
package main

import "fmt"

var global func()

func closure() {
    var A int = 1
    func() {
        var B int = 2
        func() {
            var C int = 3
            global = func() {
                fmt.Println(A, B, C)
                fmt.Println(D, E, F) // causes a compilation error
            }
            var D int = 4
        }()
        var E int = 5
    }()
    var F int = 6
}

func main() {
    closure()
    global()
}
Consider what happens when the innermost function is assigned to a global variable "global".
- It can access the A, B and C variables through the closure property.
- It cannot access the D, E and F variables because they were not defined before it.
Note: Even after the closure function has finished executing, its local variables are not destroyed. They can still be accessed through the function named "global".
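The two flexibilities (declaring a function inside a function, and returning a function) can be seen together in a small compilable sketch. The names newCounter and count are hypothetical and not part of the article's examples; the point is only that a captured local variable outlives the call that created it.

```go
package main

import "fmt"

// newCounter is a hypothetical example: it returns a function that
// closes over the local variable count.
func newCounter() func() int {
	count := 0 // local to newCounter, declared before the inner function
	return func() int {
		count++ // count lives on after newCounter has returned
		return count
	}
}

func main() {
	next := newCounter()
	fmt.Println(next(), next(), next()) // prints "1 2 3"

	other := newCounter()
	fmt.Println(other()) // prints "1": each closure has its own count
}
```

Each call to newCounter creates a fresh count, so the two returned functions do not interfere with each other.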
Next, let's look at channels.
Channels are a resource for communication between go-routines. A channel can carry values of any type.
ch := make(chan string)
We define a channel of type string called ch. Only values of type string can be communicated through this channel.
ch <- "Hi"
This is how a message is sent to the channel.
msg := <- ch
This is how the message is received from the channel.
All channel operations (send and receive) are blocking by nature. This means that if a go-routine tries to send a message through a channel, it succeeds only when another go-routine is trying to receive a message from that channel. If no go-routine is waiting to receive on the channel, the sending go-routine stays blocked, trying to send its message until a receiver turns up.
The most important point here is that none of the statements following a channel operation are executed until the channel operation completes and the go-routine unblocks itself to run them. This helps synchronize go-routines running in different blocks of code.
Caveat: If there is only the sending go-routine and no other go-routine, a deadlock occurs; the Go program detects the deadlock and crashes.
Note: Everything mentioned above also applies to receiving go-routines.
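The send and receive statements above can be put together in one small program. This is a minimal sketch, and greet is a hypothetical helper name; it shows a sender and a receiver meeting on an unbuffered channel so that neither blocks forever.

```go
package main

import "fmt"

// greet is a hypothetical helper that sends one message on an unbuffered
// channel from a new go-routine and receives it in the caller.
func greet() string {
	ch := make(chan string) // unbuffered: send and receive must meet

	go func() {
		ch <- "Hi" // blocks until the caller below is ready to receive
	}()

	msg := <-ch // blocks until the go-routine above sends
	return msg
}

func main() {
	fmt.Println(greet()) // prints "Hi"
}
```

If the go statement were removed and the send were executed directly in greet, no receiver would ever arrive and the runtime would report the deadlock described above.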
Buffered channels
ch := make(chan string, 100)
Buffered channels are semi-blocking by nature.
For example, ch is a buffered string channel of size 100. This means that the first 100 messages sent to it are non-blocking. Any sends after that will block.
What makes this type of channel useful is that buffer space is freed again as messages are received from it. If 100 new go-routines suddenly appear and each consumes one message from the channel, the next 100 messages from the sender become non-blocking again.
So a buffered channel behaves either like a non-blocking channel or like an unbuffered one, depending on whether the buffer has free space at run time.
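The semi-blocking behavior can be observed without actually blocking by attempting each send inside a select with a default case. This is a sketch under assumptions: trySend is a hypothetical helper, and the buffer size is 3 rather than 100 to keep the example short.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a send and reports whether
// it went through without blocking.
func trySend(ch chan string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default: // chosen when the send would block
		return false
	}
}

func main() {
	ch := make(chan string, 3) // a small buffer instead of 100, for brevity

	fmt.Println(trySend(ch, "a"), trySend(ch, "b"), trySend(ch, "c")) // prints "true true true"
	fmt.Println(trySend(ch, "d"))                                     // prints "false": the buffer is full

	<-ch                          // receiving frees one slot
	fmt.Println(trySend(ch, "d")) // prints "true"
}
```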
Closing channels
close(ch)
This is how a channel is closed. Closing channels is very helpful in Golang for avoiding deadlocks. A receiving go-routine can detect whether the channel has been closed as follows.
msg, ok := <-ch
if !ok {
    fmt.Println("Channel closed")
}
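As a sketch of how closing lets a receiver stop cleanly instead of deadlocking, the hypothetical helper drain below keeps receiving until the ok value turns false. The message strings are made up for illustration.

```go
package main

import "fmt"

// drain is a hypothetical helper: it receives until the channel is closed
// and returns everything it got.
func drain(ch <-chan string) []string {
	var got []string
	for {
		msg, ok := <-ch
		if !ok { // the channel is closed and has no messages left
			return got
		}
		got = append(got, msg)
	}
}

func main() {
	ch := make(chan string)
	go func() {
		ch <- "Box 1"
		ch <- "Box 2"
		close(ch) // tell the receiver that nothing more is coming
	}()
	fmt.Println(drain(ch)) // prints "[Box 1 Box 2]"
}
```

Without the close call, drain would block on its third receive with no sender left, and the program would crash with a deadlock.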
Using Golang to write fast code
We have now covered go-routines, closures and channels. Assuming the algorithm for moving a box is already efficient, we can start using Golang to develop a general solution to the problem, focusing only on hiring the right number of people for the task.
Let's take a closer look at our problem and redefine it.
We have 100 boxes to move from one room to another. One point to note is that the work involved in moving box 1 is no different from moving box 2. So we can define a single function for moving a box, with the variable "i" representing the box being moved. We call this function "task", and the number of boxes "n". Any "Fundamentals of Computer Programming 101" course will teach you how to solve this problem: write a for loop that calls "task" n times. That keeps the whole computation on a single core, while the number of cores available in the system is a hardware matter that depends on the system's brand, model and design. So, as software developers, we abstract the hardware away from our problem and talk about go-routines rather than cores. The more cores there are, the more go-routines can be supported; we assume that "r" is the number of go-routines supported by our "x"-core system.
FYI: "x" cores can handle more than "x" go-routines. The number of go-routines a single core can support (r/x) depends on the kind of work the go-routines do and on the platform the runtime is on. For example, if all go-routines involve only blocking calls, such as network I/O or disk I/O, a single core is sufficient to handle them. This is because each go-routine spends more time waiting than computing, so a single core can handle the context switches between all of them.
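The claim about blocking go-routines on a single core can be tried out with a rough sketch. It rests on several assumptions: time.Sleep stands in for network or disk I/O, runtime.GOMAXPROCS(1) is used to restrict Go code to one core, sync.WaitGroup does the waiting, and waitOnOneCore is a hypothetical helper name. The measured time will vary from machine to machine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// waitOnOneCore is a hypothetical helper: it starts n go-routines that each
// only wait for ms milliseconds (a stand-in for blocking I/O) while Go code
// is limited to one core, and returns the elapsed wall-clock milliseconds.
func waitOnOneCore(n, ms int) int64 {
	prev := runtime.GOMAXPROCS(1)  // restrict Go code to a single core
	defer runtime.GOMAXPROCS(prev) // restore the previous setting

	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(time.Duration(ms) * time.Millisecond) // waiting, not computing
		}()
	}
	wg.Wait()
	return time.Since(start).Milliseconds()
}

func main() {
	// Run one after another, 100 waits of 20 ms would add up to about 2000 ms.
	fmt.Println("elapsed ms:", waitOnOneCore(100, 20))
}
```

On a typical machine the elapsed time should be close to a single wait rather than the sum of all of them, because the one core only has to switch between go-routines that are mostly idle.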
So the general definition of our problem is:
Assign "n" tasks to "r" go-routines, where all tasks are identical.
If n≤r, we can solve it in the following way.
package main

import "fmt"

var N int = 100

func Task(i int) {
    fmt.Println("Box", i)
}

func main() {
    ack := make(chan bool, N) // Acknowledgement channel
    for i := 0; i < N; i++ {
        go func(arg int) { // Point #1
            Task(arg)
            ack <- true // Point #2
        }(i) // Point #3
    }
    for i := 0; i < N; i++ {
        <-ack // Point #2
    }
}
Let's explain what we have done ...
- We create a go-routine for each task. Our system can support "r" go-routines at the same time, so as long as n≤r this is safe.
- We make sure the main function returns only after waiting for all go-routines to complete. Each go-routine signals its completion on the acknowledgement channel ("ack"), which it accesses through the closure property, and main waits for all of those signals.
- We pass the loop counter "i" to the go-routine as the parameter "arg", instead of referencing it directly in the go-routine through the closure property.
On the other hand, if n>r, this solution becomes problematic. It creates more go-routines than the system can handle. Every core tries to run more go-routines than its capacity and ends up spending more time on context switching than on running the program (commonly known as thrashing). The context-switching overhead becomes more pronounced as the difference between n and r grows. Therefore, always limit the number of go-routines to r, and assign the n tasks to those r go-routines.
Below we introduce the Workers function.
var R int = 100

func Workers(task func(int)) chan int { // Point #4
    input := make(chan int) // Point #1
    for i := 0; i < R; i++ { // Point #1
        go func() {
            for {
                v, ok := <-input // Point #2
                if ok {
                    task(v) // Point #4
                } else {
                    return // Point #2
                }
            }
        }()
    }
    return input // Point #3
}
- Create a pool of "r" go-routines, no more and no less, all listening on the "input" channel, which they reference through the closure property.
- Create go-routines that determine whether the channel is closed by checking the ok value on each loop iteration, and that kill themselves if it is.
- Return the input channel so that the caller can assign tasks to the pool.
- Take a "task" parameter so that the caller can define the body of the go-routines.
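To check that a pool like this really caps the number of tasks running at once, the sketch below pushes n tasks through r go-routines and records the peak number of simultaneously running tasks. It is an illustration under assumptions, not the article's implementation: peakConcurrency is a hypothetical helper, it uses sync.WaitGroup and sync/atomic for bookkeeping, and it reads the channel with "for range", which ends when the channel is closed, instead of checking ok by hand.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// peakConcurrency is a hypothetical helper: it pushes n tasks through a
// pool of r go-routines and returns the largest number of tasks that were
// ever running at the same moment.
func peakConcurrency(n, r int) int64 {
	var running, peak int64
	var wg sync.WaitGroup
	input := make(chan int)

	for i := 0; i < r; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range input { // the loop ends when input is closed
				cur := atomic.AddInt64(&running, 1)
				for { // raise peak to cur if cur is larger
					old := atomic.LoadInt64(&peak)
					if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
						break
					}
				}
				atomic.AddInt64(&running, -1)
			}
		}()
	}

	for i := 0; i < n; i++ {
		input <- i // blocks until one of the r workers is free
	}
	close(input)
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak <= 4:", peakConcurrency(1000, 4) <= 4) // prints "peak <= 4: true"
}
```

However large n gets, only r go-routines ever exist, so the peak can never exceed r.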
Usage
func main() {
    ack := make(chan bool, N)
    workers := Workers(func(a int) { // Point #2
        Task(a)
        ack <- true // Point #1
    })
    for i := 0; i < N; i++ {
        workers <- i
    }
    for i := 0; i < N; i++ { // Point #3
        <-ack
    }
}
By adding a statement (point #1) to the function passed to Workers (point #2), we use the closure property to cleverly slip a send on the acknowledgement channel into the task definition, and the loop (point #3) gives the main function a way of knowing whether all go-routines in the pool have completed their tasks. However, all go-routine-related logic should be contained in the Workers function itself, since that is where the go-routines are created. The main function should not need to know the details of how the Workers function works internally.
Therefore, to achieve complete abstraction, we introduce a "climax" function that runs only after all go-routines in the pool have completed. This is done by setting up another go-routine that solely checks the state of the pool. In addition, different problems need channels of different types, and the same int channel cannot be used in all cases. So, to write a more general Workers function, we redefine it using the empty interface type.
package main

import "fmt"

var N int = 100
var R int = 100

func Task(i int) {
    fmt.Println("box", i)
}

func Workers(task func(interface{}), climax func()) chan interface{} {
    input := make(chan interface{})
    ack := make(chan bool)
    for i := 0; i < R; i++ {
        go func() {
            for {
                v, ok := <-input
                if ok {
                    task(v)
                } else {
                    // Acknowledge on exit rather than per task, so the R
                    // receives below match even when N differs from R.
                    ack <- true
                    return
                }
            }
        }()
    }
    go func() {
        for i := 0; i < R; i++ {
            <-ack
        }
        climax()
    }()
    return input
}

func main() {
    exit := make(chan bool)
    workers := Workers(func(a interface{}) {
        Task(a.(int))
    }, func() {
        exit <- true
    })
    for i := 0; i < N; i++ {
        workers <- i
    }
    close(workers)
    <-exit
}
As you can see, I've tried to demonstrate the power of Golang, and we have examined how to write high-performance code in it.
Watch Rob Pike's go-routines video and have a great time with Golang.
Until next time ...
Thanks Prateek Nischal.