Background
In a recent project, the backend needed to expose many HTTP APIs, and the technology constraints were fairly loose, so we chose Golang with the Beego framework. The main reason for choosing Golang was the nature of the modules: they have to absorb sudden bursts of concurrent requests, and each request goes through several processing steps that take a long time, so results cannot be returned synchronously. The goroutines and channels that Golang provides at the language level are exactly what this kind of scenario needs.
Goroutines are different from threads. A thread is the operating system's abstraction for an independently running instance, and its implementation differs across operating systems; the OS, however, knows nothing about goroutines, whose scheduling is managed entirely by the Golang runtime. Starting a thread requires fewer resources than starting a process, but a context switch between threads still involves a lot of work (registers, program counter, stack pointer, and so on). Golang has its own scheduler, and goroutines share much of their data, so switching between goroutines is much faster and starting a goroutine costs far fewer resources; it is perfectly normal for a single Golang program to run hundreds of goroutines at the same time.
A channel, or "pipeline", is a data structure used to pass data (calling it a message is probably more accurate): data can be pushed into a channel and taken back out of it. A channel by itself is nothing magical, but combined with goroutines it forms a simple and powerful request-processing model: N worker goroutines push intermediate or final results into a channel, while another M worker goroutines take data out of that channel and process it further. By composing this pattern, all kinds of complex business flows can be handled.
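As a rough, self-contained sketch of that model (the numbers and names here are only illustrative, not taken from the real project), three producer goroutines push results into a channel while two consumer goroutines drain it:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	results := make(chan int)

	// N (here 3) producer goroutines push intermediate results into the channel.
	var producers sync.WaitGroup
	for i := 0; i < 3; i++ {
		producers.Add(1)
		go func(id int) {
			defer producers.Done()
			results <- id * 10 // stand-in for some real computation
		}(i)
	}

	// Close the channel once every producer has finished, so consumers can exit.
	go func() {
		producers.Wait()
		close(results)
	}()

	// M (here 2) consumer goroutines take data out and process it further.
	var consumers sync.WaitGroup
	for i := 0; i < 2; i++ {
		consumers.Add(1)
		go func() {
			defer consumers.Done()
			for r := range results {
				fmt.Println("processed:", r)
			}
		}()
	}
	consumers.Wait()
}
```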
Models
In practice I have gradually arrived at several working models built on goroutines and channels, and this article introduces them one by one.
V0.1: the go keyword
Prefixing a function call with the `go` keyword lets that function run independently of the calling function: the caller immediately proceeds with the rest of its work instead of waiting for a time-consuming operation to finish. For example, suppose we are writing a service module that receives a request from the front end and then performs a fairly time-consuming task, like the following:
```go
func (m *SomeController) ProcessSomeTask() {
	var task models.Task
	if err := task.Parse(m.Ctx.Request); err != nil {
		m.Data["json"] = err
		m.ServeJSON()
		return
	}
	task.Process()
	m.ServeJSON()
}
```
If the Process function takes a long time, the request will block here. Sometimes the front end only needs to fire a request at the backend and does not require an immediate response. For that kind of requirement, simply putting the `go` keyword in front of the time-consuming call lets the request return to the front end right away, keeping the experience smooth:
```go
func (m *SomeController) ProcessSomeTask() {
	var task models.Task
	if err := task.Parse(m.Ctx.Request); err != nil {
		m.Data["json"] = err
		m.ServeJSON()
		return
	}
	go task.Process()
	m.ServeJSON()
}
```
However, this approach has quite a few limitations, for example:
- It only works when the front end does not need the result of the backend processing right away.
- Such requests must not be too frequent, because this approach does not control concurrency at all.
V0.2: Concurrency control
One drawback of the previous approach is that concurrency is not controlled: if many of these requests arrive, each one starts its own goroutine, and if every goroutine also needs additional system resources, the consumption becomes unbounded.
In this case, one solution is to send the requests into a channel and start a fixed number of goroutines that read from the channel and process its contents. For example, we can create a global channel:
```go
var taskChannel = make(chan models.Task)
```
Then start multiple worker goroutines:
```go
for i := 0; i < workerNum; i++ {
	go func() {
		for {
			select {
			case task := <-taskChannel:
				task.Process()
			}
		}
	}()
}
```
After the server receives a request, it pushes the task into the channel:
```go
func (m *SomeController) ProcessSomeTask() {
	var task models.Task
	if err := task.Parse(m.Ctx.Request); err != nil {
		m.Data["json"] = err
		m.ServeJSON()
		return
	}
	// go task.Process()
	taskChannel <- task
	m.ServeJSON()
}
```
In this way, the concurrency of this kind of work is capped by `workerNum`.
V0.3: Dealing with a full channel
However, the scheme above has a flaw: the channel is created without a length, so when all `workerNum` goroutines are busy processing requests, the next request that arrives will still block, and it will actually be slower than the unoptimized version (because it has to wait for one of the goroutines to become free). Therefore, the channel needs to be given a length when it is created:
```go
var taskChannel = make(chan models.Task, taskChannelLen)
```
If we set `taskChannelLen` large enough, up to `taskChannelLen` requests can be accepted at the same time without blocking. But this is still not a complete answer: what if more than `taskChannelLen` requests arrive at the same time? On the one hand, this should be treated as an architectural problem that can be solved by, for example, scaling the module out. On the other hand, the module itself should also consider how to degrade gracefully. In this situation we want the module to tell the caller promptly: "I have reached my limit and cannot handle this request." This requirement is easy to implement in Golang: if a channel send or receive is executed inside a `select` statement and would block, the `default` branch is executed immediately.
```go
select {
case taskChannel <- task:
	// do nothing
default:
	// warning!
	return fmt.Errorf("taskChannel is full!")
}
// ...
```
V0.4: Getting results back after sending to a channel
If the processing logic is complex, it usually cannot be finished inside a single goroutine: one goroutine sends its intermediate results on to other goroutines for further work, and the final result is produced only after several such hand-offs.
So after sending an intermediate result into a channel, we still need a way to get back the result of processing that request. The solution is to embed a channel instance in the request itself and have the goroutine that finishes the processing write the result back into that channel.
```go
type TaskResponse struct {
	// ...
}

type Task struct {
	TaskParameter SomeStruct
	ResChan       chan TaskResponse
}

// ...
task := Task{
	TaskParameter: xxx,
	ResChan:       make(chan TaskResponse),
}
taskChannel <- task
res := <-task.ResChan
// ...
```
(One might ask here: why not just run the whole complex task sequentially inside one goroutine? The reason is that different sub-tasks consume different system resources: some are CPU-bound and some are IO-bound, so we want to set a different concurrency level for each kind of sub-task, and that requires connecting them with separate channels and goroutines.)
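To make that point concrete, here is a minimal sketch of two stages with different concurrency. It reuses the Task and TaskResponse types from the snippet above, while the stage methods, channel names and worker counts are hypothetical placeholders rather than the project's actual code: a CPU-bound stage with a few workers hands tasks over, through an intermediate channel, to an IO-bound stage with many more workers.

```go
const (
	cpuWorkerNum = 4  // roughly the number of CPU cores
	ioWorkerNum  = 32 // IO-bound work tolerates far more concurrency
)

var (
	cpuTaskChannel = make(chan Task, 128)
	ioTaskChannel  = make(chan Task, 128)
)

func startWorkers() {
	// CPU-bound stage: few workers.
	for i := 0; i < cpuWorkerNum; i++ {
		go func() {
			for t := range cpuTaskChannel {
				t.ComputeSomething() // hypothetical CPU-bound sub-task
				ioTaskChannel <- t   // hand off to the next stage
			}
		}()
	}
	// IO-bound stage: many workers.
	for i := 0; i < ioWorkerNum; i++ {
		go func() {
			for t := range ioTaskChannel {
				t.SaveSomewhere()           // hypothetical IO-bound sub-task
				t.ResChan <- TaskResponse{} // report the final result
			}
		}()
	}
}
```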
V0.5: Waiting for a group of goroutines to return
Splitting a task across several goroutines and finally merging the results each of them produces is a fairly common flow. This calls for `sync.WaitGroup` to synchronize a group of goroutines. The general processing flow looks like this:
```go
var wg sync.WaitGroup
for i := 0; i < someLen; i++ {
	wg.Add(1)
	go func(t Task) {
		defer wg.Done()
		// deal with one piece of the work
	}(tasks[i])
}
wg.Wait()
// handle the rest of the work
```
V0.6: Timeout mechanism
No matter how complex and time-consuming a task is, a timeout must be set. On one hand, the business itself may impose a time limit (users must see the result within XX minutes); on the other hand, the module must not pour all its resources into tasks that never finish, which would prevent other requests from being processed normally. Therefore a timeout mechanism has to be added to the processing flow.
I usually implement the timeout by combining it with the "getting results back after sending to a channel" approach described above: wrap the wait on the result channel in a `select` and use `time.After()` to detect the timeout.
```go
task := Task{
	TaskParameter: xxx,
	ResChan:       make(chan TaskResponse),
}
select {
case res := <-task.ResChan:
	// ... use res
case <-time.After(processMaxTime):
	// processing timed out
}
```
V0.7: Broadcasting mechanism
Now that there is a timeout mechanism, we also need a way to tell the other goroutines about it so they can stop what they are doing and exit. Clearly a channel is still the way to communicate, and the first idea that comes to mind is to send a struct into some channel: for example, add a parameter of type `chan struct{}` to the goroutine that executes the task, and have it abandon the task when it receives a message on that channel. However, two problems have to be solved:
- How does a goroutine receive this message while it is in the middle of executing a task?
- How do we notify all of the goroutines?
For the first problem, a fairly elegant approach is to use another channel as the function's output and wrap both in a `select`, so that the goroutine can emit its results while also listening for the exit signal.
For the second problem, when the number of running goroutines is unknown, sending `done <- struct{}{}` once obviously cannot notify all of them. Here we rely on a somewhat tricky property of channels in Golang: when a channel is closed, every receive that is blocked on it returns immediately. The sample code is as follows:
```go
// Worker side
func doTask(done <-chan struct{}, tasks <-chan Task) chan Result {
	out := make(chan Result)
	go func() {
		// closing out lets the caller's range loop exit gracefully
		defer close(out)
		for t := range tasks {
			select {
			case out <- f(t):
			case <-done:
				return
			}
		}
	}()
	return out
}

// Caller side
func process(tasks <-chan Task, num int) {
	done := make(chan struct{})
	out := doTask(done, tasks)
	go func() {
		<-time.After(maxTime)
		// done <- struct{}{} // this would wake only one goroutine
		close(done) // notify every worker goroutine to exit
	}()
	// out gets closed either because the work finished or because of the
	// timeout, so this range loop will exit
	for res := range out {
		fmt.Println(res)
		// ...
	}
}
```
Reference
- http://blog.golang.org/pipelines
- https://gobyexample.com/non-blocking-channel-operations
--EOF--