Zero-downtime upgrades are almost standard for modern web services, and the principle behind them is not complicated ... blablabla ... (starting from file descriptors, 10,000 words omitted). It is now confirmed that Go can also upgrade a TCP service with zero downtime or, put more briefly, do a hot update.
Original: Zero downtime upgrades of TCP servers in Go
—————-Translation Divider Line —————-
Zero-downtime upgrades of TCP services in Go
A recent post on the golang-nuts mailing list mentioned that Nginx can be upgraded while it keeps running, without ever closing the socket it is listening on. The trick is to clear the close-on-exec flag on the listening socket, then fork and exec the new server (the upgraded binary) and tell it to use the inherited file descriptor instead of calling socket() and listen().
So I wanted to see whether I could do the same thing in Go, and what changes to the standard library it would take. I got it working with only minor changes, and below I explain how it was done.
The relevant code is here.
There are several interesting things in this program, and I will go through each of them. The first is that I used the "interface injection" pattern. It is an important pattern in Go, but I don't think it is widely recognized or well documented.
When I started thinking about the problem, I realized that one obstacle was finding a way to make http.(*Server).Serve stop calling Accept() when the old server shuts down gracefully. There is no hook for this; the only way out of the loop ("accept, start a goroutine to handle the connection, repeat") is for Accept() to return an error. If you think of Accept as a system call, you might conclude: "I can't get in there and inject an error." But Accept() here is not a system call: it is a method of the net.Listener interface. This means that if you create your own object that implements net.Listener, you can pass it to http.(*Server).Serve and do whatever you want inside Accept().
When I first read about embedding types in structs, I was confused and lost. My attempts produced all kinds of mixed-up pointers and many unexplained nil pointer errors. This time I re-read the material and it made sense. Type embedding is what you need when you want to inject a method of an interface. It lets you inherit the whole implementation of the underlying object and redefine only what you need. See StoppableListener in Upgradable.go. The net.Listener interface requires three methods: Accept, Close, and Addr. I defined only one of them, Accept(). So why does StoppableListener satisfy net.Listener? Because the other two methods are provided by the embedded type; only Accept() has an explicit definition. When writing Accept(), I had to say explicitly how to reach the underlying object in order to pass the call on to its Accept(). The trick is to understand that embedding a type creates a field in the struct named after that type. Thus, given sl of type StoppableListener, sl.Listener refers to the embedded net.Listener, and sl.Listener.Accept() calls the inner Accept().
Next, consider what to do once Serve() returns the "stopped" error. Exiting immediately with os.Exit(0) would be wrong, because there may still be goroutines serving HTTP clients. We need some way of knowing that all clients have finished. Using injection again, the net.Conn returned from Accept() can be wrapped so that the number of currently open connections is tracked. This technique of injecting a net.Conn has other interesting applications. For example, by intercepting Read() or Write() calls you can enforce a rate limit on a connection without any support from the protocol implementation. You could even do silly things like encryption without the protocol code knowing about it.
Once I was sure I could shut the server down gracefully, I needed a way to start the new server on the correct file descriptor. This is achieved by a fairly simple change to the net package. See the patch. Out of laziness, I implemented it only for TCPListener. In theory, zero-downtime upgrades are possible for services on other kinds of sockets, but my modification to the net package only applies to TCP. Rog Peppe pointed out to me that a net.FileListener can be created from an *os.File (which in turn can be produced with os.NewFile).
The last problem is that net always sets the close-on-exec flag on the socket file descriptors it opens. The flag therefore has to be cleared on the listening socket so that the descriptor remains usable in the new process. This requires a small addition to the syscall package (to turn the flag on or off).
I tested it manually (issuing GET requests from the command line in another window). I also tested it under load using http_load. A 20-second load test scored 3937 requests/sec. I then ran the test again while issuing "GET http://localhost:8000/upgrade" several times during the run, replacing the binary each time, and this time got 3880 requests/sec. This is cool!