## What Is a Graceful Restart?

The ability to deploy a new version of an application in place, or to change its configuration, without downtime has become standard for modern software systems. This article discusses the different ways to gracefully restart an application and provides a self-contained example for digging into the implementation details. If you are unfamiliar with Teleport, it is the [SSH and Kubernetes privileged access management solution](https://gravitational.com/teleport/) that we built in Go for resilient architectures. Developers and Site Reliability Engineers (SREs) who build and maintain services in Go should find this article interesting.

## SO_REUSEPORT vs. Duplicating Sockets: Background

As part of our work on making Teleport highly available, we recently spent some time studying how to gracefully restart Teleport's TLS and SSH listeners [(GitHub issue #1679)](https://github.com/gravitational/teleport/pull/1679). Our goal was to be able to upgrade the Teleport binary without taking the instance out of service.

Marek Majkowski discusses two common methods in his blog post [Why does one NGINX worker take all the load?](https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/). They can be summarized as follows:

* You can set `SO_REUSEPORT` on the socket so that multiple processes can bind to the same port. With this approach, you have multiple accept queues, each feeding one process.
* You can duplicate the socket, pass it to a child process as a file, and recreate the socket in the new process. With this approach, you have a single accept queue feeding multiple processes.

In our initial discussions, a few concerns came up about `SO_REUSEPORT`. One of our engineers had used this method before and noticed that, because of its multiple accept queues, pending TCP connections were sometimes dropped. In addition, at the time of those discussions, Go did not have good support for setting `SO_REUSEPORT` on a `net.Listener`. However, there has been progress on this issue in the last few days, and it looks like [Go will soon support setting socket options](https://github.com/golang/go/issues/9661).

The second approach is also appealing because of its simplicity and because most developers are familiar with the traditional Unix fork/exec model, whose convention is to pass all open files to the child process. It is important to note that the `os/exec` package does not actually endorse this usage. Primarily for security reasons, by default it passes only `stdin`, `stdout`, and `stderr` to the child process. However, the `os` package does provide lower-level primitives that can be used to pass files to a child process, which is what we wanted to do.

## Using Signals to Switch the Socket's Process Owner

Before we look at the source code, it is worth understanding the details of how this method works.

When a new Teleport process starts, it creates a listening socket bound to a port to accept all inbound traffic. For Teleport, the inbound traffic is TLS and SSH traffic. We added a handler for the [SIGUSR2](https://www.gnu.org/software/libc/manual/html_node/Kill-Example.html) signal, which makes Teleport duplicate the listener socket and then spawn a new process, passing it the listening socket as a file and the socket's metadata as an environment variable. Once the new process starts, it rebuilds the socket from the file and metadata it was given and begins processing the traffic it receives.

Note that once the socket has been duplicated, inbound traffic is load balanced across the two sockets in round-robin fashion. This means that for a period of time, two Teleport processes accept new connections, as the first diagram illustrates.
![](https://raw.githubusercontent.com/studygolang/gctt-images/master/gracefully-restart-a-go-program-without-downtime/graceful-restart-diag-1.png)

Shutting down the parent process is the same thing in reverse. Once a Teleport process receives the SIGQUIT signal, it begins shutting down: it stops accepting new connections and waits for all existing connections to close or for a timeout to occur. Once inbound traffic has drained, the dying process closes its listener socket and exits. At that point, the new process takes over all requests sent by the kernel.

![](https://raw.githubusercontent.com/studygolang/gctt-images/master/gracefully-restart-a-go-program-without-downtime/graceful-restart-diag-2.png)

## Graceful Restart Walkthrough

We wrote a simple program based on the method above that you can try yourself. The source code is at the end of the article, and you can follow the steps below to try the example.

First, compile and start the program.

```
$ go build restart.go
$ ./restart &
[1] 95147
$ Created listener file descriptor for :8080.

$ curl http://localhost:8080/hello
Hello from 95147!
```

Next, send the USR2 signal to the initial process. Now, when you hit the HTTP endpoint, it returns the PIDs of two different processes.

```
$ kill -SIGUSR2 95147
user defined signal 2 signal received.
Forked child 95170.
$ Imported listener file descriptor for :8080.

$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95147!
```

After you kill the initial process, you only get responses from the new process.

```
$ kill -SIGTERM 95147
signal: killed
[1]+  Exit 1                  go run restart.go
$ curl http://localhost:8080/hello
Hello from 95170!
$ curl http://localhost:8080/hello
Hello from 95170!
```

Finally, kill the new process, and connections will be refused.

```
$ kill -SIGTERM 95170
$ curl http://localhost:8080/hello
curl: (7) Failed to connect to localhost port 8080: Connection refused
```

## Summary and Sample Source Code

As you can see, once you understand how it works, adding graceful restart functionality to a service written in Go is fairly straightforward, and it noticeably improves the experience of the service's consumers.

If you would like to see this in action in Teleport, we invite you to look at our reference [AWS SSH and Kubernetes bastion deployment](https://github.com/gravitational/teleport/tree/master/examples/aws), which includes an Ansible script that uses in-place graceful restarts to update the Teleport binary without downtime.

[Golang graceful restart example source code](https://gist.github.com/russjones/09e7ace4c7497515f6bd0285f710c2e4)

```go
package main

import (
	"context"
	"encoding/json"
	"flag"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"path/filepath"
	"syscall"
	"time"
)

// listener is the metadata passed to the child process through the
// LISTENER environment variable.
type listener struct {
	Addr     string `json:"addr"`
	FD       int    `json:"fd"`
	Filename string `json:"filename"`
}

func importListener(addr string) (net.Listener, error) {
	// Extract the encoded listener metadata from the environment variable.
	listenerEnv := os.Getenv("LISTENER")
	if listenerEnv == "" {
		return nil, fmt.Errorf("unable to find LISTENER environment variable")
	}

	// Decode the listener metadata.
	var l listener
	err := json.Unmarshal([]byte(listenerEnv), &l)
	if err != nil {
		return nil, err
	}
	if l.Addr != addr {
		return nil, fmt.Errorf("unable to find listener for %v", addr)
	}

	// The file has already been passed to this process. Use the file
	// descriptor and name from the metadata to rebuild the *os.File.
	listenerFile := os.NewFile(uintptr(l.FD), l.Filename)
	if listenerFile == nil {
		return nil, fmt.Errorf("unable to create listener file: %v", l.Filename)
	}
	defer listenerFile.Close()

	// Create a net.Listener from the *os.File.
	ln, err := net.FileListener(listenerFile)
	if err != nil {
		return nil, err
	}

	return ln, nil
}

func createListener(addr string) (net.Listener, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}

	return ln, nil
}

func createOrImportListener(addr string) (net.Listener, error) {
	// Try to import a listener for addr. If the import succeeds, use it.
	ln, err := importListener(addr)
	if err == nil {
		fmt.Printf("Imported listener file descriptor for %v.\n", addr)
		return ln, nil
	}

	// No listener was imported, so this process has to create one itself.
	ln, err = createListener(addr)
	if err != nil {
		return nil, err
	}

	fmt.Printf("Created listener file descriptor for %v.\n", addr)
	return ln, nil
}

func getListenerFile(ln net.Listener) (*os.File, error) {
	switch t := ln.(type) {
	case *net.TCPListener:
		return t.File()
	case *net.UnixListener:
		return t.File()
	}
	return nil, fmt.Errorf("unsupported listener: %T", ln)
}

func forkChild(addr string, ln net.Listener) (*os.Process, error) {
	// Get the file descriptor for the listener and marshal the metadata
	// that is passed to the child in the environment.
	lnFile, err := getListenerFile(ln)
	if err != nil {
		return nil, err
	}
	defer lnFile.Close()
	l := listener{
		Addr:     addr,
		FD:       3,
		Filename: lnFile.Name(),
	}
	listenerEnv, err := json.Marshal(l)
	if err != nil {
		return nil, err
	}

	// Pass stdin, stdout, stderr, and the listener to the child.
	// Note: these four files become descriptors 0, 1, 2, and 3.
	files := []*os.File{
		os.Stdin,
		os.Stdout,
		os.Stderr,
		lnFile,
	}

	// Get the current environment and add the listener metadata to it.
	environment := append(os.Environ(), "LISTENER="+string(listenerEnv))

	// Get the current process name and working directory.
	execName, err := os.Executable()
	if err != nil {
		return nil, err
	}
	execDir := filepath.Dir(execName)

	// Forward the command line flags so the child listens on the same address.
	args := append([]string{execName}, os.Args[1:]...)

	// Spawn the child process.
	p, err := os.StartProcess(execName, args, &os.ProcAttr{
		Dir:   execDir,
		Env:   environment,
		Files: files,
		Sys:   &syscall.SysProcAttr{},
	})
	if err != nil {
		return nil, err
	}

	return p, nil
}

func waitForSignals(addr string, ln net.Listener, server *http.Server) error {
	signalCh := make(chan os.Signal, 1024)
	signal.Notify(signalCh, syscall.SIGHUP, syscall.SIGUSR2, syscall.SIGINT, syscall.SIGQUIT)
	for {
		select {
		case s := <-signalCh:
			fmt.Printf("%v signal received.\n", s)
			switch s {
			case syscall.SIGHUP:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}
				fmt.Printf("Forked child %v.\n", p.Pid)

				// Create a context that expires in 5 seconds and use it
				// as the timeout for the shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors that occur during shutdown.
				return server.Shutdown(ctx)
			case syscall.SIGUSR2:
				// Fork a child process.
				p, err := forkChild(addr, ln)
				if err != nil {
					fmt.Printf("Unable to fork child: %v.\n", err)
					continue
				}

				// Print the PID of the forked process and keep waiting
				// for more signals.
				fmt.Printf("Forked child %v.\n", p.Pid)
			case syscall.SIGINT, syscall.SIGQUIT:
				// Create a context that expires in 5 seconds and use it
				// as the timeout for the shutdown.
				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
				defer cancel()

				// Return any errors that occur during shutdown.
				return server.Shutdown(ctx)
			}
		}
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello from %v!\n", os.Getpid())
}

func startServer(addr string, ln net.Listener) *http.Server {
	http.HandleFunc("/hello", handler)

	httpServer := &http.Server{
		Addr: addr,
	}
	go httpServer.Serve(ln)

	return httpServer
}

func main() {
	// Parse command line flags for the address to listen on.
	var addr string
	flag.StringVar(&addr, "addr", ":8080", "Address to listen on.")
	flag.Parse()

	// Create (or import) a net.Listener and start a goroutine that runs
	// an HTTP server on that net.Listener.
	ln, err := createOrImportListener(addr)
	if err != nil {
		fmt.Printf("Unable to create or import a listener: %v.\n", err)
		os.Exit(1)
	}
	server := startServer(addr, ln)

	// Wait for signals to either fork or quit.
	err = waitForSignals(addr, ln, server)
	if err != nil {
		fmt.Printf("Exiting: %v\n", err)
		return
	}
	fmt.Printf("Exiting.\n")
}
```

## If You Have Read This Far

Teleport is open source software that you can get for free on [GitHub](https://github.com/gravitational/teleport), where you can also learn more about it. If you are interested in working on Teleport or on similar distributed systems software, we are always looking for [great software engineers](https://gravitational.com/careers/systems-engineer/).
via: https://gravitational.com/blog/golang-ssh-bastion-graceful-restarts/

Author: Russell Jones, Translator: Magichan, Proofreader: polaris1119
This article was originally translated by GCTT and published by the Go Language Chinese Network.
Translations are published solely for learning and exchange, under the terms of the CC-BY-NC-SA license. If our work infringes on your rights, please contact us promptly.
Reposting is welcome under the CC-BY-NC-SA license; please credit and keep the links to the original and the translation, along with the author and translator information.