Deep dive into the basic implementation of the Go Language network library

Last Update:2014-11-16 Source: Internet

Author: User

Tags epoll


The arrival of Go showed me a language that does network programming "right". Of course, Go is not alone; many other languages also get this "right". I have always believed in doing the "right" thing rather than the "high-performance" thing. Too often, system design and technology selection are held hostage by the words "high performance". That is not to say performance is unimportant; you know what I mean.

Many high-performance network servers are written in C, for example Nginx, Redis, and memcached. They are all built on the "event-driven + event callback" model, with epoll or a similar mechanism as the core driver for sending and receiving network data. Many people (myself included) consider this model "anti-human": most of us are used to handling one thing linearly, finishing the first task before starting the second, and are not used to switching constantly among N tasks. To spare programmers this constant mental "context switching", Go introduces the goroutine, a user-space thread, in place of asynchronous event callbacks, and so returns to the linear, synchronous style of the multithreaded concurrency model.

The simplest echo server in Go looks like this:

    package main

    import (
        "log"
        "net"
    )

    func main() {
        // Create the listening socket.
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            log.Println(err)
            return
        }
        for {
            // Accept blocks only this goroutine, not the whole process.
            conn, err := ln.Accept()
            if err != nil {
                log.Println(err)
                continue
            }
            // One goroutine per connection.
            go echoFunc(conn)
        }
    }

    func echoFunc(c net.Conn) {
        defer c.Close() // release the connection when the handler exits
        buf := make([]byte, 1024)
        for {
            n, err := c.Read(buf)
            if err != nil {
                log.Println(err)
                return
            }
            c.Write(buf[:n])
        }
    }

The main function first creates a listening socket, then loops, accepting new connections from it, and hands each established connection to echoFunc. The key line is:

go echoFunc(conn)

For every new connection, a new "thread" (a goroutine) is created to serve it, so all business logic can be written synchronously and sequentially inside echoFunc, with no need to worry about whether network IO will block. However complex the business logic, a concurrent server in Go follows this programming model. To be clear, on Linux, servers written in Go still use epoll as the lowest-level driver for sending and receiving data, and Go's network layer still performs the same "context switching" underneath. The difference is that the runtime scheduler does the switching, which takes the burden off the programmer.
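To try this model without an external client, the following minimal sketch runs the same accept loop on a loopback port chosen by the OS and sends one message through it. The helpers startEcho and roundTrip are names made up for this illustration; they are not part of the original program.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
)

// echoFunc follows the handler from the article: read, then write back.
func echoFunc(c net.Conn) {
	defer c.Close()
	buf := make([]byte, 1024)
	for {
		n, err := c.Read(buf)
		if err != nil {
			return // io.EOF once the client closes the connection
		}
		if _, err := c.Write(buf[:n]); err != nil {
			return
		}
	}
}

// startEcho (hypothetical helper) listens on a loopback port picked by the
// OS and serves every accepted connection in its own goroutine.
func startEcho() (net.Listener, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // the listener was closed
			}
			go echoFunc(conn)
		}
	}()
	return ln, nil
}

// roundTrip (hypothetical helper) sends msg and reads the same number of
// bytes back. Every call here looks blocking to the caller.
func roundTrip(addr, msg string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(conn, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	ln, err := startEcho()
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	reply, err := roundTrip(ln.Addr().String(), "hello")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(reply)
}
```

Both the server and the client side are written as straight-line code; nothing in the program mentions callbacks or readiness events.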

To understand the underlying implementation of the network library, it seems enough to work out how the four functions used in the echo server (Listen, Accept, Read, and Write) are implemented underneath. This article takes a bottom-up approach, working from the lowest layer upward, which is also how I read the source code. The core source files involved are:
net/fd_unix.go
net/fd_poll_runtime.go
runtime/netpoll.goc
runtime/netpoll_epoll.c
runtime/proc.c (scheduler)

netpoll_epoll.c implements network IO multiplexing on Linux with epoll. It shows the epoll-related operations (adding an fd to epoll, removing an fd from epoll, and so on) and contains only four functions: runtime·netpollinit, runtime·netpollopen, runtime·netpollclose, and runtime·netpoll. The init function creates the epoll instance, the open function adds an fd to epoll, the close function removes an fd from epoll, and the netpoll function collects from epoll wait every fd on which an event has occurred and returns the goroutines (user-space threads) waiting on those fds as a linked list. Anyone who has written programs with epoll should find the code easy to follow; there is nothing unusual in it.

    void
    runtime·netpollinit(void)
    {
        epfd = runtime·epollcreate1(EPOLL_CLOEXEC);
        if(epfd >= 0)
            return;
        epfd = runtime·epollcreate(1024);
        if(epfd >= 0) {
            runtime·closeonexec(epfd);
            return;
        }
        runtime·printf("netpollinit: failed to create descriptor (%d)\n", -epfd);
        runtime·throw("netpollinit: failed to create descriptor");
    }

runtime·netpollinit first tries to create the epoll instance with runtime·epollcreate1 and, if that fails, falls back to runtime·epollcreate. These two functions are equivalent to glibc's epoll_create1 and epoll_create. Go does not use glibc directly; it wraps the system calls itself, but the behavior is the same. See the man pages for details on both calls.
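The same sequence can be tried from ordinary Go code on Linux through the standard syscall package. The sketch below is not the runtime's code: it repeats the create-with-fallback order of netpollinit, then registers the read end of a pipe and waits for it to become readable. The helper names newEpoll and readyFd are made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"syscall"
)

// newEpoll (hypothetical helper) follows the order used by netpollinit:
// try epoll_create1 with close-on-exec, then fall back to epoll_create
// and set close-on-exec explicitly.
func newEpoll() (int, error) {
	epfd, err := syscall.EpollCreate1(syscall.EPOLL_CLOEXEC)
	if err == nil {
		return epfd, nil
	}
	epfd, err = syscall.EpollCreate(1024)
	if err != nil {
		return -1, err
	}
	syscall.CloseOnExec(epfd)
	return epfd, nil
}

// readyFd (hypothetical helper) registers the read end of a pipe, writes
// one byte to the other end, and returns the registered descriptor
// together with the descriptor that epoll reports as ready.
func readyFd() (int, int, error) {
	epfd, err := newEpoll()
	if err != nil {
		return 0, 0, err
	}
	defer syscall.Close(epfd)

	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return 0, 0, err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])

	// Add the fd to epoll (compare netpollopen).
	ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(p[0])}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, p[0], &ev); err != nil {
		return 0, 0, err
	}
	if _, err := syscall.Write(p[1], []byte{'x'}); err != nil {
		return 0, 0, err
	}

	// Collect ready descriptors (compare netpoll).
	events := make([]syscall.EpollEvent, 8)
	for {
		n, err := syscall.EpollWait(epfd, events, 1000)
		if err == syscall.EINTR {
			continue // interrupted by a signal, try again
		}
		if err != nil {
			return 0, 0, err
		}
		if n != 1 {
			return 0, 0, errors.New("expected exactly one ready descriptor")
		}
		return p[0], int(events[0].Fd), nil
	}
}

func main() {
	registered, ready, err := readyFd()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("epoll reported the registered fd:", registered == ready)
}
```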

    int32
    runtime·netpollopen(uintptr fd, PollDesc *pd)
    {
        EpollEvent ev;
        int32 res;

        ev.events = EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET;
        ev.data = (uint64)pd;
        res = runtime·epollctl(epfd, EPOLL_CTL_ADD, (int32)fd, &ev);
        return -res;
    }

runtime·netpollopen, which adds an fd to epoll, shows that every fd is registered for both read and write events from the start, in edge-triggered mode. It also registers for a less common event, EPOLLRDHUP, which was added in newer kernel versions (Linux 2.6.17 and later) so that epoll can report when the peer closes the socket, something it could not otherwise observe directly. Note that because every fd is registered for EPOLLOUT when it is added, a write event is generated immediately, which may be wasteful.
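Both observations can be checked from user space on Linux. The sketch below registers one end of a Unix socket pair with the same event mask as netpollopen, then records the mask reported right after registration and the mask reported after the peer closes. The flag constants are local copies of the values in <sys/epoll.h>, and waitOne and observe are names made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"syscall"
)

// Local copies of the epoll flag values, declared here so the sketch does
// not depend on how a particular Go release types these constants.
const (
	epollIn    = 0x1
	epollOut   = 0x4
	epollRdHup = 0x2000
	epollET    = 1 << 31
)

// waitOne (hypothetical helper) waits up to one second for a single event
// and returns its mask.
func waitOne(epfd int) (uint32, error) {
	events := make([]syscall.EpollEvent, 1)
	for {
		n, err := syscall.EpollWait(epfd, events, 1000)
		if err == syscall.EINTR {
			continue
		}
		if err != nil {
			return 0, err
		}
		if n == 0 {
			return 0, errors.New("no event within the timeout")
		}
		return events[0].Events, nil
	}
}

// observe (hypothetical helper) returns the event mask seen right after
// registration and the mask seen after the peer end is closed.
func observe() (uint32, uint32, error) {
	epfd, err := syscall.EpollCreate1(syscall.EPOLL_CLOEXEC)
	if err != nil {
		return 0, 0, err
	}
	defer syscall.Close(epfd)

	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return 0, 0, err
	}
	defer syscall.Close(pair[0])

	ev := syscall.EpollEvent{
		Events: epollIn | epollOut | epollRdHup | epollET,
		Fd:     int32(pair[0]),
	}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, pair[0], &ev); err != nil {
		syscall.Close(pair[1])
		return 0, 0, err
	}

	// A fresh socket is writable, so an event is pending immediately.
	afterAdd, err := waitOne(epfd)
	if err != nil {
		syscall.Close(pair[1])
		return 0, 0, err
	}

	// Closing the peer produces a new edge on the registered end.
	if err := syscall.Close(pair[1]); err != nil {
		return 0, 0, err
	}
	afterPeerClose, err := waitOne(epfd)
	if err != nil {
		return 0, 0, err
	}
	return afterAdd, afterPeerClose, nil
}

func main() {
	afterAdd, afterPeerClose, err := observe()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("write event right after add:", afterAdd&epollOut != 0)
	fmt.Println("EPOLLRDHUP after peer close:", afterPeerClose&epollRdHup != 0)
}
```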

The epoll functions are called from the event-driven abstraction layer. Why is this layer needed? The reason is simple: Go has to run on different platforms, such as Linux, Unix, Mac OS X, and Windows, so an abstraction layer is needed to give the network library a consistent interface and hide the platform-specific event mechanisms. runtime/netpoll.goc implements the whole abstraction layer, and its core data structure is:

    struct PollDesc
    {
        PollDesc* link;     // in pollcache, protected by pollcache.Lock
        Lock;               // protects the following fields
        uintptr   fd;
        bool      closing;
        uintptr   seq;      // protects from stale timers and ready notifications
        G*        rg;       // G waiting for read or READY (binary semaphore)
        Timer     rt;       // read deadline timer (set if rt.fv != nil)
        int64     rd;       // read deadline
        G*        wg;       // the same for writes
        Timer     wt;
        int64     wd;
    };

Every fd added to epoll has a corresponding PollDesc instance, and PollDesc holds one very important piece of information: the goroutines that are reading and writing this fd. We can make a bold guess at how network reads and writes are implemented: when a read or write on an fd fails with EAGAIN, the current goroutine is stored in the fd's PollDesc and parked; when a read or write event later occurs on the fd, the goroutine is marked ready and resumes running. The actual implementation works roughly that way.

The main job of the event-driven abstraction layer is to wrap a concrete event mechanism (for example, epoll) behind a uniform interface that is exposed to Go code for the net library to use. The main operations are: create the event-driven instance, add an fd, remove an fd, wait for events, and set deadlines. runtime_pollServerInit creates the event-driven instance. runtime_pollOpen allocates a PollDesc instance, binds it to the fd, and adds the fd to epoll. runtime_pollClose removes the fd from epoll and frees the PollDesc bound to it. runtime_pollWait is the critical one: it is normally called when a non-blocking read or write returns EAGAIN, and its job is to park the goroutine that is currently reading or writing.
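The park-and-ready handshake can be modeled in a few lines of ordinary Go. This is a toy, not the runtime's implementation: a buffered channel stands in for the rg slot of PollDesc, blocking on that channel stands in for parking, and the names toyPoller, toyDesc, pollOpen, pollWait, pollClose, and notify are made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// toyDesc plays the role of PollDesc for one descriptor. The channel has
// room for one pending "ready" notification.
type toyDesc struct {
	ready chan struct{}
}

// toyPoller plays the role of the event-driven abstraction layer.
type toyPoller struct {
	mu    sync.Mutex
	descs map[int]*toyDesc
}

func newToyPoller() *toyPoller {
	return &toyPoller{descs: make(map[int]*toyDesc)}
}

// pollOpen binds a new descriptor record to fd (compare runtime_pollOpen).
func (p *toyPoller) pollOpen(fd int) *toyDesc {
	d := &toyDesc{ready: make(chan struct{}, 1)}
	p.mu.Lock()
	p.descs[fd] = d
	p.mu.Unlock()
	return d
}

// pollClose removes the binding (compare runtime_pollClose).
func (p *toyPoller) pollClose(fd int) {
	p.mu.Lock()
	delete(p.descs, fd)
	p.mu.Unlock()
}

// pollWait blocks the calling goroutine until the descriptor is marked
// ready (compare runtime_pollWait).
func (d *toyDesc) pollWait() {
	<-d.ready
}

// notify is what the epoll loop would do when an event arrives for fd.
func (p *toyPoller) notify(fd int) error {
	p.mu.Lock()
	d, ok := p.descs[fd]
	p.mu.Unlock()
	if !ok {
		return errors.New("fd is not registered")
	}
	select {
	case d.ready <- struct{}{}:
	default: // a notification is already pending
	}
	return nil
}

// demo returns the order in which the two sides ran.
func demo() []string {
	p := newToyPoller()
	d := p.pollOpen(3)
	defer p.pollClose(3)

	var trace []string
	done := make(chan struct{})
	go func() {
		d.pollWait() // "parked" until notify is called
		trace = append(trace, "reader resumed")
		close(done)
	}()

	trace = append(trace, "event arrived")
	if err := p.notify(3); err != nil {
		panic(err)
	}
	<-done
	return trace
}

func main() {
	fmt.Println(demo())
}
```

The reader cannot record its step until the notification has been sent, so the trace always lists the event first and the resumed reader second.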

Once inside the net library, the runtime's event-driven abstraction layer is wrapped one more time. Judging from the code, this wrapper mainly exists to make the layer convenient to use from pure Go code. It lives in net/fd_poll_runtime.go and is implemented mainly by the pollDesc type:

    type pollDesc struct {
        runtimeCtx uintptr
    }

Note: this pollDesc is not the runtime's PollDesc described above; rather, its runtimeCtx member holds the runtime's PollDesc instance. The purpose of pollDesc is to wrap the runtime's event-driven abstraction layer for use by the network fd object.

    var serverInit sync.Once

    func (pd *pollDesc) Init(fd *netFD) error {
        serverInit.Do(runtime_pollServerInit)
        ctx, errno := runtime_pollOpen(uintptr(fd.sysfd))
        if errno != 0 {
            return syscall.Errno(errno)
        }
        pd.runtimeCtx = ctx
        return nil
    }

The method of pollDesc that deserves the most attention is Init. It calls runtime_pollServerInit, the function that creates the epoll instance, through a sync.Once variable. This means runtime_pollServerInit is called only once in the lifetime of the process, so only one epoll instance is ever created. Once the epoll instance exists, runtime_pollOpen is called to add the fd to epoll.
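The once-only guarantee is easy to confirm in isolation. In the sketch below, pollServerInit and initFD are stand-ins made up for this illustration (they only count calls); the point is that sync.Once runs the setup function a single time even when many goroutines initialize descriptors concurrently.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	serverInit sync.Once
	initCalls  int // how many times the one-time setup actually ran
)

// pollServerInit stands in for runtime_pollServerInit.
func pollServerInit() {
	initCalls++
}

// initFD stands in for pollDesc.Init: every descriptor goes through the
// same Once before it is registered.
func initFD() {
	serverInit.Do(pollServerInit)
}

// countInits initializes n descriptors concurrently and reports how many
// times the setup function ran.
func countInits(n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			initFD()
		}()
	}
	wg.Wait()
	return initCalls
}

func main() {
	fmt.Println(countInits(100)) // prints 1
}
```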

In network programming, every socket fd is represented by a netFD object. netFD is the abstraction for network IO operations; its Linux implementation is in net/fd_unix.go. netFD has its own init method as well as Read and Write methods for basic IO, plus a number of other useful methods.

    // Network file descriptor.
    type netFD struct {
        // locking/lifetime of sysfd + serialize access to Read and Write methods
        fdmu fdMutex

        // immutable until Close
        sysfd       int
        family      int
        sotype      int
        isConnected bool
        net         string
        laddr       Addr
        raddr       Addr

        // wait server
        pd pollDesc
    }

The definition of netFD shows that every fd has a pollDesc instance associated with it, and as described above, pollDesc is ultimately a wrapper around epoll.

    func (fd *netFD) init() error {
        if err := fd.pd.Init(fd); err != nil {
            return err
        }
        return nil
    }

netFD's init function simply calls Init on its pollDesc instance, which adds the fd to epoll. If this is the first network socket fd, this Init call also creates the epoll instance. Keep in mind that a Go process has only one epoll instance, which manages all network socket fds and is created when the first network socket fd is created.

    for {
        n, err = syscall.Read(int(fd.sysfd), p)
        if err != nil {
            n = 0
            if err == syscall.EAGAIN {
                if err = fd.pd.WaitRead(); err == nil {
                    continue
                }
            }
        }
        err = chkReadErr(n, err, fd)
        break
    }

The snippet above is taken from netFD's Read method; the part to focus on is the error handling around syscall.Read in the for loop. When an error occurs, the code checks whether it is syscall.EAGAIN. If so, it calls WaitRead, which parks the goroutine currently reading this fd until a read event occurs on the fd. When new data arrives on the socket, WaitRead returns and the for loop continues. With this implementation, callers of netFD's Read program in a synchronous, "blocking" style rather than an asynchronous, non-blocking one. netFD's Write method works on the same principle: on EAGAIN, the current goroutine is parked until the socket becomes writable again.
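The retry-on-EAGAIN loop can be reproduced outside the net package with a non-blocking pipe on Linux. In this sketch the wait step is a plain epoll wait that blocks the calling thread, which is an assumption made for illustration; the real WaitRead parks only the goroutine. The names readRetry, waitReadable, and demo are made up here.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"syscall"
)

// readRetry follows the shape of the loop in netFD.Read: try the read and,
// on EAGAIN, wait for readiness and try again. It also counts how many
// times EAGAIN was seen.
func readRetry(fd int, p []byte, waitRead func() error) (int, int, error) {
	eagains := 0
	for {
		n, err := syscall.Read(fd, p)
		if err != nil {
			n = 0
			if err == syscall.EAGAIN {
				eagains++
				if err = waitRead(); err == nil {
					continue
				}
			}
		}
		return n, eagains, err
	}
}

// waitReadable blocks until epoll reports an event or one second passes.
func waitReadable(epfd int) error {
	events := make([]syscall.EpollEvent, 1)
	for {
		n, err := syscall.EpollWait(epfd, events, 1000)
		if err == syscall.EINTR {
			continue
		}
		if err != nil {
			return err
		}
		if n == 0 {
			return errors.New("timed out waiting for data")
		}
		return nil
	}
}

func demo() (string, int, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return "", 0, err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])
	if err := syscall.SetNonblock(p[0], true); err != nil {
		return "", 0, err
	}

	epfd, err := syscall.EpollCreate1(syscall.EPOLL_CLOEXEC)
	if err != nil {
		return "", 0, err
	}
	defer syscall.Close(epfd)
	ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(p[0])}
	if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, p[0], &ev); err != nil {
		return "", 0, err
	}

	// The writer sends data only after the reader has started waiting, so
	// the first read is guaranteed to find the pipe empty.
	parked := make(chan struct{}, 1)
	go func() {
		<-parked
		syscall.Write(p[1], []byte("late data"))
	}()

	buf := make([]byte, 64)
	n, eagains, err := readRetry(p[0], buf, func() error {
		select {
		case parked <- struct{}{}:
		default:
		}
		return waitReadable(epfd)
	})
	return string(buf[:n]), eagains, err
}

func main() {
	data, eagains, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %q after %d EAGAIN\n", data, eagains)
}
```

From the caller's point of view readRetry simply returns the data; the EAGAIN and the wait are hidden inside the loop, which is the effect the net package achieves for Read and Write.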

This article is only a rough guide to the underlying implementation of the network library: it shows roughly where the low-level code lives so that you can study the source in more depth. The key to Go's high concurrency with synchronous, blocking-style programming is really "goroutines and the scheduler". For network IO, the crucial scheduling point is EAGAIN. Once you understand this scheduling point, then even without a runtime scheduler you can schedule network IO on top of epoll with user-space threads such as coroutines, and still program in a synchronous, blocking style.

Finally, why program in a synchronous, blocking style at all? This is something you only feel deeply after reading and writing a lot of asynchronous, non-blocking code. Real sophistication is definitely not "I can do what others cannot, and I can write what others cannot."
