Earlier tips introduced the io package and protocol parsing. This time we look at the bufio package, which implements buffered IO and is among the packages most commonly used in real projects. Let's start from the packet-splitting code in the previous tip, pasted here again:
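// Requires imports: "encoding/binary", "io", "net".
func ReadPacket(conn net.Conn) ([]byte, error) {
	// Read the 2-byte big-endian length header.
	var head [2]byte
	if _, err := io.ReadFull(conn, head[:]); err != nil {
		return nil, err
	}
	// Read the packet body whose size the header announces.
	size := binary.BigEndian.Uint16(head[:])
	packet := make([]byte, size)
	if _, err := io.ReadFull(conn, packet); err != nil {
		return nil, err
	}
	return packet, nil
}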
This packet-splitting logic makes two io.ReadFull calls on conn. From the introduction to the io package in tip 1, we know that io.ReadFull is really a loop that calls conn.Read() internally. So although this code is short, the number of potential IO calls is fairly high: even in the best case, conn.Read() is called at least twice.
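The looping behaviour of io.ReadFull can be observed with a small sketch. It assumes a source that hands out one byte per Read (simulated here with iotest.OneByteReader from the standard library); countingReader and readHeadCalls are hypothetical helper names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// countingReader is a hypothetical helper that counts how many times
// Read is called on the wrapped reader.
type countingReader struct {
	r     io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.r.Read(p)
}

// readHeadCalls reads a 2-byte head with io.ReadFull from a source that
// delivers a single byte per Read, and reports how many Read calls it took.
func readHeadCalls() int {
	src := iotest.OneByteReader(strings.NewReader("\x00\x05hello"))
	cr := &countingReader{r: src}
	var head [2]byte
	if _, err := io.ReadFull(cr, head[:]); err != nil {
		panic(err)
	}
	return cr.calls
}

func main() {
	fmt.Println("Read calls for a 2-byte head:", readHeadCalls())
}
```

When the data trickles in like this, a single io.ReadFull costs one Read call per fragment, which is why two calls per packet is only the best case.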
What does an IO call cost? To answer that we have to look at Go's runtime implementation. Assuming we are using a TCP connection and starting from TCPConn.Read(), we can trace down to the netFD.Read() method in the file fd_unix.go.
This method calls syscall.Read() and pd.WaitRead() in a loop, and these bring two main costs.
The first is syscall.Read(): the overhead of the system call itself, which copies data between the application's buffer and the system's socket buffer.
The second is pd.WaitRead(). Go is built around the CSP model, and the key to letting one thread run many goroutines is that a goroutine waiting for IO gives up its execution thread and is woken again when the IO event arrives. Switching back and forth like this carries a certain overhead.
The header of our packet protocol is very small, so there is a good chance that the header together with part of the body, or even the whole packet, is already sitting in the socket buffer waiting to be read. This situation is well suited to using bufio.Reader to improve performance.
The basic principle of bufio.Reader is to use a pre-allocated block of memory as a buffer. When it performs a real IO call it tries to fill the buffer as far as possible, and when the caller reads data it is served from the buffer first. This reduces the number of real IO calls, which is where the optimization comes from.
A vivid analogy: you have a bucket that cannot be moved, a cup, and a faucet (an old-fashioned one with a small flow and a hand-turned valve). You have to fill the bucket without letting any tap water run to waste. So you hold the cup under the faucet, turn the faucet off as soon as the cup is full, pour the cup into the bucket, then go back and turn the faucet on again, repeating until the bucket is full. A lot of time is wasted turning the faucet on and off and waiting for the cup to fill.
Now suppose you put a second bucket under the faucet. The faucet no longer has to be turned off: each time, you scoop a cup of water from the bucket under the faucet, and only wait at the tap if that bucket is empty. This saves the time spent turning the faucet on and off and waiting for the cup to fill. That is exactly what bufio.Reader does.
Wrapping an io.Reader in a bufio.Reader takes a single line of code, so our code can be transformed into the following form:
// Requires imports: "bufio", "encoding/binary", "io", "net".
type PacketConn struct {
	net.Conn
	reader *bufio.Reader
}

func NewPacketConn(conn net.Conn) *PacketConn {
	return &PacketConn{conn, bufio.NewReader(conn)}
}

func (conn *PacketConn) ReadPacket() ([]byte, error) {
	// Read the 2-byte length header through the buffered reader.
	var head [2]byte
	if _, err := io.ReadFull(conn.reader, head[:]); err != nil {
		return nil, err
	}
	size := binary.BigEndian.Uint16(head[:])
	packet := make([]byte, size)
	if _, err := io.ReadFull(conn.reader, packet); err != nil {
		return nil, err
	}
	return packet, nil
}

// Read overrides the embedded net.Conn's Read so that every read
// goes through the same buffer.
func (conn *PacketConn) Read(p []byte) (int, error) {
	return conn.reader.Read(p)
}
The logic is the same, but because bufio.Reader is used, in the ideal case the first io.ReadFull call in ReadPacket also pulls the subsequent data into the buffer, so the second io.ReadFull call triggers no real IO call.
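The effect can be made visible by counting how many Read calls reach the underlying source. The sketch below is only an illustration under stated assumptions: an in-memory bytes.Reader stands in for the socket (so all data is always available, the ideal case described above), and countingReader, readPacket and countReads are hypothetical names that are not part of the code above.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// countingReader is a hypothetical helper that counts Read calls
// reaching the underlying source.
type countingReader struct {
	r     io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.r.Read(p)
}

// readPacket mirrors the packet logic: a 2-byte big-endian length
// followed by that many bytes of body.
func readPacket(r io.Reader) ([]byte, error) {
	var head [2]byte
	if _, err := io.ReadFull(r, head[:]); err != nil {
		return nil, err
	}
	packet := make([]byte, binary.BigEndian.Uint16(head[:]))
	if _, err := io.ReadFull(r, packet); err != nil {
		return nil, err
	}
	return packet, nil
}

// countReads reads two packets and returns the number of Read calls
// made on the underlying source, with or without a bufio.Reader.
func countReads(buffered bool) int {
	stream := []byte("\x00\x05hello\x00\x05world")
	cr := &countingReader{r: bytes.NewReader(stream)}
	var r io.Reader = cr
	if buffered {
		r = bufio.NewReader(cr)
	}
	for i := 0; i < 2; i++ {
		if _, err := readPacket(r); err != nil {
			panic(err)
		}
	}
	return cr.calls
}

func main() {
	fmt.Println("unbuffered:", countReads(false)) // 4 underlying reads
	fmt.Println("buffered:  ", countReads(true))  // 1 underlying read
}
```

On a real connection the saving depends on how much data has already arrived when the first read happens, so the numbers here show the best case rather than a guarantee.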
There is one detail to note here. Once an io.Reader has been wrapped in a bufio.Reader, every later read of its data must go through that same bufio.Reader. You cannot read sometimes from the bufio.Reader and sometimes from the original io.Reader, nor can you wrap one io.Reader in two bufio.Readers and read from each of them. Every read may leave part of the subsequent data cached in the buffer, so if the next read does not come from that buffer, the data it returns no longer means what you expect.
Our binary packet protocol is not the only thing that can use bufio.Reader to improve performance. Parsing text protocols is almost inseparable from bufio.Reader.
Let's take a simple text protocol as an example: each message is sent as one line of text, with the newline character '\n' marking the end of the line.
How would we read data line by line from an io.Reader without using bufio.Reader?
Obviously, we would need to write a loop (a simple, unoptimized sketch):
// Requires import: "io".
func ReadLine(reader io.Reader) (line []byte, err error) {
	p := make([]byte, 1)
	for {
		// Read a single byte; n may be 0 even when err is nil.
		n, err := reader.Read(p)
		if n == 1 {
			if p[0] == '\n' {
				return line, nil
			}
			line = append(line, p[0])
		}
		if err == io.EOF {
			return line, err
		}
		if err != nil {
			return nil, err
		}
	}
}
Calling Read one byte at a time is obviously very inefficient, so we clearly need a buffer that reads data in bulk, parses it, and holds on to whatever is left over.
This situation is common (HTTP, for example, is a line-oriented text protocol), so bufio.Reader comes with ReadLine and a series of other methods built in for parsing text protocols.
If bufio.Reader still cannot meet your more complex protocol-parsing needs, you can use the Scanner type that bufio provides to parse custom formats.
The bufio package also provides a Writer type for buffered writes. For example, when an HTTP application outputs an HTML page, it often writes the HTML text in several steps. If every write were a real IO call, efficiency would obviously suffer; writing to the buffer first and then sending everything to the client at once reduces the number of IO calls.
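The same counting trick used for reads works for writes. In this sketch, countingWriter and countWrites are hypothetical helpers, and a bytes.Buffer stands in for the client connection.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// countingWriter is a hypothetical helper that counts Write calls
// reaching the underlying destination.
type countingWriter struct {
	w     io.Writer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.w.Write(p)
}

// countWrites emits an HTML page in three pieces and returns how many
// Write calls reached the destination, plus the final output.
func countWrites(buffered bool) (int, string) {
	var out bytes.Buffer
	cw := &countingWriter{w: &out}
	parts := []string{"<html>", "<body>hi</body>", "</html>"}

	if buffered {
		bw := bufio.NewWriter(cw)
		for _, p := range parts {
			if _, err := bw.WriteString(p); err != nil {
				panic(err)
			}
		}
		// Nothing reaches cw until the buffer fills up or Flush is called.
		if err := bw.Flush(); err != nil {
			panic(err)
		}
	} else {
		for _, p := range parts {
			if _, err := io.WriteString(cw, p); err != nil {
				panic(err)
			}
		}
	}
	return cw.calls, out.String()
}

func main() {
	n, _ := countWrites(false)
	fmt.Println("unbuffered writes:", n) // 3
	n, page := countWrites(true)
	fmt.Println("buffered writes:  ", n, page) // 1
}
```

Forgetting the final Flush is the usual mistake with bufio.Writer: data still sitting in the buffer is never sent.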
This article is no substitute for the bufio package documentation. For everything else, please read the documentation of the bufio package.