The final implementation code: https://github.com/esdb/drbuffer
This article covers the first step of the overall Kafka agent implementation: https://segmentfault.com/a/1190000004567774
Memory structure
The packet format for each write is as follows
---packet_size(uint16)---packet_body([]byte)---
Variable-length data is stored by prefixing each packet with its length. The goal is to keep such packets in a ring buffer, whose overall structure looks like this:
--- <-- nextReadFrom ---packet 1---packet 2--- <-- nextWriteFrom ...
Apart from each packet being variable-length, there is nothing special here: we store two offsets, one pointing to the next read position and one to the next write position.
Persistence is not actually difficult: just mmap the memory region from a file. As long as the operating system itself does not go down (say, a sudden power loss), the consistency and durability of the data are guaranteed. This level of persistence is sufficient for scenarios such as logging, monitoring, and offline analysis. With such a persistent ring buffer you can decouple fast writers from a less reliable backend, giving the business process some tolerance for backend failures. For money-related business events, however, the availability of a single machine and a single disk is not enough; that requires additional network replication.
Using mmap from Go is very convenient: the mapping is simply a []byte, and it can even be cast to other types via unsafe tricks. Mmap'd memory is not managed by Go's GC, so it adds no GC burden. Compared with the JVM, Go offers many more unsafe tricks (and ways to shoot yourself in the foot), and the experience of working with on-heap and off-heap memory is very consistent.
Difficulty one: wrapping around
A ring buffer is a ring precisely because writing wraps back to the head after reaching the tail. For a queue of fixed-size members this is easy: write the last slot, then wrap to 0. But for a queue of variable-length members, wrapping becomes quite tricky. Consider several cases:
---1---0---A---
The above shows 3 bytes forming one complete packet: the 2-byte size (value 1) followed by the body "A".
---1---0---A---free slot 1---
If only one free byte remains at the tail, the 2-byte header of the next packet would be split into two parts.
---1---0---A---free slot 1---free slot 2---
If two free bytes remain, the next packet's 2-byte header fits, but its body would have to wrap back to the head mid-packet. Because members are variable-length, we cannot arrange in advance for the tail to always line up cleanly. The solution is to make the wrap point dynamic: record a wrapAt offset near the tail, and have readers wrap wherever wrapAt indicates.
---1---0---A--- <-- wrapAt ---free slot 1---
Difficulty two: writing too fast overwrites the unread region
--- <-- nextWriteFrom ---packet 1--- <-- nextReadFrom ---packet 2---packet 3---
In the situation above, if the writer continues from nextWriteFrom it may overwrite the region nextReadFrom points to. If nextReadFrom is not moved, the next read will point into corrupted data, for example a misaligned packet, and fail. The crux of the matter is: how do we judge whether nextReadFrom is about to be overwritten? At first glance this is a very simple question. The write covers the interval
[writeFrom, writeTo)
If readFrom falls in this range, it is affected. Since writeTo points to a byte that is not actually written, it is natural to assume that readFrom == writeTo is unaffected. But once we allow that, a very awkward situation appears:
nextReadFrom == nextWriteFrom
In this state, are there more bytes to read, or is the queue already empty? If we treat it as empty, then a read pointer we chose not to move causes a large chunk of readable memory to be skipped. If we treat it as non-empty, where is the tail? The tail must be the wrapAt in front, which means maintaining the invariant that [nextReadFrom, wrapAt) contains valid data. That is possible, but very troublesome. The simplest implementation is to require that the entire closed interval [writeFrom, writeTo] contains no read pointer; then simply comparing the read and write pointers for equality tells you whether there is data left to read.
Difficulty three: How to move the pointer?
Once writing is fast enough to force the read pointer out of the way, moving that pointer correctly becomes a real problem.
--- <-- nextReadFrom ---1---0---A---1---0---B---
If nextReadFrom is positioned as shown above and needs to move forward by 4 bytes, it must not end up as:
---1---0---A---1--- <-- nextReadFrom ---0---B---
because that position lands in the middle of a packet and is not valid. nextReadFrom must always move in whole-packet steps. The simplest implementation: once nextReadFrom is about to be overwritten, reset it directly to 0, since 0 is always a valid position, and the amount of memory freed in one go is more than sufficient. The disadvantage is that a one-time reset to 0 throws away a lot of unread data.
Difficulty four: supporting re-reads
In a traditional ring buffer, reading and advancing the read pointer are a single operation: once read, the region is no longer retained. But often we need to read the data, process it, and only advance the position once processing is confirmed successful. Otherwise, the next re-read (the in-memory state is restored after a restart) resumes from the last confirmed place. How cheaply can such reliable reads be supported?
The simple implementation keeps two read pointers: lastReadTo (already committed) and nextReadFrom (read but not yet confirmed).
--- <-- lastReadTo ---1---0---A--- <-- nextReadFrom ---1---0---B---
On each read, the previous read position (nextReadFrom) is saved into lastReadTo. This introduces a potential problem: a write may overwrite either read pointer. On closer thought, only overwriting lastReadTo needs to be considered, because a write always reaches lastReadTo before it reaches nextReadFrom. So the overwrite check is done against lastReadTo; once it is overwritten, set both lastReadTo and nextReadFrom to 0, start over from the head, and lose the unread portion.