The original article is at http://blog.golang.org/2011/03/gobs-of-data.html, on the official Go blog.
Gob is the Go package for serializing and deserializing data structures. Plenty of codec tools, packages, and libraries already exist, so why did Go need a new one? Is gob just a reinvented wheel? What does it do, and what are its advantages? The article answers these questions fairly thoroughly.
---------------- Translation follows ----------------
Gobs of data
To transmit a data structure across a network or to store it in a file, it must be encoded and then decoded again. There are many encodings available, of course: JSON, XML, Google's protocol buffers, and more. And now there is another, provided by Go's gob package.
Why define a new encoding? It is a lot of work, and redundant at that. Why not just use one of the existing formats? Well, for one thing, we do! Go has packages supporting all the encodings just mentioned (the protocol buffer package is in a separate repository, but it is one of the most frequently downloaded). And for many purposes, including communicating with tools and systems written in other languages, they are the right choice.
But for a Go-specific environment, such as communication between two servers written in Go, there is an opportunity to build something much easier to use and possibly more efficient.
Gobs work with the language in a way that an externally defined, language-independent encoding cannot. At the same time, there are lessons to be learned from the existing systems.
Goals
The gob package was designed with a number of goals in mind.
First, and most obviously, it had to be very easy to use. Because Go has reflection, there is no need for a separate interface definition language or "protocol compiler". The data structure itself is all the package needs to figure out how to encode and decode it. On the other hand, this approach means that gobs will never work as well with other languages, but that's OK: gobs are unashamedly Go-centric.
Efficiency is also important. Textual representations, exemplified by XML and JSON, are too slow to put at the center of an efficient communications network. A binary encoding is necessary.
Gob streams must be self-describing. Each gob stream, read from the beginning, contains enough information for the entire stream to be parsed by an agent that knows nothing in advance about its contents. This property means that you will always be able to decode a gob stream stored in a file, even long after you have forgotten what data it represents.
There were also some things to learn from our experience with Google protocol buffers.
Protocol buffer misfeatures
Protocol buffers had a major effect on the design of gobs, but they have three features that were deliberately avoided. (Leaving aside the property that protocol buffers aren't self-describing: if you don't know the data definition used to encode a protocol buffer, you may not be able to parse it.)
First, protocol buffers only work on the data type we call a struct in Go. You can't encode an integer or an array at the top level, only a struct with fields inside it. That seems a pointless restriction, at least in Go. If all you want to send is an array of integers, why should you have to put it into a struct first?
Second, a protocol buffer definition may specify that fields T.x and T.y are required to be present whenever a value of type T is encoded or decoded. Although such required fields may seem like a good idea, they are costly to implement because the codec must maintain a separate data structure while encoding and decoding in order to report when required fields are missing. They are also a maintenance problem. Over time, one may want to modify the data definition to remove a required field, but that may cause existing clients of the data to crash. It is better not to have them in the encoding at all. (Protocol buffers also have optional fields. But if we don't have required fields, all fields are optional and that's that. There will be more to say about optional fields a little later.)
The third protocol buffer misfeature is default values. If a protocol buffer omits the value for a "defaulted" field, the decoded structure behaves as if the field were set to that default. This idea works nicely when you have getter and setter methods to control access to the field, but it is harder to handle cleanly when the container is just a plain struct. Required fields are also tricky to implement: where does one define the default values, and what types do they have (is text UTF-8? uninterpreted bytes? how many bits in a float?). Despite their apparent simplicity, they brought a number of complications to the design and implementation of protocol buffers. We decided to leave them out of gobs and fall back to Go's trivial but effective defaulting rule: unless you set something otherwise, it has the "zero value" for its type, and a zero value doesn't need to be transmitted.
So gobs end up looking like a sort of generalized, simplified protocol buffer. How do they work?
Values
Encoded gob data isn't about types like int8 and uint16. Instead, somewhat like constants in Go, its integer values are abstract, sizeless numbers, either signed or unsigned. When you encode an int8, its value is transmitted as an unsized, variable-length integer. When you encode an int64, its value is also transmitted as an unsized, variable-length integer. (Signed and unsigned values are treated distinctly, but the same unsized-ness applies to unsigned values too.) If both have the value 7, the bits sent on the wire are identical. When the receiver decodes that value, it puts it into the receiver's variable, which may be of any integer type. Thus an encoder may send a 7 that came from an int8, and the receiver may store it in an int64. This is fine: the value is an integer, and as long as it fits, everything works. (If it doesn't fit, an error results.) This decoupling from the size of the variable gives the encoding some flexibility: we can widen the type of an integer variable as the software evolves and still be able to decode old data.
This flexibility also applies to pointers. Before transmission, all pointers are flattened. Values of type int8, *int8, **int8, ****int8, and so on are all transmitted as an integer value, which may then be stored in an int of any size, or a *int, or a ******int, and so on. Again, this allows for flexibility.
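Pointer flattening can be sketched the same way. The example below assumes the current encoding/gob package; flattenRoundTrip is a hypothetical helper that sends a **int8 and receives the value through a *int64.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// flattenRoundTrip encodes n through two levels of pointer and decodes
// the result into a single pointer to a wider integer type.
func flattenRoundTrip(n int8) (*int64, error) {
	p := &n
	pp := &p // **int8: only the integer it leads to is transmitted

	var network bytes.Buffer
	if err := gob.NewEncoder(&network).Encode(pp); err != nil {
		return nil, err
	}

	var out *int64 // starts nil; the decoder allocates the int64
	err := gob.NewDecoder(&network).Decode(&out)
	return out, err
}

func main() {
	out, err := flattenRoundTrip(7)
	if err != nil {
		log.Fatal("round trip error:", err)
	}
	fmt.Println("received:", *out)
}
```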
For the same reason, flexibility also appears when decoding structs: only the fields sent by the encoder are stored in the destination. Given the value

    type T struct{ X, Y, Z int } // Only exported fields are encoded and decoded.
    var t = T{X: 7, Y: 0, Z: 8}

the encoding of t sends only the 7 and the 8. Because it is zero, the value of Y isn't even sent; there is no need to send a zero value.
The receiver could instead decode the value into this structure:

    type U struct{ X, Y *int8 } // Note: pointers to int8s
    var u U

Decoding into u sets only X (to the address of an int8 variable holding 7); the Z field is ignored, since there is nowhere to put it. When decoding structs, fields are matched by name and compatible type, and only fields that exist on both sides are affected. This simple approach finesses the "optional field" problem: as the type T evolves by adding fields, out-of-date receivers will still function with the part of the type they recognize. Thus gobs provide the important result of optional fields, extensibility, without any additional mechanism or notation.
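The T and U types described above can be exercised directly. This is a minimal sketch assuming the current encoding/gob import path; decodeTIntoU is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// T is the sender's view of the data.
type T struct{ X, Y, Z int }

// U is the receiver's view: fewer fields, and pointers instead of values.
type U struct{ X, Y *int8 }

// decodeTIntoU encodes t and decodes the stream into a U.
func decodeTIntoU(t T) (U, error) {
	var network bytes.Buffer
	if err := gob.NewEncoder(&network).Encode(t); err != nil {
		return U{}, err
	}
	var u U
	err := gob.NewDecoder(&network).Decode(&u)
	return u, err
}

func main() {
	u, err := decodeTIntoU(T{X: 7, Y: 0, Z: 8})
	if err != nil {
		log.Fatal("round trip error:", err)
	}
	fmt.Println("X:", *u.X)
	// Y was zero, so it was never sent and the pointer is left nil.
	fmt.Println("Y is nil:", u.Y == nil)
}
```

Z is silently dropped on the receiving side, which is the behavior that lets an older receiver keep working after the sender's type gains fields.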
From integers we can build all the other types: bytes, strings, arrays, slices, maps, even floats. A floating-point value is represented by its IEEE 754 bit pattern, stored as an integer, which works fine as long as you know its type, and we always do. Incidentally, that integer is sent in byte-reversed order, because common floating-point values, such as small integers, have many zeros at the low end that we can then avoid transmitting.
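The effect of the byte reversal is easy to see with the standard math and math/bits packages. The sketch below only illustrates the idea; reversedBits is a made-up helper, not a function exported by the gob package.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// reversedBits returns the IEEE 754 bit pattern of f with its bytes
// reversed, so the exponent and high mantissa bytes land at the low end.
func reversedBits(f float64) uint64 {
	return bits.ReverseBytes64(math.Float64bits(f))
}

func main() {
	for _, f := range []float64{1, 17, 0.1} {
		fmt.Printf("%-4v bits=%016x reversed=%016x\n",
			f, math.Float64bits(f), reversedBits(f))
	}
}
```

For a value such as 17, the reversed pattern is a small integer with leading zero bytes, which a variable-length integer encoding can send in far fewer than eight bytes. A value such as 0.1, whose mantissa is full of nonzero bits, gains nothing.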
An important feature of gobs is that you can define your own encoding by having your type satisfy the GobEncoder and GobDecoder interfaces, in a manner analogous to the Marshaler and Unmarshaler interfaces of the JSON package and to the Stringer interface of package fmt. This facility makes it possible to represent special features, enforce constraints, or hide secrets when you transmit data. See the documentation for details.
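As a minimal sketch of the "hide secrets" case, the hypothetical Account type below keeps all of its fields unexported and implements GobEncode and GobDecode so that only the name crosses the wire. The type, its fields, and the roundTrip helper are invented for illustration.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Account is a hypothetical type whose token must never be transmitted.
type Account struct {
	name  string
	token string
}

// GobEncode implements gob.GobEncoder. Only the name is sent.
func (a Account) GobEncode() ([]byte, error) {
	return []byte(a.name), nil
}

// GobDecode implements gob.GobDecoder. The token stays empty.
func (a *Account) GobDecode(data []byte) error {
	a.name = string(data)
	return nil
}

// roundTrip encodes in and decodes the stream into a new Account.
func roundTrip(in Account) (Account, error) {
	var network bytes.Buffer
	if err := gob.NewEncoder(&network).Encode(in); err != nil {
		return Account{}, err
	}
	var out Account
	err := gob.NewDecoder(&network).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Account{name: "gopher", token: "s3cret"})
	if err != nil {
		log.Fatal("round trip error:", err)
	}
	fmt.Printf("name=%q token=%q\n", out.name, out.token)
}
```

Without these two methods the type could not be sent at all, because gob only encodes exported fields.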
Types on the wire
The first time you send a given type, the gob package includes a description of that type in the data stream. In fact, what happens is that the encoder is used to encode, in the standard gob format, an internal struct that describes the type and gives it a unique number. (Basic types, plus the layout of the type description structure itself, are predefined by the software for bootstrapping.) After the type is described, it can be referenced by its type number.
Thus when we send our first value of type T, the gob encoder sends a description of T and tags it with a type number, say 127. All values, including the first, are then prefixed by that number, so a stream of T values looks like this:
("define type id" 127, definition of type T) (127, T value) (127, T value), ...
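One way to observe that the type description travels only once is to compare how many bytes each Encode call adds to the same stream. This is a rough sketch assuming the current encoding/gob package; messageSizes is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

type T struct{ X, Y, Z int }

// messageSizes encodes the same value twice on one stream and returns
// the number of bytes written by each call.
func messageSizes() (first, second int, err error) {
	var network bytes.Buffer
	enc := gob.NewEncoder(&network)

	if err = enc.Encode(T{X: 7, Z: 8}); err != nil {
		return
	}
	first = network.Len() // type definition plus the first value

	if err = enc.Encode(T{X: 7, Z: 8}); err != nil {
		return
	}
	second = network.Len() - first // the value alone
	return
}

func main() {
	first, second, err := messageSizes()
	if err != nil {
		log.Fatal("encode error:", err)
	}
	fmt.Println("first Encode wrote", first, "bytes")
	fmt.Println("second Encode wrote", second, "bytes")
}
```

The first call is larger because it carries the definition of T. This is also why a long-lived Encoder on a connection is cheaper than creating a new one per message.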
These type numbers make it possible to describe recursive types and to send values of those types. Thus gobs can encode types such as trees:

    type Node struct {
        Value       int
        Left, Right *Node
    }

(It is an exercise for the reader to discover how the zero-defaulting rule makes this work, even though gobs don't represent pointers.)
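A small tree makes the exercise concrete. The sketch below assumes the current encoding/gob package and uses the Node type from the text; roundTrip and sum are hypothetical helpers. Nil child pointers are zero values, so they are simply not sent, and that is what ends the recursion.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

type Node struct {
	Value       int
	Left, Right *Node
}

// roundTrip encodes the tree rooted at root and decodes it into a new tree.
func roundTrip(root *Node) (*Node, error) {
	var network bytes.Buffer
	if err := gob.NewEncoder(&network).Encode(root); err != nil {
		return nil, err
	}
	var out Node
	if err := gob.NewDecoder(&network).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

// sum adds up every Value in the tree.
func sum(n *Node) int {
	if n == nil {
		return 0
	}
	return n.Value + sum(n.Left) + sum(n.Right)
}

func main() {
	tree := &Node{
		Value: 1,
		Left:  &Node{Value: 2},
		Right: &Node{Value: 3, Left: &Node{Value: 4}},
	}
	out, err := roundTrip(tree)
	if err != nil {
		log.Fatal("round trip error:", err)
	}
	fmt.Println("sum of decoded tree:", sum(out))
}
```

Note that this works for recursive types, not for values that contain cycles.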
With the type information, a gob stream is fully self-describing except for the set of bootstrap types, which is a well-defined starting point.
Compiling a machine
The first time you encode a value of a given type, the gob package builds a little interpreted machine specific to that data type. It uses reflection on the type to construct the machine, but once the machine is built it no longer depends on reflection. The machine uses package unsafe and some trickery to convert the data into the encoded bytes at high speed. It could use reflection and avoid unsafe, but it would be significantly slower. (The protocol buffer support for Go takes a similar high-speed approach, and its design was influenced by the implementation of gobs.) Subsequent values of the same type use the already-compiled machine, so they can be encoded right away.
Decoding is similar but harder. When you decode a value, the gob package holds a byte slice representing a value of some encoder-defined type, plus a Go value into which to decode it. The gob package builds a machine for that pair: the gob type sent on the wire crossed with the Go type provided for decoding. Once that decoding machine is built, though, it is again a reflectionless engine that uses unsafe methods to get maximum speed.
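The "compile once, reuse afterwards" idea can be illustrated with a toy encoder. This is not how the gob package is implemented: the text format, the fieldOp type, and the compile and encode functions are all invented for this sketch, and unlike the machine described above, these operations still call reflect when they run. What it does show is the per-type cache of precomputed operations.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"sync"
)

// fieldOp appends the toy encoding of one struct field to out.
type fieldOp func(v reflect.Value, out *strings.Builder)

var (
	mu       sync.Mutex
	machines = map[reflect.Type][]fieldOp{} // one compiled machine per type
	compiled int                            // how many machines have been built
)

// compile walks a struct type once and emits one op per exported int or
// string field. Other kinds are ignored in this toy.
func compile(t reflect.Type) []fieldOp {
	compiled++
	var ops []fieldOp
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" {
			continue // unexported fields are never encoded
		}
		idx, name := i, f.Name
		switch f.Type.Kind() {
		case reflect.Int:
			ops = append(ops, func(v reflect.Value, out *strings.Builder) {
				if n := v.Field(idx).Int(); n != 0 { // zero values are skipped
					fmt.Fprintf(out, "%s=%d;", name, n)
				}
			})
		case reflect.String:
			ops = append(ops, func(v reflect.Value, out *strings.Builder) {
				if s := v.Field(idx).String(); s != "" {
					fmt.Fprintf(out, "%s=%q;", name, s)
				}
			})
		}
	}
	return ops
}

// encode runs the cached machine for v's type, compiling it on first use.
// v must be a struct value.
func encode(v interface{}) string {
	rv := reflect.ValueOf(v)
	mu.Lock()
	ops, ok := machines[rv.Type()]
	if !ok {
		ops = compile(rv.Type())
		machines[rv.Type()] = ops
	}
	mu.Unlock()

	var out strings.Builder
	for _, op := range ops {
		op(rv, &out)
	}
	return out.String()
}

type P struct {
	X, Y, Z int
	Name    string
}

func main() {
	fmt.Println(encode(P{3, 0, 5, "Pythagoras"}))
	fmt.Println(encode(P{X: 1}))
	fmt.Println("machines compiled:", compiled)
}
```

The type is inspected only on the first call; the second value of type P goes straight to the stored operations.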
Use
There is a lot going on under the hood, but the result is an efficient, easy-to-use encoding system for transmitting data. Here is a complete example showing differing encoded and decoded types. Note how easy it is to send and receive values; all you need to do is present values and variables to the gob package, and it does all the work.

    package main

    import (
        "bytes"
        "encoding/gob" // the package was imported as plain "gob" when this was written
        "fmt"
        "log"
    )

    type P struct {
        X, Y, Z int
        Name    string
    }

    type Q struct {
        X, Y *int32
        Name string
    }

    func main() {
        // Initialize the encoder and decoder. Normally enc and dec would be
        // bound to network connections, and the encoder and decoder would
        // run in different processes.
        var network bytes.Buffer        // Stand-in for a network connection.
        enc := gob.NewEncoder(&network) // Will write to network.
        dec := gob.NewDecoder(&network) // Will read from network.

        // Encode (send) the value.
        err := enc.Encode(P{3, 4, 5, "Pythagoras"})
        if err != nil {
            log.Fatal("encode error:", err)
        }

        // Decode (receive) the value.
        var q Q
        err = dec.Decode(&q)
        if err != nil {
            log.Fatal("decode error:", err)
        }
        fmt.Printf("%q: {%d,%d}\n", q.Name, *q.X, *q.Y)
    }

You can compile and run this example in the Go Playground.
The rpc package builds on gobs to turn this encode/decode automation into a transport for method calls across the network. That is a subject for another article.
Details
The gob package documentation, especially the file doc.go, expands on many of the details described here and includes a full worked example showing how the encoding represents data. If you are interested in the innards of the gob implementation, that is a good place to start.
- Rob Pike, March 2011