Objective
RLP (Recursive Length Prefix) is an encoding algorithm for binary data with arbitrarily nested structure. It is the main method of data serialization/deserialization in Ethereum: data structures such as blocks and transactions are RLP-encoded and then stored in the database. RLP defines only two types of data: a string (that is, a byte array) and a list.
Comparison of RLP and JSON
Without further ado, here is the code:

    package main

    import (
        "encoding/json"
        "fmt"

        "github.com/ethereum/go-ethereum/rlp"
    )

    type Pig struct {
        Name    string
        Gender  uint8
        Address string
    }

    func main() {
        pig := &Pig{Name: "piggy", Gender: 2, Address: "England"}

        // JSON keeps the field names in the output.
        jsonBytes, _ := json.Marshal(pig)
        fmt.Println(jsonBytes)
        fmt.Println(string(jsonBytes))
        fmt.Println("length:", len(jsonBytes))
        // output
        // [123 34 78 97 109 101 34 58 34 112 105 103 103 121 34 44 34 71 101 110 100 101 ...
        // {"Name":"piggy","Gender":2,"Address":"England"}
        // length: 47

        // RLP stores only the field values, as a list of three items.
        rlpBytes, _ := rlp.EncodeToBytes(pig)
        fmt.Println(rlpBytes)
        fmt.Println("length:", len(rlpBytes))
        // output
        // [207 133 112 105 103 103 121 2 135 69 110 103 108 97 110 100]
        // length: 16
    }

From the output above we can see that, for the same struct, the JSON encoding takes 47 bytes while the RLP encoding takes only 16 bytes. The difference comes from the field names: the JSON output {"Name":"piggy","Gender":2,"Address":"England"} carries Name, Gender, and Address, which RLP does not need to store. Of course, JSON has its own advantages, which are not covered here.
Definition
The RLP encoding function takes an item. An item is defined as follows:
- A string (that is, a byte array) is an item
- A list of items is itself an item
For example, an empty string is an item, the string "cat" is an item, a list of several strings is an item, and so is a list containing nested lists, such as ["Monkey", ["Tony", "Kong"], "horse", [[]], "Pig", [""], "fish"].
The RLP encoding is defined as follows:
| Type | First byte range | Encoded content |
| --- | --- | --- |
| Single byte in [0x00, 0x7f] | [0x00, 0x7f] | The byte itself |
| String 0-55 bytes long | [0x80, 0xb7] | 0x80 plus the string length, followed by the string's bytes |
| String longer than 55 bytes | [0xb8, 0xbf] | 0xb7 plus the number of bytes needed to store the string length, followed by the string length, followed by the string's bytes |
| List whose payload (the combined length of all RLP-encoded items) is 0-55 bytes | [0xc0, 0xf7] | 0xc0 plus the payload length, followed by the concatenated RLP encodings of all items |
| List whose payload is longer than 55 bytes | [0xf8, 0xff] | 0xf7 plus the number of bytes needed to store the payload length, followed by the payload length, followed by the concatenated RLP encodings of all items |
Examples
- The string "pig" = [0x83, 'p', 'i', 'g']
- The list ["pig", "dog"] = [0xc8, 0x83, 'p', 'i', 'g', 0x83, 'd', 'o', 'g']
- The empty string ('null') = [0x80]
- The empty list = [0xc0]
- The number 12 ('\x0c') = [0x0c]
- The number 1024 ('\x04\x00') = [0x82, 0x04, 0x00]
- The nested list [[], [[]], [[], [[]]]] = [0xc7, 0xc0, 0xc1, 0xc0, 0xc3, 0xc0, 0xc1, 0xc0]
- The string "Lorem ipsum dolor sit amet, consectetur adipisicing elit" = [0xb8, 0x38, 'L', 'o', 'r', 'e', 'm', ' ', ..., 'e', 'l', 'i', 't']
RLP Analysis
The examples above show the design idea behind RLP: the type of an encoded item can be determined quickly from its first byte, and the space in that byte is used to the full by giving new meanings to the values above 0x7f. The greatest advantage of RLP is that it wastes very few bytes while still supporting list structures. In other words, RLP makes it easy to store tree-like structures.
RLP is also easy for programs to process. The first byte determines the type of the encoding, and the matching decoding method is called accordingly. Like JSON, RLP supports nested structures, so a recursive call can quickly restore an entire RLP payload into a tree, or translate it into a JSON structure that other programs can use.
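The recursive decoding described here can be sketched in a few dozen lines of Go. This is a simplified illustration, not the go-ethereum decoder: the `Node` type, `readLen`, and `Decode` are names invented for this sketch, and it does not reject non-canonical encodings as a strict decoder would.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Node is a hypothetical decoded tree: a leaf holds Str, a list holds Children.
type Node struct {
	Str      []byte
	Children []Node
	IsList   bool
}

// String renders the tree in a JSON-like form.
func (n Node) String() string {
	if !n.IsList {
		return fmt.Sprintf("%q", n.Str)
	}
	parts := make([]string, len(n.Children))
	for i, c := range n.Children {
		parts[i] = c.String()
	}
	return "[" + strings.Join(parts, ", ") + "]"
}

// readLen reads an n-byte big-endian length from the front of b.
func readLen(b []byte, n int) (int, error) {
	if len(b) < n {
		return 0, errors.New("rlp: truncated length")
	}
	length := 0
	for _, x := range b[:n] {
		length = length<<8 | int(x)
	}
	return length, nil
}

// Decode parses one item from the front of data and returns it
// together with the bytes that follow it.
func Decode(data []byte) (Node, []byte, error) {
	if len(data) == 0 {
		return Node{}, nil, errors.New("rlp: empty input")
	}
	p := data[0]
	var start, length int
	var err error
	switch {
	case p <= 0x7f: // single byte
		return Node{Str: data[:1]}, data[1:], nil
	case p <= 0xb7: // short string
		start, length = 1, int(p-0x80)
	case p <= 0xbf: // long string
		n := int(p - 0xb7)
		length, err = readLen(data[1:], n)
		start = 1 + n
	case p <= 0xf7: // short list
		start, length = 1, int(p-0xc0)
	default: // long list
		n := int(p - 0xf7)
		length, err = readLen(data[1:], n)
		start = 1 + n
	}
	if err != nil {
		return Node{}, nil, err
	}
	if length < 0 || len(data)-start < length {
		return Node{}, nil, errors.New("rlp: truncated payload")
	}
	payload, rest := data[start:start+length], data[start+length:]
	if p <= 0xbf {
		return Node{Str: payload}, rest, nil
	}
	// A list payload is a concatenation of items: decode until it is used up.
	node := Node{IsList: true}
	for len(payload) > 0 {
		child, remaining, err := Decode(payload)
		if err != nil {
			return Node{}, nil, err
		}
		node.Children = append(node.Children, child)
		payload = remaining
	}
	return node, rest, nil
}

func main() {
	node, _, err := Decode([]byte{0xc8, 0x83, 'p', 'i', 'g', 0x83, 'd', 'o', 'g'})
	if err != nil {
		panic(err)
	}
	fmt.Println(node) // ["pig", "dog"]
}
```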
For long strings, RLP uses the first byte to store the number of bytes occupied by the length, and the bytes that follow to store the length of the string itself. Under the long-string rule (first byte in [0xb8, 0xbf]) the length field can occupy up to 8 bytes, so a single string can be almost 2^64 bytes long, which is an astronomical figure. Combined with the nesting rule, RLP can in theory encode any data.