[Ethereum Source Code Analysis] II. Presentation, organization, caching, and updating of data

This article is reproduced from: http://blog.csdn.net/teaspring/article/details/75390210

Thanks to the original author, Teaspring, for sharing. This article is reprinted with the original author's permission.



In the Ethereum world, data is ultimately stored as [k,v] key-value pairs, and the underlying [k,v] database currently in use is LevelDB. All data related to transactions and operations is presented in the form of a Block (Header). At a larger granularity, blocks form the BlockChain (HeaderChain); at a smaller granularity, a block breaks down into Transactions and Contracts. The results of all transactions and operations are reflected in the state of each individual account. An account is represented by a stateObject, and the collection of all accounts is managed by StateDB. The following figure depicts the hierarchical relationships among these data units:


On the other hand, data units such as Block, stateObject, and StateDB make heavy use of the Merkle Patricia Trie (MPT) data structure to organize and manage [k,v] data. Thanks to the MPT's efficient segmented hash verification and its flexible node insertion/loading design, callers can quickly and efficiently insert, delete, update, compress, and hash data. The following sections describe each of these in detail.
1. Block and Header

Block is one of Ethereum's core data structures. All account activity is recorded in the form of transactions (Transaction), and each Block holds a list of Transaction objects. The execution result of each transaction is recorded by a Receipt object and the set of Log objects it contains, and the list of Receipts generated after all transactions have executed is stored in the Block in compressed, hashed form. Blocks are linked through the ParentHash pointer, which refers to the preceding block, into a singly linked list that is managed by the BlockChain struct.

The Block structure can be divided into two parts, header and body. Their UML relationships are shown in the following figure:


Header section

The header is the core of the block. Note that its member variables are all public, which makes it easy for callers to operate on Block properties. The member variables of Header are all important and worth understanding:

ParentHash: A pointer to the parent block. Except for the genesis block, every block has exactly one parent block.
Coinbase: The address of the miner who mined this block. Each time a transaction is executed, the system pays a certain amount of ether as a reward, and it is sent to this address.
UncleHash: The RLP hash of the Block member uncles. uncles is an array of Headers, and its presence is quite ingenious.
Root: The RLP hash of the root node of the "state trie" in StateDB. Each account is represented by a stateObject and uniquely identified by its Address, and its information is modified as the related transactions execute. All account objects are inserted one by one into a Merkle Patricia Trie (MPT), forming the "state trie".
TxHash: The RLP hash of the root node of the "tx trie" in the Block. The Block member transactions holds all the tx objects, which are inserted one by one into an MPT, forming the "tx trie".
ReceiptHash: The RLP hash of the root node of the "receipt trie" in the Block. After all transactions of the Block have executed, a Receipt array is generated, and all Receipts in the array are inserted one by one into an MPT, forming the "receipt trie".
Bloom: A Bloom filter, used to quickly determine whether a given Log object is present in a known set of Logs.
Difficulty: The difficulty of the block. It is computed by the consensus algorithm from the parent block's Time and Difficulty, and it is used in the block's "mining" phase.
Number: The ordinal of the block. A block's Number equals its parent block's Number + 1.
Time: When the block should be created. It is determined by the consensus algorithm and is generally either the parent block's Time + 10s or the current system time.
GasLimit: The theoretical upper limit on the gas consumed by everything in the block. This value is set when the block is created and is related to the parent block; specifically, it is computed from how the parent block's GasUsed compares with the parent block's GasLimit * 2/3.
GasUsed: The total gas actually consumed by executing all transactions in the block.
Nonce: A 64-bit number that is used in the block's "mining" phase and is modified during it.
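The parent link described by ParentHash and Number can be sketched in a few lines of Go. This is only an illustration under stated assumptions: MiniHeader, isChildOf, and the fixed-width encoding are hypothetical, and SHA-256 stands in for the RLP hash used by the real client.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Hash is a 32-byte digest, standing in for common.Hash.
type Hash [32]byte

// MiniHeader is a hypothetical, heavily reduced header that keeps only
// the fields needed to show how a block links to its parent.
type MiniHeader struct {
	ParentHash Hash
	Number     uint64
	Time       uint64
}

// Hash digests the header fields. Assumption: SHA-256 over a fixed-width
// encoding replaces the RLP hash of the real Header.
func (h *MiniHeader) Hash() Hash {
	var buf [48]byte
	copy(buf[:32], h.ParentHash[:])
	binary.BigEndian.PutUint64(buf[32:40], h.Number)
	binary.BigEndian.PutUint64(buf[40:48], h.Time)
	return sha256.Sum256(buf[:])
}

// isChildOf checks the two structural rules from the text: the child
// points at the parent's hash, and its Number is the parent's Number + 1.
func isChildOf(child, parent *MiniHeader) bool {
	return child.ParentHash == parent.Hash() && child.Number == parent.Number+1
}

func main() {
	genesis := &MiniHeader{Number: 0, Time: 1000}
	block1 := &MiniHeader{ParentHash: genesis.Hash(), Number: 1, Time: 1010}

	fmt.Println(isChildOf(block1, genesis)) // true
	fmt.Println(isChildOf(genesis, block1)) // false
}
```

Because the parent's hash is part of the child's own hashed content, changing any ancestor changes the hash of every block after it.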

The Merkle Patricia Trie (MPT) is the core data structure Ethereum uses to store block information. In the simplest terms, it is an inverted tree in which each node may have several child nodes. The details of Ethereum's MPT implementation are described later.

Root, TxHash, and ReceiptHash are the root-node hash values of three MPT objects: the state trie, the tx trie, and the receipt trie, respectively. A single 32-byte hash stands in for a tree with many nodes (or an array with many elements) and serves as a tamper-evident digest of it. For example, during block synchronization, the received TxHash can be used to confirm that the array member transactions was synchronized intact.

Of the three, TxHash and ReceiptHash are generated in a slightly unusual way, because their data sources are arrays, unlike the state trie. How is an array converted into an MPT? Since the MPT is designed for [k,v] data, the trick in the code is to use each element's index in the array as k and the element's RLP-encoded value as v. Each resulting [k,v] pair is inserted as a node into an empty MPT, and once all array elements have been inserted the MPT is complete.

Among the three MPT structures, the receipt trie can only be produced after all transactions in the block have been executed. The tx trie in theory needs only the transactions array, but in practice it too is generated only after all transactions have executed. The most interesting is the state trie: because it stores all account information, such as the balance, the number of transactions sent, and the virtual machine instruction array, it changes as each transaction executes, and its root value changes with it. StateDB therefore defines a function IntermediateRoot(), which produces the root value at a given moment:

    // core/state/statedb.go
    func (s *StateDB) IntermediateRoot(deleteEmptyObjects bool) common.Hash

The return value of this function represents a snapshot of all account information at that moment.
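Returning to the array-to-[k,v] trick described above for TxHash and ReceiptHash, the following Go sketch shows the idea of keying each element by its index and then checking a received list against a known digest. It rests on several assumptions: KV, listToPairs, and digest are hypothetical names, the 8-byte big-endian index key is chosen only for simplicity, and digest is a plain SHA-256 over the ordered pairs, not a Merkle Patricia Trie.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// KV is one [k,v] pair that would be inserted into a trie.
type KV struct {
	Key   []byte
	Value []byte
}

// listToPairs keys each already-encoded element by its index in the list.
// Assumption: an 8-byte big-endian index is used as the key for simplicity.
func listToPairs(items [][]byte) []KV {
	pairs := make([]KV, len(items))
	for i, item := range items {
		key := make([]byte, 8)
		binary.BigEndian.PutUint64(key, uint64(i))
		pairs[i] = KV{Key: key, Value: item}
	}
	return pairs
}

// digest is a stand-in for the trie root: SHA-256 over the length-prefixed
// pairs in order. It is NOT an MPT; it only shows that one 32-byte value
// can commit to the whole list.
func digest(pairs []KV) [32]byte {
	h := sha256.New()
	var n [8]byte
	for _, p := range pairs {
		binary.BigEndian.PutUint64(n[:], uint64(len(p.Key)))
		h.Write(n[:])
		h.Write(p.Key)
		binary.BigEndian.PutUint64(n[:], uint64(len(p.Value)))
		h.Write(n[:])
		h.Write(p.Value)
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	txs := [][]byte{[]byte("tx-a"), []byte("tx-b"), []byte("tx-c")}
	want := digest(listToPairs(txs))

	// A peer sends the list; the receiver recomputes and compares.
	received := [][]byte{[]byte("tx-a"), []byte("tx-b"), []byte("tx-c")}
	tampered := [][]byte{[]byte("tx-a"), []byte("tx-X"), []byte("tx-c")}

	fmt.Println(digest(listToPairs(received)) == want) // true
	fmt.Println(digest(listToPairs(tampered)) == want) // false
}
```

Since the index is part of every key, reordering the elements changes the digest as well, which is what lets a single hash confirm that the array arrived intact.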

As for when header.Root is generated: as described in the previous post, the entry function for transaction execution, StateProcessor.Process(), calls engine.Finalize() before returning. It is this Finalize() that internally calls the IntermediateRoot() function above and assigns the result to header.Root. So the Root value is the snapshot of all account information after all transactions in the block have been executed.

Body structure

The Block member variable td is the cumulative difficulty of all blocks in the chain, from the genesis block up to and including the current block; td stands for total difficulty. Conceptually, the difference between the td of a block and the td of its parent block equals the Difficulty value in the block's header.

The body can be understood as the set of array-type members of Block. It requires more memory than the header, so it is often handled separately from the header when data is transmitted and validated.

Uncles is a rather special member of the body. From a functional point of view it is not something a block structure strictly needs, and its presence makes computing the hash over the whole block take longer. Its purpose is to offset the disproportionate influence that extremely powerful nodes in the Ethereum network have on block production, preventing those nodes from undermining the fundamental goal of decentralization. For the official description, see the Ethereum wiki.

Unique identifier of the block

Similar to other objects in the Ethereum world, the unique identifier of the Block object is its (RLP) hash value. Note that the hash value of the block is equal to the (RLP) hash value of its header member.

    // core/types/block.go
    func (b *Block) Hash() common.Hash {
        if hash := b.hash.Load(); hash != nil {
            return hash.(common.Hash)
        }
        v := b.header.Hash()
        b.hash.Store(v)
        return v
    }

    func (h *Header) Hash() common.Hash {
        return rlpHash(h)
    }

Tip: the Block member hash caches the most recently computed header hash to avoid unnecessary recomputation.

That the hash of a Block equals the (RLP) hash of its header makes it clear that the Block struct and the Header represent the same block object. Given the difference in memory footprint between the two structures, this design brings a lot of convenience. For example, during data transfer the Header can be sent and validated first and the Block sent afterward; after receipt, the hash values of the two can also be used to cross-check each other.
Storage of members in the underlying database

The main member variables of Header and Block are ultimately stored in the underlying database. Ethereum uses LevelDB, a non-relational key-value database whose storage unit is the [k,v] pair. Let's look at the specific storage layout (core/database_util.go):

Key                        Value
'h' + num + hash           Header's RLP raw data
'h' + num + hash + 't'     TD
'h' + num + 'n'            Hash
'H' + hash                 Num
'b' + num + hash           Body's RLP raw data
'r' + num + hash           Receipts' RLP
'l' + hash                 Tx/receipt lookup metadata

The hash here is the RLP hash of the Block (or Header) object, also called the canonical hash; num is Number as a uint64, encoded as a big-endian integer. Note that num and hash are the most common components of a key, and that each is also stored on its own as a value, with the other forming part of the key. This strongly suggests that num (Number) and hash are the two most important attributes of a Block: num determines the block's position in the whole chain, while hash uniquely identifies a Block/Header object.

With the design above, all important members of the Block structure are stored in the underlying database. Once all of a Block object's information has been written to the database, the BlockChain structure can be used to handle the entire chain.
2. HeaderChain and BlockChain

The BlockChain struct manages the entire singly linked list of blocks; in an Ethereum client (such as a wallet), only one BlockChain object exists. Mirroring the Block/Header relationship, BlockChain has a member variable of type HeaderChain, which manages the singly linked list of all Headers. HeaderChain likewise has only one instance globally, and it is held by BlockChain. (HeaderChain is held only by BlockChain and LightChain; LightChain is similar to BlockChain, except that by default it processes only headers, although bodies and receipts can still be downloaded.) Their UML diagrams are as follows:


In terms of structure, BlockChain has many similarities with HeaderChain. For example, both hold the same ChainConfig object and the same database interface variable for reading and writing [k,v] data; BlockChain has the members genesisBlock and currentBlock, corresponding to the genesis block and the current block, while HeaderChain has genesisHeader and currentHeader; BlockChain has bodyCache, blockCache, and other members for caching frequently accessed objects, and HeaderChain has headerCache, tdCache, Numbe
