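This article analyzes, at the code level, the design and operation of Ethereum's virtual machine (EVM), and how smart contracts are started and executed.
1. VM stack and memory data structures
At the lowest level the VM works on two data structures: a stack and a memory.
1) First, the stack data structure: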
// Stack is an object for basic stack operations. Items popped to the stack are
// expected to be changed and modified. stack does not take care of adding newly
// initialised objects.
type Stack struct {
	data []*big.Int // each item is a *big.Int holding one 256-bit (32-byte) EVM word
}
func newstack() *Stack {
	return &Stack{data: make([]*big.Int, 0, 1024)} // pre-allocate capacity for a depth of 1024
}
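The stack also provides push, pop, dup (duplicate an item onto the top of the stack), peek (look at the top item), Back (return the n-th item from the top), swap (exchange the top item with a given item), and require (ensure the stack holds at least n items).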
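2) intPool
A pool of reusable big integers, limited to 256 entries.
type intPool struct {
	pool *Stack
}
It provides get and put: get takes an integer out of the pool, and put hands integers back to it (optionally resetting them to a default value).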
3) intPoolPool
A pool that manages intPools, with a default capacity of 25.
type intPoolPool struct {
	pools []*intPool
	lock  sync.Mutex
}
Its get and put take an intPool out or hand one back, guarded by a mutex.
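The locking pattern can be sketched like this. It is an illustration under assumptions: the intPool placeholder, the poolDefaultCap name, and the choice to drop pools when the manager is full are not taken from the article.

```go
package main

import (
	"fmt"
	"sync"
)

const poolDefaultCap = 25 // default capacity of the pool of pools

type intPool struct{} // placeholder for the real intPool

type intPoolPool struct {
	pools []*intPool
	lock  sync.Mutex
}

// get takes a pool out of the manager, or creates one if none is available.
func (ipp *intPoolPool) get() *intPool {
	ipp.lock.Lock()
	defer ipp.lock.Unlock()
	if n := len(ipp.pools); n > 0 {
		ip := ipp.pools[n-1]
		ipp.pools = ipp.pools[:n-1]
		return ip
	}
	return &intPool{}
}

// put hands a pool back; it is dropped if the manager is already full.
func (ipp *intPoolPool) put(ip *intPool) {
	ipp.lock.Lock()
	defer ipp.lock.Unlock()
	if len(ipp.pools) < cap(ipp.pools) {
		ipp.pools = append(ipp.pools, ip)
	}
}

func main() {
	ipp := &intPoolPool{pools: make([]*intPool, 0, poolDefaultCap)}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() { // concurrent callers share the manager safely
			defer wg.Done()
			ipp.put(ipp.get())
		}()
	}
	wg.Wait()
	fmt.Println(len(ipp.pools) <= poolDefaultCap)
}
```

The mutex matters because several interpreters may run concurrently and all draw from the same manager.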
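4) Memory
A simple memory model. It also records the most recent gas cost (lastGasCost), so that a later expansion is charged only the difference from what has already been paid (see memoryGasCost below).
type Memory struct {
	store       []byte
	lastGasCost uint64
}
func NewMemory() *Memory {
	return &Memory{}
}
Space is first allocated with Resize: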
// Resize resizes the memory to size
func (m *Memory) Resize(size uint64) {
	if uint64(m.Len()) < size {
		m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
	}
}
Values are then written with Set:
// Set sets offset + size to value
func (m *Memory) Set(offset, size uint64, value []byte) {
	// length of store may never be less than offset + size.
	// The store should be resized PRIOR to setting the memory
	if size > uint64(len(m.store)) {
		panic("INVALID memory: store empty")
	}
	// It's possible the offset is greater than 0 and size equals 0. This is because
	// the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
	if size > 0 {
		copy(m.store[offset:offset+size], value)
	}
}
Memory also provides Get, GetPtr, Len, Data, and Print. Note that GetPtr may access the slice out of bounds.
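The resize-then-set order can be sketched as below. This is an illustrative version, not the original: unlike the excerpt, Set here checks offset+size, and Get returns nil for an out-of-range request, which is one way to avoid the out-of-bounds access mentioned above.

```go
package main

import "fmt"

// Memory is a simplified sketch without gas bookkeeping.
type Memory struct {
	store []byte
}

func (m *Memory) Len() int { return len(m.store) }

// Resize grows the store to size bytes; it never shrinks it.
func (m *Memory) Resize(size uint64) {
	if uint64(m.Len()) < size {
		m.store = append(m.store, make([]byte, size-uint64(m.Len()))...)
	}
}

// Set copies value into store[offset:offset+size]. The bounds check on
// offset+size is an assumption of this sketch.
func (m *Memory) Set(offset, size uint64, value []byte) {
	if offset+size > uint64(len(m.store)) {
		panic("invalid memory: store too small")
	}
	if size > 0 {
		copy(m.store[offset:offset+size], value)
	}
}

// Get returns a copy of store[offset:offset+size], or nil when the
// requested range is empty or does not fit in the store.
func (m *Memory) Get(offset, size uint64) []byte {
	if size == 0 || offset+size > uint64(len(m.store)) {
		return nil
	}
	cpy := make([]byte, size)
	copy(cpy, m.store[offset:offset+size])
	return cpy
}

func main() {
	m := &Memory{}
	m.Resize(64)                     // grow to two 32-byte words
	m.Set(32, 4, []byte{1, 2, 3, 4}) // write into the second word
	fmt.Println(m.Len(), m.Get(32, 4))
}
```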
5) Utility functions, for example those that decide whether a stack can perform a dup or swap operation:
func makeDupStackFunc(n int) stackValidationFunc
func makeSwapStackFunc(n int) stackValidationFunc
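One way such validators can be built is from a shared helper that checks how many items an operation consumes and how many it leaves behind. The helper name makeStackFunc, the stackLimit constant, and the simplified Stack type are assumptions of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // maximum EVM stack depth

type Stack struct{ data []int } // item type simplified for the sketch

func (st *Stack) len() int { return len(st.data) }

type stackValidationFunc func(*Stack) error

// makeStackFunc checks that pop items are present and that the stack stays
// within stackLimit once pop items are removed and push items are added.
func makeStackFunc(pop, push int) stackValidationFunc {
	return func(st *Stack) error {
		if st.len() < pop {
			return errors.New("stack underflow")
		}
		if st.len()+push-pop > stackLimit {
			return errors.New("stack limit reached")
		}
		return nil
	}
}

// A dup reads n items and leaves n+1; a swap touches n items and leaves n.
func makeDupStackFunc(n int) stackValidationFunc  { return makeStackFunc(n, n+1) }
func makeSwapStackFunc(n int) stackValidationFunc { return makeStackFunc(n, n) }

func main() {
	st := &Stack{data: []int{1, 2}}
	fmt.Println(makeDupStackFunc(2)(st))  // two items are present
	fmt.Println(makeSwapStackFunc(3)(st)) // three items would be needed
}
```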
2. VM instructions, jump table, and interpreter
An operation bundles the functions and flags that one instruction needs; the jump table is a [256]operation array indexed by opcode.
type operation struct {
	// execute is the operation function
	execute executionFunc
	// gasCost is the gas function and returns the gas required for execution
	gasCost gasFunc
	// validateStack validates the stack (size) for the operation
	validateStack stackValidationFunc
	// memorySize returns the memory size required for the operation
	memorySize memorySizeFunc

	halts   bool // indicates whether the operation should halt further execution
	jumps   bool // indicates whether the program counter should not increment
	writes  bool // determines whether this a state modifying operation
	valid   bool // indication whether the retrieved operation is valid and known
	reverts bool // determines whether the operation reverts state (implicitly halts)
	returns bool // determines whether the operation sets the return data content
}
Three instruction sets are then defined:
newHomesteadInstructionSet
newByzantiumInstructionSet
newConstantinopleInstructionSet
Each one is built on top of the previous one.
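The layering can be sketched as follows: each constructor copies the previous table and adds the opcodes introduced by its fork. The cut-down operation struct is an assumption, and only one representative opcode per fork is shown.

```go
package main

import "fmt"

// operation is reduced to the two fields this sketch needs.
type operation struct {
	name  string
	valid bool
}

// EVM opcode byte values.
const (
	STOP         = 0x00
	DELEGATECALL = 0xf4
	CREATE2      = 0xf5
	STATICCALL   = 0xfa
)

func newFrontierInstructionSet() [256]operation {
	var set [256]operation
	set[STOP] = operation{name: "STOP", valid: true}
	return set
}

// Each later set starts from a copy of the previous one (Go arrays are
// values, so returning and assigning them copies) and adds new opcodes.
func newHomesteadInstructionSet() [256]operation {
	set := newFrontierInstructionSet()
	set[DELEGATECALL] = operation{name: "DELEGATECALL", valid: true}
	return set
}

func newByzantiumInstructionSet() [256]operation {
	set := newHomesteadInstructionSet()
	set[STATICCALL] = operation{name: "STATICCALL", valid: true}
	return set
}

func newConstantinopleInstructionSet() [256]operation {
	set := newByzantiumInstructionSet()
	set[CREATE2] = operation{name: "CREATE2", valid: true}
	return set
}

func main() {
	byz := newByzantiumInstructionSet()
	con := newConstantinopleInstructionSet()
	fmt.Println(byz[STATICCALL].valid, byz[CREATE2].valid, con[CREATE2].valid)
}
```

Slots that no fork has filled keep the zero value, so their valid flag is false and the interpreter can reject them as invalid opcodes.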
instructions.go defines many concrete instructions, such as:
func opPc(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
	// push the current program counter onto the stack
	stack.push(interpreter.intPool.get().SetUint64(*pc))
	return nil, nil
}
func opMsize(pc *uint64, interpreter *EVMInterpreter, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
	// push the current memory size in bytes onto the stack
	stack.push(interpreter.intPool.get().SetInt64(int64(memory.Len())))
	return nil, nil
}
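gas_table.go contains the functions that return the gas each instruction consumes. Essentially the only error they return is errGasUintOverflow, an integer overflow.
For example: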
func memoryGasCost(mem *Memory, newMemSize uint64) (uint64, error) {
	if newMemSize == 0 {
		return 0, nil
	}
	// The maximum that will fit in a uint64 is max_word_count - 1
	// anything above that will result in an overflow.
	// Additionally, a newMemSize which results in a
	// newMemSizeWords larger than 0x7ffffffff will cause the square operation
	// to overflow.
	// The constant 0xffffffffe0 is the highest number that can be used without
	// overflowing the gas calculation
	if newMemSize > 0xffffffffe0 {
		return 0, errGasUintOverflow
	}
	newMemSizeWords := toWordSize(newMemSize)
	newMemSize = newMemSizeWords * 32
	if newMemSize > uint64(mem.Len()) {
		square := newMemSizeWords * newMemSizeWords
		linCoef := newMemSizeWords * params.MemoryGas
		quadCoef := square / params.QuadCoeffDiv
		newTotalFee := linCoef + quadCoef
		fee := newTotalFee - mem.lastGasCost
		mem.lastGasCost = newTotalFee
		return fee, nil
	}
	return 0, nil
}
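This function computes the fee for memory expansion, and it charges only when memory actually grows. With nMS the new memory size in 32-byte words, the fee is nMS*3 + nMS^2/512 minus the memory's most recent cost (lastGasCost), where 3 is params.MemoryGas and 512 is params.QuadCoeffDiv.
The file defines many more gas functions of this kind for the other instructions.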
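As a worked example of the memory expansion fee, the sketch below computes the cumulative fee for a given size and charges only the difference from what was already paid. The function names are hypothetical; the constants 3 and 512 are the values of params.MemoryGas and params.QuadCoeffDiv.

```go
package main

import "fmt"

const (
	memoryGas    = 3   // linear coefficient (params.MemoryGas)
	quadCoeffDiv = 512 // quadratic divisor (params.QuadCoeffDiv)
)

// toWordSize rounds a byte size up to a count of 32-byte words.
func toWordSize(size uint64) uint64 { return (size + 31) / 32 }

// totalMemoryFee is the cumulative fee for a memory of the given byte size.
func totalMemoryFee(size uint64) uint64 {
	w := toWordSize(size)
	return w*memoryGas + w*w/quadCoeffDiv
}

// expansionFee charges only the part of the total not yet paid for.
func expansionFee(lastGasCost, newSize uint64) uint64 {
	total := totalMemoryFee(newSize)
	if total <= lastGasCost {
		return 0
	}
	return total - lastGasCost
}

func main() {
	first := expansionFee(0, 1024)                     // grow from 0 to 32 words
	second := expansionFee(totalMemoryFee(1024), 2048) // grow from 32 to 64 words
	fmt.Println(first, second)
}
```

For 32 words the total is 32*3 + 32*32/512 = 98; for 64 words it is 64*3 + 64*64/512 = 200, so the second expansion costs 102. The quadratic term is what makes very large memories expensive.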
interpreter.go: the interpreter
// Config are the configuration options for the Interpreter
type Config struct {
	// Debug enabled debugging Interpreter options
	Debug bool
	// Tracer is the op code logger
	Tracer Tracer
	// NoRecursion disabled Interpreter call, callcode,
	// delegate call and create.
	NoRecursion bool
	// Enable recording of SHA3/keccak preimages
	EnablePreimageRecording bool
	// JumpTable contains the EVM instruction table. This
	// may be left uninitialised and will be set to the default
	// table.
	JumpTable [256]operation
}
// Interpreter is used to run Ethereum based contracts and will utilise the
// passed environment to query external sources for state information.
// The Interpreter will run the byte code VM based on the passed
// configuration.
type Interpreter interface {
	// Run loops and evaluates the contract's code with the given input data and returns
	// the return byte-slice and an error if one occurred.
	Run(contract *Contract, input []byte) ([]byte, error)
	// CanRun tells if the contract, passed as an argument, can be
	// run by the current interpreter. This is meant so that the
	// caller can do something like:
	//
	//   for _, interpreter := range interpreters {
	//     if interpreter.CanRun(contract.code) {
	//       interpreter.Run(contract.code, input)
	//     }
	//   }
	//
	CanRun([]byte) bool
	// IsReadOnly reports if the interpreter is in read only mode.
	IsReadOnly() bool
	// SetReadOnly sets (or unsets) read only mode in the interpreter.
	SetReadOnly(bool)
}
// EVMInterpreter represents an EVM interpreter
type EVMInterpreter struct {
	evm      *EVM
	cfg      Config
	gasTable params.GasTable // gas prices for many operations
	intPool  *intPool

	readOnly   bool   // Whether to throw on stateful modifications
	returnData []byte // Last CALL's return data for subsequent reuse
}
// NewEVMInterpreter returns a new instance of the Interpreter.
func NewEVMInterpreter(evm *EVM, cfg Config) *EVMInterpreter {
	// We use the STOP instruction whether to see
	// the jump table was initialised. If it was not
	// we'll set the default jump table.
	if !cfg.JumpTable[STOP].valid {
		switch {
		case evm.ChainConfig().IsConstantinople(evm.BlockNumber):
			cfg.JumpTable = constantinopleInstructionSet
		case evm.ChainConfig().IsByzantium(evm.BlockNumber):
			cfg.JumpTable = byzantiumInstructionSet
		case evm.ChainConfig().IsHomestead(evm.BlockNumber):
			cfg.JumpTable = homesteadInstructionSet
		default:
			cfg.JumpTable = frontierInstructionSet
		}
	}
	return &EVMInterpreter{
		evm:      evm,
		cfg:      cfg,
		gasTable: evm.ChainConfig().GasTable(evm.BlockNumber),
		intPool:  newIntPool(),
	}
}
func (in *EVMInterpreter) enforceRestrictions(op OpCode, operation operation, stack *Stack) error {
	if in.evm.chainRules.IsByzantium {
		if in.readOnly {
			// If the interpreter is operating in readonly mode, make sure no
			// state-modifying operation is performed. The 3rd stack item
			// for a call operation is the value. Transferring value from one
			// account to the others means the state is modified and should also
			// return with an error.
			if operation.writes || (op == CALL && stack.Back(2).BitLen() > 0) {
				return errWriteProtection
			}
		}
	}
	return nil
}
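Another important function is Run. It loops over the contract's code with the given input, executing it, and returns the resulting byte slice, or an error if one occurred. Any error the interpreter returns, other than errExecutionReverted, is treated as having consumed all gas.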
func (in *EVMInterpreter) Run(contract *Contract, input []byte) (ret []byte, err error) {
	if in.intPool == nil {
		in.intPool = poolOfIntPools.get()
		defer func() {
			poolOfIntPools.put(in.intPool)
			in.intPool = nil
		}()
	}
	// Increment the call depth which is restricted to 1024
	in.evm.depth++
	defer func() { in.evm.depth-- }()
	// Reset the previous call's return data. It's unimportant to preserve the old buffer
	// as every returning call will return new data anyway.
	in.returnData = nil
	// Don't bother with the execution if there's no code.
	if len(contract.Code) == 0 {
		return nil, nil
	}
	var (
		op    OpCode        // current opcode
		mem   = NewMemory() // bound memory
		stack = newstack()  // local stack
		// For optimisation reason we're using uint64 as the program counter.
		// It's theoretically possible to go above 2^64. The YP defines the PC
		// to be uint256. Practically much less so feasible.
		pc   = uint64(0) // program counter
		cost uint64
		// copies used by tracer
		pcCopy  uint64 // needed for the deferred Tracer
		gasCopy uint64 // for Tracer to log gas remaining before execution
		logged  bool   // deferred Tracer should ignore already logged steps
	)
	contract.Input = input
	// Reclaim the stack as an int pool when the execution stops
	defer func() { in.intPool.put(stack.data...) }()
	// Check whether debugging is enabled
	if in.cfg.Debug {
		defer func() {
			if err != nil {
				if !logged {
					in.cfg.Tracer.CaptureState(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
				} else {
					in.cfg.Tracer.CaptureFault(in.evm, pcCopy, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
				}
			}
		}()
	}
	for atomic.LoadInt32(&in.evm.abort) == 0 {
		if in.cfg.Debug {
			// Capture pre-execution values for tracing.
			logged, pcCopy, gasCopy = false, pc, contract.Gas
		}
		// Fetch the next instruction to execute
		op = contract.GetOp(pc)
		operation := in.cfg.JumpTable[op]
		if !operation.valid {
			return nil, fmt.Errorf("invalid opcode 0x%x", int(op))
		}
		// Check that the stack satisfies the operation's requirements
		if err := operation.validateStack(stack); err != nil {
			return nil, err
		}
		// If the operation is valid, enforce and write restrictions
		if err := in.enforceRestrictions(op, operation, stack); err != nil {
			return nil, err
		}
		var memorySize uint64
		// calculate the new memory size and expand the memory to fit
		// the operation
		if operation.memorySize != nil {
			memSize, overflow := bigUint64(operation.memorySize(stack))
			if overflow {
				return nil, errGasUintOverflow
			}
			// memory is expanded in words of 32 bytes. Gas
			// is also calculated in words.
			if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
				return nil, errGasUintOverflow
			}
		}
		// Compute the gas cost and consume it; fail with out of gas if there is not enough
		cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
		if err != nil || !contract.UseGas(cost) {
			return nil, ErrOutOfGas
		}
		if memorySize > 0 {
			mem.Resize(memorySize)
		}
		if in.cfg.Debug {
			in.cfg.Tracer.CaptureState(in.evm, pc, op, gasCopy, cost, mem, stack, contract, in.evm.depth, err)
			logged = true
		}
		// execute the operation
		res, err := operation.execute(&pc, in, contract, mem, stack)
		// verifyPool is a build flag. Pool verification makes sure the integrity
		// of the integer pool by comparing values to a default value.
		if verifyPool {
			verifyIntegerPool(in.intPool)
		}
		// if the operation clears the return data (e.g. it has returning data)
		// set the last return to the result of the operation.
		// Only the most recent result is kept.
		if operation.returns {
			in.returnData = res
		}
		switch {
		case err != nil:
			return nil, err
		case operation.reverts:
			return res, errExecutionReverted
		case operation.halts:
			return res, nil
		case !operation.jumps:
			pc++
		}
	}
	return nil, nil
}
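The core of Run is a fetch, validate, charge, execute loop driven by the jump table. The toy interpreter below illustrates that loop with three opcodes. Everything except the opcode byte values is a simplification made for this sketch: the cut-down operation struct, the flat gas costs, the uint64 stack items, and the run function itself are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type OpCode byte

// Real EVM opcode values; only these three are implemented here.
const (
	STOP  OpCode = 0x00
	ADD   OpCode = 0x01
	PUSH1 OpCode = 0x60
)

// operation is a cut-down, hypothetical version of the struct shown earlier.
type operation struct {
	execute  func(pc *uint64, code []byte, stack *[]uint64)
	gas      uint64 // flat cost standing in for gasCost
	minStack int    // required items, standing in for validateStack
	halts    bool
	valid    bool
}

var errOutOfGas = errors.New("out of gas")

func newJumpTable() [256]operation {
	var jt [256]operation
	jt[STOP] = operation{
		execute: func(*uint64, []byte, *[]uint64) {},
		halts:   true, valid: true,
	}
	jt[ADD] = operation{
		execute: func(_ *uint64, _ []byte, st *[]uint64) {
			s := *st
			s[len(s)-2] += s[len(s)-1] // add the two top items
			*st = s[:len(s)-1]
		},
		gas: 3, minStack: 2, valid: true,
	}
	jt[PUSH1] = operation{
		execute: func(pc *uint64, code []byte, st *[]uint64) {
			*pc++ // step onto the immediate byte
			var v uint64
			if *pc < uint64(len(code)) {
				v = uint64(code[*pc])
			}
			*st = append(*st, v)
		},
		gas: 3, valid: true,
	}
	return jt
}

// run returns the top of the stack, the remaining gas, and any error.
func run(code []byte, gas uint64) (uint64, uint64, error) {
	jt := newJumpTable()
	var stack []uint64
	for pc := uint64(0); ; pc++ {
		var op OpCode // reading past the end of the code yields STOP
		if pc < uint64(len(code)) {
			op = OpCode(code[pc])
		}
		operation := jt[op]
		switch {
		case !operation.valid:
			return 0, gas, fmt.Errorf("invalid opcode 0x%x", int(op))
		case len(stack) < operation.minStack:
			return 0, gas, errors.New("stack underflow")
		case gas < operation.gas:
			return 0, gas, errOutOfGas
		}
		gas -= operation.gas
		operation.execute(&pc, code, &stack)
		if operation.halts {
			break
		}
	}
	var top uint64
	if len(stack) > 0 {
		top = stack[len(stack)-1]
	}
	return top, gas, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	code := []byte{byte(PUSH1), 2, byte(PUSH1), 3, byte(ADD), byte(STOP)}
	top, gasLeft, err := run(code, 100)
	fmt.Println(top, gasLeft, err)
}
```

As in Run, validation and gas charging happen before execute is called, so an instruction that cannot be paid for never touches the stack.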
The virtual machine
contract.go
type ContractRef interface { Address() common.Address } is a reference to the object backing a contract.
AccountRef implements this interface.
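type Contract struct {
	// CallerAddress is the account that initialised this contract; when the
	// contract is initialised through a contract call, it is set to that caller
	CallerAddress common.Address
	caller        ContractRef
	self          ContractRef
	jumpdests     destinations // result of the JUMPDEST analysis
	Code          []byte         // contract code
	CodeHash      common.Hash    // hash of the code
	CodeAddr      *common.Address // address the code was loaded from
	Input         []byte         // input data
	Gas           uint64         // gas remaining for this contract
	value         *big.Int       // value (in wei) sent along with the call
	Args          []byte         // arguments
	DelegateCall  bool
}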
The constructor:
func NewContract(caller ContractRef, object ContractRef, value *big.Int, gas uint64) *Contract {
	c := &Contract{CallerAddress: caller.Address(), caller: caller, self: object, Args: nil}
	// If the caller is itself a contract, reuse its jumpdests.
	if parent, ok := caller.(*Contract); ok {
		// Reuse JUMPDEST analysis from parent context if available.
		c.jumpdests = parent.jumpdests
	} else {
		c.jumpdests = make(destinations)
	}
	// Gas should be a pointer so it can safely be reduced through the run
	// This pointer will be off the state transition
	c.Gas = gas
	// ensures a value is set
	c.value = value
	return c
}
// AsDelegate supports chained calls: when the caller is a contract, this
// contract takes its CallerAddress and value from that caller.
func (c *Contract) AsDelegate() *Contract {
	c.DelegateCall = true
	// NOTE: caller must, at all times be a contract. It should never happen
	// that caller is something other than a Contract.
	parent := c.caller.(*Contract)
	c.CallerAddress = parent.CallerAddress
	c.value = parent.value
	return c
}
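// GetOp fetches the next instruction: it returns the n-th byte of the contract's Code as an OpCode. The code is not split into an instruction collection beforehand, and nothing is read from Input or Args.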
func (c *Contract) GetOp(n uint64) OpCode
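A minimal sketch of how such a lookup can work, assuming the opcode is simply the n-th byte of Code and that reading past the end of the code yields 0x00 (STOP). The GetByte helper and the reduced Contract struct are part of the sketch.

```go
package main

import "fmt"

type OpCode byte

const STOP OpCode = 0x00

// Contract is reduced to the single field this sketch needs.
type Contract struct {
	Code []byte
}

// GetByte returns the n-th byte of the code, or 0 past the end.
func (c *Contract) GetByte(n uint64) byte {
	if n < uint64(len(c.Code)) {
		return c.Code[n]
	}
	return 0
}

// GetOp returns the n-th byte of the code interpreted as an opcode.
func (c *Contract) GetOp(n uint64) OpCode {
	return OpCode(c.GetByte(n))
}

func main() {
	c := &Contract{Code: []byte{0x60, 0x01}} // PUSH1 0x01
	fmt.Printf("0x%02x 0x%02x\n", byte(c.GetOp(0)), byte(c.GetOp(5)))
}
```

Returning STOP for an out-of-range position means the interpreter loop ends cleanly when execution runs off the end of the code.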
// The next two