Smart contracts from beginner to proficient: Solidity assembly language

Source: Internet
Author: User

Introduction: The previous section covered the development specifications for smart contracts on the Juice platform. This section takes a deeper look at the assembly language defined by Solidity.
The assembly language defined by Solidity is designed to meet the following goals:
1. Programs written in it should be readable, even if the code was generated by a compiler from Solidity.
2. The translation from assembly to bytecode should contain as few surprises as possible.
3. Control flow should be easy to detect, to help with formal verification and optimization.
To achieve the first and last goals, Solidity assembly provides high-level constructs such as for loops, switch statements, and function calls. This makes it possible to write assembly programs without using the SWAP, DUP, JUMP, and JUMPI instructions directly: the first two obscure the data flow, and the last two obscure the control flow. In addition, a functional statement such as mul(add(x, y), 7) is more readable than the pure opcode form 7 y x add mul, because it is much easier to see which operand belongs to which opcode.
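To make the contrast between the two notations concrete, here is a minimal Python sketch that flattens a functional expression into opcode order. The tuple representation and the `flatten` helper are assumptions made for illustration and are not part of Solidity; the only rule taken from this article is that arguments are emitted in reverse order, followed by the opcode.

```python
# Hypothetical representation: an expression is either a string (a literal
# or a variable name) or a tuple (opcode, arg1, arg2, ...).

def flatten(expr):
    """Return the opcode-style token list for a functional expression."""
    if isinstance(expr, str):
        return [expr]
    opcode, *args = expr
    tokens = []
    # Arguments are emitted last-to-first so that the first argument
    # ends up on top of the stack when the opcode executes.
    for arg in reversed(args):
        tokens.extend(flatten(arg))
    tokens.append(opcode)
    return tokens


functional = ("mul", ("add", "x", "y"), "7")
print(" ".join(flatten(functional)))  # prints "7 y x add mul"
```

Reading the output from left to right shows why the functional form is easier to follow: in the flat form, the operands of mul are separated from it by the whole add sub-expression.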
The second goal is achieved by introducing a desugaring phase that removes only the higher-level constructs, does so in a very regular way, and still allows the generated low-level assembly code to be inspected. The only non-local operations the assembler performs are name lookups of user-defined identifiers (function names, variable names, and so on), which follow simple and regular scoping rules, and the cleanup of local variables from the stack.
Scope: A declared identifier (label, variable, function, or assembly) is visible only in the block where it is declared, including any blocks nested inside that block. Accessing a local variable across a function boundary is illegal, even if the variable would otherwise be in scope (translator's note: this probably refers to functions defined inside other functions, a syntax that JavaScript also allows). Shadowing is not allowed. Local variables cannot be accessed before they are declared, but labels, functions, and assemblies can. Assemblies are special blocks used, for example, to return runtime code or to create contracts. Identifiers from an outer assembly are not visible in a sub-assembly.
If control flow passes over the end of a block, pop instructions matching the number of local variables declared in that block are inserted (translator's note: the local variables are removed because they are no longer valid). Whenever a local variable is referenced, the code generator needs to know its current position relative to the top of the stack, so it has to keep track of the current so-called stack height. Since all local variables are removed at the end of a block, the stack height before and after the block should be the same; if it is not, a warning is issued.
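The bookkeeping described above can be sketched in a few lines of Python. The `BlockScope` class and its method names are hypothetical and only model the rule stated in the text: one pop per local variable at the end of the block, and a warning if the stack height differs from the height at block entry.

```python
class BlockScope:
    """Tracks the locals declared in one block (illustrative model only)."""

    def __init__(self, start_height):
        self.start_height = start_height
        self.height = start_height
        self.locals = {}  # name -> stack height right after its value was pushed

    def declare(self, name):
        # Declaring a local pushes its initial value onto the stack.
        self.height += 1
        self.locals[name] = self.height

    def push(self):
        # Models any other opcode that leaves one extra item on the stack.
        self.height += 1

    def close(self):
        """Return the pops to emit at the end of the block and a warning, if any."""
        pops = ["pop"] * len(self.locals)
        final_height = self.height - len(pops)
        warning = None
        if final_height != self.start_height:
            warning = "stack height changed from %d to %d" % (
                self.start_height, final_height)
        self.height = final_height
        return pops, warning


balanced = BlockScope(start_height=2)
balanced.declare("i")
balanced.declare("ret")
print(balanced.close())    # two pops, no warning

unbalanced = BlockScope(start_height=0)
unbalanced.declare("x")
unbalanced.push()          # a stray value that nobody removes
print(unbalanced.close())  # one pop, plus a stack-height warning
```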
Why use high-level constructs such as switch, for, and functions?
With switch, for, and functions, it should be possible to write complex code without using jump and jumpi manually. This makes it much easier to analyze the control flow, which in turn allows for better formal verification and optimization.
In addition, computing the stack height is rather complicated if manual jumps are allowed. The position of every local variable on the stack must be known; otherwise, neither references to local variables nor the automatic removal of local variables at the end of a block will work correctly. For jumps that do not have a continuing control flow, the desugaring mechanism inserts operations into the unreachable code that follows them, so that the stack height is still calculated correctly.
Example:
Let's follow an example compilation from Solidity to desugared assembly. Consider the runtime bytecode of the following Solidity program:

contract C {
  function f(uint x) returns (uint y) {
    y = 1;
    for (uint i = 0; i < x; i++)
      y = 2 * y;
  }
}

It will generate the following assembly content:

{
  mstore(0x40, 0x60) // store the "free memory pointer"
  // function dispatcher
  switch div(calldataload(0), exp(2, 224))
  case 0xb3de648b {
    let (r) := f(calldataload(4))
    let ret := $allocate(0x20)
    mstore(ret, r)
    return(ret, 0x20)
  }
  default { revert(0, 0) }
  // memory allocator
  function $allocate(size) -> pos {
    pos := mload(0x40)
    mstore(0x40, add(pos, size))
  }
  // the contract function
  function f(x) -> y {
    y := 1
    for { let i := 0 } lt(i, x) { i := add(i, 1) } {
      y := mul(2, y)
    }
  }
}
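The dispatcher isolates the 4-byte function selector by dividing the first 32-byte calldata word by a power of two. Because a word has 256 bits and a selector has 32, the shift has to be 224 bits. The following Python sketch models that step together with the contract function itself; the helper names are assumptions for illustration, and the calldata is built by hand for the call f(5).

```python
WORD_BITS = 256
SELECTOR_BITS = 32


def selector_of(calldata: bytes) -> int:
    """Mimic div(calldataload(0), exp(2, 224)) on raw calldata."""
    # calldataload(0) reads 32 bytes, zero-padded on the right if shorter.
    word = int.from_bytes(calldata[:32].ljust(32, b"\x00"), "big")
    return word // 2 ** (WORD_BITS - SELECTOR_BITS)


def f(x: int) -> int:
    """Reference model of the contract function: doubles y, x times."""
    y = 1
    for _ in range(x):
        y = (2 * y) % 2 ** WORD_BITS  # EVM arithmetic wraps at 256 bits
    return y


# Calldata for f(5): the 4-byte selector followed by one 32-byte argument.
calldata = bytes.fromhex("b3de648b") + (5).to_bytes(32, "big")
argument = int.from_bytes(calldata[4:36], "big")  # models calldataload(4)
print(hex(selector_of(calldata)), f(argument))    # prints "0xb3de648b 32"
```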

After the desugaring phase, it looks as follows:

{
  mstore(0x40, 0x60)
  {
    let $0 := div(calldataload(0), exp(2, 224))
    jumpi($case1, eq($0, 0xb3de648b))
    jump($caseDefault)
    $case1:
    {
      // the function call - we put return label and arguments on the stack
      $ret1 calldataload(4) jump(f)
      // This is unreachable code. Opcodes are added that mirror the
      // effect of the function on the stack height: arguments are
      // removed and return values are introduced.
      pop pop
      let r := 0
      $ret1: // the actual return point
      $ret2 0x20 jump($allocate)
      pop pop let ret := 0
      $ret2:
      mstore(ret, r)
      return(ret, 0x20)
      // although it is useless, the jump is automatically inserted,
      // since the desugaring process is a purely syntactic operation that
      // does not analyze control flow
      jump($endswitch)
    }
    $caseDefault:
    {
      revert(0, 0)
      jump($endswitch)
    }
    $endswitch:
  }
  jump($afterFunction)
  $allocate:
  {
    // we jump over the unreachable code that introduces the function arguments
    jump($start)
    let $retpos := 0 let size := 0
    $start:
    // output variables live in the same scope as the arguments and are
    // actually allocated
    let pos := 0
    {
      pos := mload(0x40)
      mstore(0x40, add(pos, size))
    }
    // This code replaces the arguments by the return values and jumps back.
    swap1 pop swap1 jump
    // Again unreachable code that corrects the stack height.
    0 0
  }
  f:
  {
    jump($start)
    let $retpos := 0 let x := 0
    $start:
    let y := 0
    y := 1
    {
      let i := 0
      $for_begin:
      jumpi($for_end, iszero(lt(i, x)))
      {
        y := mul(2, y)
      }
      $for_continue:
      { i := add(i, 1) }
      jump($for_begin)
      $for_end:
    } // Here, a pop instruction will be inserted for i
    swap1 pop swap1 jump
    0 0
  }
  $afterFunction:
  stop
}

Compilation consists of the following four stages:
1. Parsing
2. Desugaring (removing switch, for, and functions)
3. Opcode stream generation
4. Bytecode generation
Stages 1 through 3 are specified below in a pseudo-formal way. More formal specifications will follow later.
Parsing and grammar
The parser performs the following tasks:

  • Turn the byte stream into a token stream, discarding C++-style comments (a special comment exists for source references, but it is not discussed here).
  • Turn the token stream into an AST according to the grammar defined below.
  • Register each identifier with the block in which it is defined (as an annotation on the AST node), and note the point from which each variable can be accessed.

The assembly lexer follows the one defined by Solidity itself.
Whitespace is used to delimit tokens and consists of the space, tab, and linefeed characters. Comments are regular JavaScript/C++ comments and are interpreted in the same way as whitespace.
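The first parser task, turning source text into a token stream while discarding whitespace and comments, could look roughly like the following Python sketch. The token classes and regular expressions are assumptions loosely derived from the grammar in this article; this is not Solidity's actual lexer.

```python
import re

# One alternative per token class; hex numbers are tried before decimal
# numbers so that "0x40" is not split into "0" and "x40".
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<ws>[ \t\n]+)
  | (?P<hexnumber>0x[0-9a-fA-F]+)
  | (?P<decimal>[0-9]+)
  | (?P<string>"(?:[^"\r\n\\]|\\.)*")
  | (?P<identifier>[a-zA-Z_$][a-zA-Z_0-9]*)
  | (?P<punct>:=|=:|->|[{}(),:])
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Yield (kind, text) pairs, discarding whitespace and comments."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                "unexpected character %r at offset %d" % (source[pos], pos))
        pos = match.end()
        if match.lastgroup not in ("comment", "ws"):
            yield match.lastgroup, match.group()


for kind, text in tokenize("let ret := $allocate(0x20) // reserve one word"):
    print(kind, text)
```

Comments and whitespace go through the same code path and are both dropped, which matches the rule that comments are interpreted in the same way as whitespace.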
Grammar:
AssemblyBlock = '{' AssemblyItem* '}'
AssemblyItem =
    Identifier |
    AssemblyBlock |
    FunctionalAssemblyExpression |
    AssemblyLocalDefinition |
    FunctionalAssemblyAssignment |
    AssemblyAssignment |
    LabelDefinition |
    AssemblySwitch |
    AssemblyFunctionDefinition |
    AssemblyFor |
    'break' | 'continue' |
    SubAssembly | 'dataSize' '(' Identifier ')' |
    LinkerSymbol |
    'errorLabel' | 'bytecodeSize' |
    NumberLiteral | StringLiteral | HexLiteral
Identifier = [a-zA-Z_$] [a-zA-Z_0-9]*
FunctionalAssemblyExpression = Identifier '(' ( AssemblyItem ( ',' AssemblyItem )* )? ')'
AssemblyLocalDefinition = 'let' IdentifierOrList ':=' FunctionalAssemblyExpression
FunctionalAssemblyAssignment = IdentifierOrList ':=' FunctionalAssemblyExpression
IdentifierOrList = Identifier | '(' IdentifierList ')'
IdentifierList = Identifier ( ',' Identifier )*
AssemblyAssignment = '=:' Identifier
LabelDefinition = Identifier ':'
AssemblySwitch = 'switch' FunctionalAssemblyExpression AssemblyCase*
    ( 'default' AssemblyBlock )?
AssemblyCase = 'case' FunctionalAssemblyExpression AssemblyBlock
AssemblyFunctionDefinition = 'function' Identifier '(' IdentifierList? ')'
    ( '->' '(' IdentifierList ')' )? AssemblyBlock
AssemblyFor = 'for' ( AssemblyBlock | FunctionalAssemblyExpression )
    FunctionalAssemblyExpression ( AssemblyBlock | FunctionalAssemblyExpression ) AssemblyBlock
SubAssembly = 'assembly' Identifier AssemblyBlock
LinkerSymbol = 'linkerSymbol' '(' StringLiteral ')'
NumberLiteral = HexNumber | DecimalNumber
HexLiteral = 'hex' ( '"' ([0-9a-fA-F]{2})* '"' | '\'' ([0-9a-fA-F]{2})* '\'' )
StringLiteral = '"' ( [^"\r\n\\] | '\\' . )* '"'
HexNumber = '0x' [0-9a-fA-F]+
DecimalNumber = [0-9]+

Desugaring
An AST transformation removes the for, switch, and function constructs. The result can still be parsed by the same parser, but it no longer uses those constructs. If jumpdests are added that are only jumped to and never reached by continuing control flow, information about the stack contents is added, unless no local variables of outer scopes are accessed or the stack height is the same as for the previous instruction. The pseudocode is as follows:

desugar item: AST -> AST =
match item {
AssemblyFunctionDefinition('function' name '(' arg1, ..., argn ')' '->' '(' ret1, ..., retm ')' body) ->
  <name>:
  {
    jump($<name>_start)
    let $retPC := 0 let argn := 0 ... let arg1 := 0
    $<name>_start:
    let ret1 := 0 ... let retm := 0
    { desugar(body) }
    swap and pop items so that only ret1, ... retm, $retPC are left on the stack
    jump
    0 (1 + n times) to compensate removal of arg1, ..., argn and $retPC
  }
AssemblyFor('for' { init } condition post body) ->
  {
    init // cannot be its own block because we want variable scope to extend into the body
    // find I such that there are no labels $forI_*
    $forI_begin:
    jumpi($forI_end, iszero(condition))
    { body }
    $forI_continue:
    { post }
    jump($forI_begin)
    $forI_end:
  }
'break' ->
  {
    // find nearest enclosing scope with label $forI_end
    pop all local variables that are defined at the current point
    but not at $forI_end
    jump($forI_end)
    0 (as many as variables were removed above)
  }
'continue' ->
  {
    // find nearest enclosing scope with label $forI_continue
    pop all local variables that are defined at the current point
    but not at $forI_continue
    jump($forI_continue)
    0 (as many as variables were removed above)
  }
AssemblySwitch(switch condition cases ( default: defaultBlock )? ) ->
  {
    // find I such that there is no $switchI* label or variable
    let $switchI_value := condition
    for each of cases match {
      case val: -> jumpi($switchI_caseJ, eq($switchI_value, val))
    }
    if default block present: ->
      { defaultBlock jump($switchI_end) }
    for each of cases match {
      case val: { body } -> $switchI_caseJ: { body jump($switchI_end) }
    }
    $switchI_end:
  }
FunctionalAssemblyExpression( identifier(arg1, arg2, ..., argn) ) ->
  {
    if identifier is function <name> with n args and m ret values ->
      {
        // find I such that $funcallI_* does not exist
        $funcallI_return argn ... arg2 arg1 jump(<name>)
        pop (n + 1 times)
        if the current context is `let (id1, ..., idm) := f(...)` ->
          let id1 := 0 ... let idm := 0
          $funcallI_return:
        else ->
          0 (m times)
          $funcallI_return:
          turn the functional expression that leads to the function call
          into a statement stream
      }
    else -> desugar(children of node)
  }
default node ->
  desugar(children of node)
}
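As a rough illustration of the AssemblyFor rule, the following Python sketch builds the desugared text for a single loop. It works on plain strings rather than a real AST, and the `desugar_for` helper and its `index` parameter (playing the role of I in the pseudocode) are assumptions made for this example.

```python
def desugar_for(init, condition, post, body, index):
    """Return the desugared lines for one `for` loop.

    All arguments except `index` are plain strings of assembly text;
    `index` is the number that makes the generated labels unique.
    """
    prefix = "$for%d" % index
    return [
        "{",
        # init is not wrapped in its own block, so variables it declares
        # stay visible in the condition, the body and the post part.
        "  " + init,
        "  %s_begin:" % prefix,
        "  jumpi(%s_end, iszero(%s))" % (prefix, condition),
        "  { %s }" % body,
        "  %s_continue:" % prefix,
        "  { %s }" % post,
        "  jump(%s_begin)" % prefix,
        "  %s_end:" % prefix,
        "}",
    ]


lines = desugar_for(
    init="let i := 0",
    condition="lt(i, x)",
    post="i := add(i, 1)",
    body="y := mul(2, y)",
    index=0,
)
print("\n".join(lines))
```

Applied to the loop of the example function f, this produces the same label structure as the desugared listing shown earlier, apart from the numeric suffix in the label names.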

Opcode stream generation
During opcode stream generation, the current stack height is tracked in a counter, so that stack variables can be accessed by name. The stack height changes with every opcode that modifies the stack and with every label that is annotated with a stack adjustment. Whenever a new local variable is introduced, it is registered together with the current stack height. When a variable is accessed (either to copy its value or to assign to it), the appropriate DUP or SWAP instruction is selected based on the difference between the current stack height and the stack height at the point where the variable was introduced.
Pseudocode:

codegen item: AST -> opcode_stream =
match item {
AssemblyBlock({ items }) ->
  join(codegen(item) for item in items)
  if last generated opcode has continuing control flow:
    POP for all local variables registered at the block (including variables
    introduced by labels)
    warn if the stack height at this point is not the same as at the start of the block
Identifier(id) ->
  lookup id in the syntactic stack of blocks
  match type of id
    Local Variable ->
      DUPi where i = 1 + stack_height - stack_height_of_identifier(id)
    Label ->
      // reference to be resolved during bytecode generation
      PUSH<bytecode position of label>
    SubAssembly ->
      PUSH<bytecode position of subassembly data>
FunctionalAssemblyExpression(id ( arguments ) ) ->
  join(codegen(arg) for arg in arguments.reversed())
  id (which has to be an opcode, might be a function name later)
AssemblyLocalDefinition(let (id1, ..., idn) := expr) ->
  register identifiers id1, ..., idn as locals in current block at current stack height
  codegen(expr) - assert that expr returns n items to the stack
FunctionalAssemblyAssignment((id1, ..., idn) := expr) ->
  lookup id1, ..., idn in the syntactic stack of blocks, assert that they are variables
  codegen(expr)
  for j = n, ..., 1:
    SWAPi where i = 1 + stack_height - stack_height_of_identifier(idj)
    POP
AssemblyAssignment(=: id) ->
  look up id in the syntactic stack of blocks, assert that it is a variable
  SWAPi where i = 1 + stack_height - stack_height_of_identifier(id)
  POP
LabelDefinition(name:) ->
  JUMPDEST
NumberLiteral(num) ->
  PUSH<num interpreted as decimal and right-aligned>
HexLiteral(lit) ->
  PUSH32<lit interpreted as hex and left-aligned>
StringLiteral(lit) ->
  PUSH32<lit utf-8 encoded and left-aligned>
SubAssembly(assembly <name> block) ->
  append codegen(block) at the end of the code
dataSize(<name>) ->
  assert that <name> is a subassembly ->
  PUSH32<size of code generated from subassembly <name>>
linkerSymbol(<lit>) ->
  PUSH32<zeros> and append position to linker table
}
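The DUP selection rule can be checked with a small stack model in Python. The sketch assumes one particular reading of the pseudocode, namely that the height recorded for a variable is the stack height right after its value was pushed; the pseudocode does not spell this out, and the helper names are hypothetical.

```python
def dup_index(stack_height, height_of_identifier):
    """Index i of the DUPi that copies a variable to the top of the stack."""
    i = 1 + stack_height - height_of_identifier
    if not 1 <= i <= 16:
        # The EVM only provides DUP1 through DUP16.
        raise ValueError("variable not reachable with DUP1..DUP16")
    return i


def run_dup(stack, i):
    """Model DUPi: push a copy of the i-th item counted from the top."""
    stack.append(stack[-i])


stack, heights = [], {}
for name, value in (("x", 5), ("y", 7)):
    stack.append(value)         # let <name> := <value>
    heights[name] = len(stack)  # assumed convention: height after the push

i = dup_index(len(stack), heights["x"])  # 1 + 2 - 1 = 2, i.e. DUP2
run_dup(stack, i)
print(i, stack)                          # prints "2 [5, 7, 5]"
```

Because the copy of x raises the stack height to 3, a later access to y again needs DUP2 (1 + 3 - 2), which is why the code generator has to update the height counter after every opcode.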

Content reference: https://open.juzix.net/doc
    Smart Contract Development Tutorial Video: Introduction to smart contracts for blockchain series video courses
