Implementation of PHP code compilation

Source: Internet
Author: User
Tags php script vars zend

1. PHP is an interpreted high-level language. The Zend kernel is implemented in C and, like any C program, has a main function: a PHP script is the input, and the kernel processes it and outputs the result. Translating the PHP script into opcodes that the C program can recognize is what is meant by PHP compilation.

A C compiler compiles C code into machine code, which consists of operation instructions. These instructions are written to a binary program; when the program is loaded, its parts go into the corresponding memory areas (constant area, data area, code area), the run-time stack is allocated, and execution starts sequentially from the code area.

PHP compilation is much the same. The PHP script is parsed into opcodes; each opcode is a C struct that plays the role of a machine instruction, and execution consists of the Zend engine running these opcodes. The compilation process includes lexical analysis and parsing. Versions before PHP 7 generated opcodes directly during parsing; PHP 7 adds an abstract syntax tree (AST) that is generated during the parsing phase and from which the zend_op_array is then generated.

2. Lexical analysis and parsing (PHP code -> abstract syntax tree (AST))

PHP uses re2c and bison to complete this phase of the work:

re2c: generates the lexical analyzer (lexer), which divides the input into meaningful chunks called tokens.

bison: generates the parser, which determines how the tokens produced by the lexer relate to each other.
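To make the lexer's job concrete, here is a minimal hand-written sketch that splits a string such as `$a = 1 + 2;` into tokens. It is not re2c output and not PHP's real scanner: the token kinds and the `next_token` function are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; PHP's real token set is far larger. */
typedef enum { TOK_VARIABLE, TOK_NUMBER, TOK_OPERATOR, TOK_END } token_kind;

typedef struct {
    token_kind kind;
    const char *start; /* first character of the token inside the input */
    size_t len;        /* token length in bytes */
} token;

/* Scan one token starting at *cursor and advance the cursor past it. */
token next_token(const char **cursor)
{
    const char *p;
    token t;

    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;
    while (isspace((unsigned char)*p)) {
        p++; /* whitespace separates tokens but is not a token itself */
    }
    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_END;
    } else if (*p == '$') {
        t.kind = TOK_VARIABLE;
        p++;
        while (isalnum((unsigned char)*p) || *p == '_') {
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) {
            p++;
        }
    } else {
        t.kind = TOK_OPERATOR; /* any other single character, e.g. = + ; */
        p++;
    }
    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}
```

Calling `next_token` repeatedly on the same cursor yields the token stream; a parser would then decide that variable, `=`, expression, `;` together form an assignment statement.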

3. zend_op_array structure

The Zend engine further compiles the AST into a zend_op_array, which is the final product of the compile phase and the input to the execution phase. While compiling the AST, the engine determines which variables the current script defines and numbers them sequentially; at run time a variable is fetched by this number. Values such as a variable's initial value and the names of called functions, classes and constants (collectively called literals) are saved in zend_op_array.literals, and each literal also has a unique number. Execution therefore amounts to calling a different C function for each instruction, and each function processes values located by the numbers of variables, literals and temporary variables.
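The sequential numbering of variables described above can be sketched as a lookup-or-append on a name table. This is a simplified illustration, not the engine's code: `var_table`, `lookup_var` and the fixed `MAX_VARS` capacity are assumptions, and plain C strings stand in for the engine's string type.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 16 /* fixed capacity keeps the sketch short (assumption) */

/* Hypothetical stand-in for a table of variable names plus a counter. */
typedef struct {
    const char *vars[MAX_VARS]; /* variable names, indexed by their number */
    int last_var;               /* number of distinct variables seen so far */
} var_table;

/* Return the number of `name`, assigning the next free number on first sight. */
int lookup_var(var_table *table, const char *name)
{
    int i;

    for (i = 0; i < table->last_var; i++) {
        if (strcmp(table->vars[i], name) == 0) {
            return i; /* already numbered: reuse the same slot */
        }
    }
    assert(table->last_var < MAX_VARS);
    table->vars[table->last_var] = name;
    return table->last_var++;
}
```

With this scheme the first variable met gets number 0, the next new one gets 1, and every later use of the same name resolves to the number it was given, so instructions can refer to variables by index instead of by name.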

The PHP main script is compiled into one zend_op_array, and each function is compiled into its own separate zend_op_array. Seen as the counterpart of a binary program, a zend_op_array contains all the stack information for the current scope, and a function call is in effect a switch between different zend_op_arrays.

Structure of zend_op_array:

    struct _zend_op_array {
        // common: the fields used for fast access to the opcodes of ordinary
        // functions or class member methods; covered in detail when the PHP
        // function implementation is analyzed
        ...

        uint32_t *refcount;

        uint32_t this_var;

        uint32_t last;
        // array of opcode instructions
        zend_op *opcodes;

        // number of variables defined in the PHP code: variables whose op_type
        // is IS_CV, not including IS_TMP_VAR and IS_VAR
        // the value is 0 before compilation and is incremented by 1 each time
        // a new variable is found
        int last_var;
        // number of temporary variables: variables whose op_type is IS_TMP_VAR or IS_VAR
        uint32_t T;
        // array of PHP variable names
        zend_string **vars; // used together with last_var during AST compilation to
                            // determine the number of each variable; a very important step

        ...

        // static variable symbol table: variables declared with static
        HashTable *static_variables;
        ...

        // number of literals
        int last_literal;
        // array of literals (constants): values defined in the PHP code
        zval *literals;

        // size of the run-time cache array
        int cache_size;
        // run-time cache, mainly used to cache some znode_op values for fast
        // data access; described separately later
        void **run_time_cache;

        void *reserved[ZEND_MAX_RESERVED_RESOURCES];
    };

A handler is the C routine that processes an opcode. All handlers are defined in zend_vm_def.h. There are three dispatch modes, CALL, SWITCH and GOTO, and the default is CALL.
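As an illustration of CALL-style dispatch, the sketch below stores a function pointer in every instruction and runs the program by calling each handler in turn. It is a toy model under stated assumptions: the `vm`, `op`, `handle_load`, `handle_add` and `run` names are hypothetical, and the real handlers have a different signature and are generated from zend_vm_def.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature VM; names and instructions are illustrative only. */
typedef struct vm vm;
typedef struct op op;
typedef void (*op_handler)(vm *state, const op *instr);

struct op {
    op_handler handler; /* C function that implements this instruction */
    int operand;
};

struct vm {
    int accumulator;
};

void handle_load(vm *state, const op *instr)
{
    state->accumulator = instr->operand;
}

void handle_add(vm *state, const op *instr)
{
    state->accumulator += instr->operand;
}

/* CALL-style dispatch: each instruction is executed by calling its handler. */
int run(const op *code, size_t count)
{
    vm state = { 0 };
    size_t i;

    assert(code != NULL || count == 0);
    for (i = 0; i < count; i++) {
        code[i].handler(&state, &code[i]);
    }
    return state.accumulator;
}
```

A SWITCH-style loop would instead keep a numeric opcode in each instruction and select the code with one large `switch`; a GOTO-style loop jumps directly between labels. The function-pointer form trades some call overhead for simplicity and portability.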

Each opcode has two operands, which record the key information for the current instruction. An operand is in practice a 32-bit integer, used mainly to store things such as the index position of a variable or a value.

Each operand has one of 5 types: IS_CONST: a literal; IS_TMP_VAR: a temporary variable; IS_VAR: a PHP variable (internal to the engine, as distinct from IS_CV); IS_CV: a variable defined in the PHP script; IS_UNUSED: the operand is not used.
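How an operand's type and number together select a value can be sketched as follows. The `operand`, `frame` and `fetch_operand` names are hypothetical, values are plain `long` rather than zvals, and placing IS_TMP_VAR and IS_VAR in one shared area is a simplifying assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand kinds mirroring the five types listed above. */
typedef enum { OP_CONST, OP_TMP_VAR, OP_VAR, OP_CV, OP_UNUSED } operand_type;

typedef struct {
    operand_type type;
    unsigned int num; /* index into the storage area selected by `type` */
} operand;

typedef struct {
    const long *literals; /* compile-time constants */
    const long *cvs;      /* variables named in the script */
    const long *temps;    /* intermediate results (TMP_VAR and VAR) */
} frame;

/* Resolve an operand to the slot it refers to; NULL means "no operand". */
const long *fetch_operand(const frame *f, operand o)
{
    assert(f != NULL);
    switch (o.type) {
        case OP_CONST:
            return f->literals + o.num; /* base of the literal area + offset */
        case OP_CV:
            return f->cvs + o.num;
        case OP_TMP_VAR:
        case OP_VAR:
            return f->temps + o.num;
        case OP_UNUSED:
            break;
    }
    return NULL;
}
```

The point of the design is that an instruction never carries a name or a value, only a small type tag and an index, so fetching an operand is a single addition on a base pointer.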

PHP code is not compiled directly into machine code, but the design of its compilation and execution mirrors that of a C program: there is a constant area, variables are accessed through offsets, and there is a virtual execution stack.

Values that can be determined at compile time and never change are called literals, also known as constants (IS_CONST). Their zvals are allocated during the compilation phase and saved in the zend_op_array->literals array, and they are read at zend_op_array->literals + offset.

4. Abstract syntax tree (AST) compilation (AST -> zend_op_array)

    ZEND_API zend_op_array *compile_file(zend_file_handle *file_handle, int type)
    {
        zend_op_array *op_array = NULL; // the compiled opcodes
        ...

        if (open_file_for_scanning(file_handle) == FAILURE) { // file open failed
            ...
        } else {
            zend_bool original_in_compilation = CG(in_compilation);
            CG(in_compilation) = 1;

            CG(ast) = NULL;
            CG(ast_arena) = zend_arena_create(1024 * 32);
            if (!zendparse()) { // syntax parsing
                zval retval_zv;
                zend_file_context original_file_context; // saves the original zend_file_context
                // saves the original zend_oparray_context, which records the total size of
                // the opcodes, vars and other arrays of the current zend_op_array during compilation
                zend_oparray_context original_oparray_context;
                zend_op_array *original_active_op_array = CG(active_op_array);
                op_array = emalloc(sizeof(zend_op_array)); // allocate the zend_op_array structure
                init_op_array(op_array, ZEND_USER_FUNCTION, INITIAL_OP_ARRAY_SIZE); // initialize op_array
                CG(active_op_array) = op_array; // make this the op_array currently being compiled
                ZVAL_LONG(&retval_zv, 1);

                if (zend_ast_process) {
                    zend_ast_process(CG(ast));
                }

                zend_file_context_begin(&original_file_context); // initialize CG(file_context)
                zend_oparray_context_begin(&original_oparray_context); // initialize CG(context)
                zend_compile_top_stmt(CG(ast)); // AST -> zend_op_array compilation
                zend_emit_final_return(&retval_zv); // set the final return value
                op_array->line_start = 1;
                op_array->line_end = CG(zend_lineno);
                pass_two(op_array);
                zend_oparray_context_end(&original_oparray_context);
                zend_file_context_end(&original_file_context);

                CG(active_op_array) = original_active_op_array;
            }
            ...
        }
        ...

        return op_array;
    }

compile_file() saves several original values because it is not executed just once during the run of a PHP script: it is called first when the main script is executed and again for each include and require. The current values therefore have to be saved before compilation and restored afterwards.
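The save-and-restore pattern can be sketched with a single global pointer. This is a simplified model, not the engine's code: `unit`, `active_unit` and `compile_unit` are hypothetical names, and the nested file is compiled directly from inside the outer compilation here purely to show the re-entrancy problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compilation unit; not an engine structure. */
typedef struct unit {
    const char *filename;
    struct unit *include; /* file pulled in by include/require, or NULL */
    int compiled;
} unit;

/* Global state shared by every call, playing the role of a compiler global. */
unit *active_unit = NULL;

void compile_unit(unit *u)
{
    unit *original_active = active_unit; /* save whatever was being compiled */

    assert(u != NULL);
    active_unit = u;
    if (u->include != NULL) {
        compile_unit(u->include); /* the nested call overwrites active_unit */
        assert(active_unit == u); /* and must have restored it on return */
    }
    u->compiled = 1;

    active_unit = original_active; /* restore for our own caller */
}
```

Without the saved copy, the outer call would continue with the global still pointing at the included file, which is the situation the saved `original_*` values guard against.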

Compilation from AST to zend_op_array is done in zend_compile_top_stmt(), which is the overall entry point and is called recursively many times:

    // zend_compile.c
    void zend_compile_top_stmt(zend_ast *ast)
    {
        if (!ast) {
            return;
        }

        if (ast->kind == ZEND_AST_STMT_LIST) { // always this type on the first call
            zend_ast_list *list = zend_ast_get_list(ast);
            uint32_t i;
            for (i = 0; i < list->children; ++i) {
                // the child statements of the list are independent of each other
                // and are compiled recursively
                zend_compile_top_stmt(list->child[i]);
            }
            return;
        }

        // compilation entry point for each statement
        zend_compile_stmt(ast);

        if (ast->kind != ZEND_AST_NAMESPACE && ast->kind != ZEND_AST_HALT_COMPILER) {
            zend_verify_namespace();
        }
        // handling of the function and class cases, a critical step; analyzed in
        // detail in the later chapters on function and class implementation
        if (ast->kind == ZEND_AST_FUNC_DECL || ast->kind == ZEND_AST_CLASS) {
            CG(zend_lineno) = ((zend_ast_decl *) ast)->end_lineno;
            zend_do_early_binding(); // very important!!!
        }
    }

Compilation starts from the root node of the AST, whose type is ZEND_AST_STMT_LIST. This type indicates that the current node has several independent child nodes, each child being the node generated for one separate statement. The children are therefore compiled in turn, descending until a node that actually needs compiling (a node that is not ZEND_AST_STMT_LIST) is reached, and zend_compile_stmt is then called to compile that node:

    void zend_compile_stmt(zend_ast *ast)
    {
        CG(zend_lineno) = ast->lineno;

        switch (ast->kind) {
            case xxx:
                ...
                break;
            case ZEND_AST_ECHO:
                zend_compile_echo(ast);
                break;
            ...
            default:
            {
                znode result;
                zend_compile_expr(&result, ast);
                zend_do_free(&result);
            }
        }

        if (FC(declarables).ticks && !zend_is_unticked_stmt(ast)) {
            zend_emit_tick();
        }
    }

The processing differs depending on the node type (kind).
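The two-level structure, a recursive walk over statement lists plus a per-statement switch on the node kind, can be reproduced in a self-contained toy. The `ast`, `op_list`, `compile_top_stmt` and `compile_stmt` names below are hypothetical, and the "instruction" emitted for each statement is simply its node kind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds; the engine's AST has many more. */
typedef enum { AST_STMT_LIST, AST_ECHO, AST_ASSIGN } ast_kind;

typedef struct ast {
    ast_kind kind;
    size_t children;    /* used only by AST_STMT_LIST */
    struct ast **child; /* used only by AST_STMT_LIST */
} ast;

#define MAX_OPS 32 /* fixed capacity keeps the sketch short (assumption) */

typedef struct {
    ast_kind ops[MAX_OPS]; /* one toy "instruction" per compiled statement */
    size_t last;
} op_list;

/* Per-statement entry point: choose what to emit from the node kind. */
void compile_stmt(op_list *out, const ast *node)
{
    assert(out->last < MAX_OPS);
    switch (node->kind) {
        case AST_ECHO:
        case AST_ASSIGN:
            out->ops[out->last++] = node->kind;
            break;
        default:
            break; /* other kinds are ignored in this sketch */
    }
}

/* Overall entry point: descend through lists, compile everything else. */
void compile_top_stmt(op_list *out, const ast *node)
{
    size_t i;

    if (node == NULL) {
        return;
    }
    if (node->kind == AST_STMT_LIST) {
        for (i = 0; i < node->children; i++) {
            compile_top_stmt(out, node->child[i]); /* children are independent */
        }
        return;
    }
    compile_stmt(out, node);
}
```

Because lists may nest, the recursion flattens the tree into one linear instruction sequence in source order, which is the shape an op array needs.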

The final result of compilation is the zend_op_array, and the core of the work is the compilation of the AST. A key operation of the compile phase is determining the memory numbers of the various variables, intermediate values, temporary values, return values and literals. This is very important, because these numbers are used again when the execution process is described later.
