Understanding the Zend VM engine, starting from PHP syntactic sugar

Source: Internet
Author: User
Tags: vars

1.

Let's start with a piece of syntactic sugar available since PHP 5.3. This is how we usually write a ternary:

    <?php
    $a = 0;
    $b = $a ? $a : 1;

With the shorthand, it can be written like this:

    <?php
    $a = 0;
    $b = $a ?: 1;

In both cases $b = 1. The second form is more concise, but piling on syntactic sugar is usually not a good idea, especially when the forms are easy to confuse with each other, such as the ?? operator newly added in PHP 7:

    <?php
    $b = $a ?? 1;

This is equivalent to:

    <?php
    $b = isset($a) ? $a : 1;

Do you find ?: and ?? easy to confuse? If so, I suggest not using them at all: readable, maintainable code matters more.

Syntactic sugar is not the focus of this article. The aim is to use it as a starting point for explaining how the Zend VM works.

2.

The analysis is based on the PHP source branch remotes/origin/php-5.6.14. For how to view opcodes with VLD, see my earlier article:
http://www.yinqisen.cn/blog-680.html

    <?php
    $a = 0;
    $b = $a ?: 1;

The corresponding opcodes are as follows:

    number of ops:  5
    compiled vars:  !0 = $a, !1 = $b
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   ASSIGN                                                   !0, 0
       3     1        JMP_SET_VAR                                      $      !0
             2        QM_ASSIGN_VAR                                    $      1
             3        ASSIGN                                                   !1, $
       4     4      > RETURN                                                   1

    branch: #  0; line:     2-    4; sop:     0; eop:     4; out1:  -2
    path #1: 0,

vim Zend/zend_language_parser.y +834

    |   expr '?' ':' { zend_do_jmp_set(&$1, &$2, &$3 TSRMLS_CC); }
        expr         { zend_do_jmp_set_else(&$$, &$5, &$2, &$3 TSRMLS_CC); }

If you like, you can try redefining the syntax of the ?: sugar yourself. The grammar is written as BNF rules and parsed with Bison; if you are interested, search for the related material to learn more.

Judging from the opcodes shown by VLD, zend_do_jmp_set_else is executed. Its code is in Zend/zend_compile.c:

    void zend_do_jmp_set_else(znode *result, const znode *false_value, const znode *jmp_token, const znode *colon_token TSRMLS_DC)
    {
        zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

        SET_NODE(opline->result, colon_token);
        if (colon_token->op_type == IS_TMP_VAR) {
            if (false_value->op_type == IS_VAR || false_value->op_type == IS_CV) {
                CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].opcode = ZEND_JMP_SET_VAR;
                CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].result_type = IS_VAR;
                opline->opcode = ZEND_QM_ASSIGN_VAR;
                opline->result_type = IS_VAR;
            } else {
                opline->opcode = ZEND_QM_ASSIGN;
            }
        } else {
            opline->opcode = ZEND_QM_ASSIGN_VAR;
        }
        opline->extended_value = 0;
        SET_NODE(opline->op1, false_value);
        SET_UNUSED(opline->op2);
        GET_NODE(result, opline->result);

        CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].op2.opline_num = get_next_op_number(CG(active_op_array));

        DEC_BPC(CG(active_op_array));
    }
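
The second-to-last statement of zend_do_jmp_set_else goes back to the jump op emitted earlier and fills in its target, which is only known once the fallback op has been emitted. This is commonly called backpatching. The following is a minimal sketch of that idea in plain C, assuming that the next op number is simply the count of ops emitted so far; every bp_* name is hypothetical and not part of the Zend source.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not Zend source: every bp_* name is hypothetical. */
enum bp_opcode { BP_ASSIGN, BP_JMP_SET, BP_QM_ASSIGN, BP_RETURN };

struct bp_op {
    enum bp_opcode opcode;
    size_t jump_target;            /* only meaningful for BP_JMP_SET */
};

struct bp_op_array {
    struct bp_op ops[16];
    size_t last;                   /* number of ops emitted so far */
};

/* Append an op and return its index (its "opline number"). */
static size_t bp_emit(struct bp_op_array *a, enum bp_opcode opcode)
{
    assert(a->last < sizeof a->ops / sizeof a->ops[0]);
    a->ops[a->last].opcode = opcode;
    a->ops[a->last].jump_target = 0;
    return a->last++;
}

/* Emit ops for `$a = 0; $b = $a ?: 1;` and return the index of the jump. */
static size_t bp_compile(struct bp_op_array *a)
{
    size_t jmp;

    bp_emit(a, BP_ASSIGN);          /* $a = 0 */
    jmp = bp_emit(a, BP_JMP_SET);   /* target not known yet */
    bp_emit(a, BP_QM_ASSIGN);       /* fallback value 1 */

    /* Backpatch: the jump skips the fallback, so it targets the op
       that will be emitted next. */
    a->ops[jmp].jump_target = a->last;

    bp_emit(a, BP_ASSIGN);          /* $b = result */
    bp_emit(a, BP_RETURN);
    return jmp;
}
```

Calling bp_compile on an empty bp_op_array produces five ops, with the jump at index 1 pointing at index 3, which mirrors the layout of the VLD listing above.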

3.

Let's focus on two opcodes, ZEND_JMP_SET_VAR and ZEND_QM_ASSIGN_VAR. How do we continue reading the code from here? First, some background on PHP opcodes.

PHP 5.6 has 167 opcodes, which means the VM can perform 167 different kinds of operations. The official list is at http://php.net/manual/en/internals2.opcodes.list.php

Internally, PHP uses the _zend_op structure to represent an opcode: vim Zend/zend_compile.h +111

    struct _zend_op {
        opcode_handler_t handler;
        znode_op op1;
        znode_op op2;
        znode_op result;
        ulong extended_value;
        uint lineno;
        zend_uchar opcode;
        zend_uchar op1_type;
        zend_uchar op2_type;
        zend_uchar result_type;
    };

PHP 7.0 differs slightly. The main difference is that uint is replaced by uint32_t, so the size in bytes is stated explicitly, which matters on 64-bit systems.

Think of an opcode as a calculator: it accepts only two operands (op1, op2), performs one operation (the handler, for example subtraction), and hands a result back to you, sometimes with a little extra information (extended_value), much like an arithmetic overflow flag.

The Zend VM works in exactly the same way for every opcode: each one has a handler (a function pointer) holding the address of its handler function. That is a C function containing the code that executes the opcode. It takes op1 and op2 as its parameters, and when it finishes it produces a result (result) and sometimes attaches an extra piece of information (extended_value).
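
To make the calculator analogy concrete, here is a minimal sketch in plain C. It is not the Zend structure: the calc_* names are hypothetical, operands are plain ints, and extended_value is used as an overflow flag purely to follow the analogy. It also assumes long long is wider than int, which holds on common platforms.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Toy model of the idea behind struct _zend_op; calc_* names are hypothetical. */
struct calc_op;
typedef void (*calc_handler_t)(struct calc_op *op);

struct calc_op {
    calc_handler_t handler;   /* function that performs the operation */
    int op1;
    int op2;
    int result;
    int extended_value;       /* extra information; here an overflow flag */
};

/* Store the exact result if it fits in int, otherwise raise the flag.
   Assumes long long is wider than int. */
static void calc_store(struct calc_op *op, long long wide)
{
    op->extended_value = (wide > INT_MAX || wide < INT_MIN);
    op->result = op->extended_value ? 0 : (int)wide;
}

static void calc_add(struct calc_op *op)
{
    calc_store(op, (long long)op->op1 + op->op2);
}

static void calc_sub(struct calc_op *op)
{
    calc_store(op, (long long)op->op1 - op->op2);
}

/* Execute one op through its handler pointer; the caller does not
   need to know which operation it is. */
static int calc_run(struct calc_op *op)
{
    assert(op != NULL && op->handler != NULL);
    op->handler(op);
    return op->result;
}
```

An op initialised as { calc_sub, 7, 2, 0, 0 } leaves 5 in result after calc_run, and the caller only ever goes through the handler pointer, which is the property the real structure relies on.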

Take the opcode from our example, ZEND_JMP_SET_VAR: vim Zend/zend_vm_def.h +4995

    ZEND_VM_HANDLER(158, ZEND_JMP_SET_VAR, CONST|TMP|VAR|CV, ANY)
    {
        USE_OPLINE
        zend_free_op free_op1;
        zval *value, *ret;

        SAVE_OPLINE();
        value = GET_OP1_ZVAL_PTR(BP_VAR_R);

        if (i_zend_is_true(value)) {
            if (OP1_TYPE == IS_VAR || OP1_TYPE == IS_CV) {
                Z_ADDREF_P(value);
                EX_T(opline->result.var).var.ptr = value;
                EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
            } else {
                ALLOC_ZVAL(ret);
                INIT_PZVAL_COPY(ret, value);
                EX_T(opline->result.var).var.ptr = ret;
                EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
                if (!IS_OP1_TMP_FREE()) {
                    zval_copy_ctor(EX_T(opline->result.var).var.ptr);
                }
            }
            FREE_OP1_IF_VAR();
    #if DEBUG_ZEND>=2
            printf("Conditional jmp to %d\n", opline->op2.opline_num);
    #endif
            ZEND_VM_JMP(opline->op2.jmp_addr);
        }

        FREE_OP1();
        CHECK_EXCEPTION();
        ZEND_VM_NEXT_OPCODE();
    }

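i_zend_is_true decides whether the operand is truthy, so ZEND_JMP_SET_VAR is a conditional assignment: if the operand is truthy, it becomes the result and execution jumps past the fallback op; otherwise execution falls through to the next opcode. I believe that much is clear. Now for the important part.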
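
The control flow of the handler can be reduced to a few lines. The sketch below is a simplification under stated assumptions: values are plain ints, truthiness is "non-zero" as a stand-in for i_zend_is_true, and all elvis_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy version of the JMP_SET idea; elvis_* names are hypothetical.
   Returns the index of the op to run next. */
static size_t elvis_jmp_set(int value, int *result, size_t next, size_t target)
{
    if (value != 0) {          /* stands in for i_zend_is_true() */
        *result = value;       /* keep the truthy operand as the result */
        return target;         /* skip the fallback op */
    }
    return next;               /* fall through to the fallback op */
}

/* Evaluate `a ?: fallback` as two ops: op 0 is the conditional jump,
   op 1 the fallback assignment, op 2 whatever follows. */
static int elvis_eval(int a, int fallback)
{
    int result = 0;
    size_t pc = elvis_jmp_set(a, &result, 1, 2);

    if (pc == 1) {             /* the QM_ASSIGN-style fallback op */
        result = fallback;
        pc = 2;
    }
    assert(pc == 2);
    return result;
}
```

With a equal to 0 the jump is not taken and the fallback is assigned; with any non-zero a the fallback op is skipped entirely.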

Note that zend_vm_def.h is not a C header that gets compiled directly; it is better described as a template. The header that is actually compiled is zend_vm_execute.h (a file of more than 45,000 lines). It is not written by hand but generated by the PHP script zend_vm_gen.php, which parses zend_vm_def.h. (An interesting chicken-and-egg question: without PHP, where would this script run?) My guess is that this is a later addition and that early PHP versions did not use it.

From the ZEND_JMP_SET_VAR code above, one handler function is generated for each of the operand types CONST|TMP|VAR|CV. They differ in type but are functionally equivalent:

    static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_TMP_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
    static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)

The purpose is to fix the handler at compile time and so improve run-time performance. The alternative, choosing the code path by operand type at run time, would also work, but it would be slower. The specialization sometimes produces code that looks useless, but there is no need to worry: the C compiler optimizes it further.
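
The trade-off can be sketched as follows. This is an illustration of the general technique, not the generated Zend code: the spec_* names are hypothetical, and only two operand types get a dedicated handler while the others share a generic one.

```c
#include <assert.h>

/* Toy sketch of handler specialization; spec_* names are hypothetical. */
enum spec_type { SPEC_CONST, SPEC_TMP, SPEC_VAR, SPEC_CV, SPEC_TYPE_COUNT };

struct spec_op;
typedef int (*spec_handler_t)(const struct spec_op *op, const int *cvs);

struct spec_op {
    spec_handler_t handler;
    enum spec_type op1_type;
    int op1;                 /* a literal for CONST, a slot index for CV */
};

/* Specialized handlers: no operand-type test inside. */
static int spec_load_const(const struct spec_op *op, const int *cvs)
{
    (void)cvs;
    return op->op1;
}

static int spec_load_cv(const struct spec_op *op, const int *cvs)
{
    return cvs[op->op1];
}

/* Generic handler: decides by operand type on every execution. */
static int spec_load_generic(const struct spec_op *op, const int *cvs)
{
    switch (op->op1_type) {
    case SPEC_CV: return cvs[op->op1];
    default:      return op->op1;
    }
}

/* "Compile time": pick the handler once, based on the operand type. */
static void spec_bind_handler(struct spec_op *op)
{
    static const spec_handler_t table[SPEC_TYPE_COUNT] = {
        [SPEC_CONST] = spec_load_const,
        [SPEC_TMP]   = spec_load_generic,
        [SPEC_VAR]   = spec_load_generic,
        [SPEC_CV]    = spec_load_cv,
    };
    assert(op->op1_type < SPEC_TYPE_COUNT);
    op->handler = table[op->op1_type];
}
```

After spec_bind_handler has run once per op, executing the op is a single indirect call with no type test, which is the effect the generated *_SPEC_*_HANDLER functions aim for.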

zend_vm_gen.php also accepts some options; they are described in detail in the README file Zend/README.ZEND_VM in the PHP source.

4.

By now we know how an opcode maps to its handler. But there is still the overall process to consider: once the syntax has been parsed, how are all the opcodes chained together?

I will skip the details of parsing. After parsing there is one large array holding all the opcodes (given how it is traversed, thinking of it as a linked list may be more apt). From the code above we can see that each handler, when it finishes, calls ZEND_VM_NEXT_OPCODE() to fetch the next opcode, and execution continues until the final exit. The loop is here: vim Zend/zend_vm_execute.h +337

    ZEND_API void execute_ex(zend_execute_data *execute_data TSRMLS_DC)
    {
        DCL_OPLINE
        zend_bool original_in_execution;

        original_in_execution = EG(in_execution);
        EG(in_execution) = 1;

        if (0) {
    zend_vm_enter:
            execute_data = i_create_execute_data_from_op_array(EG(active_op_array), 1 TSRMLS_CC);
        }

        LOAD_REGS();
        LOAD_OPLINE();

        while (1) {
            int ret;
    #ifdef ZEND_WIN32
            if (EG(timed_out)) {
                zend_timeout(0);
            }
    #endif

            if ((ret = OPLINE->handler(execute_data TSRMLS_CC)) > 0) {
                switch (ret) {
                    case 1:
                        EG(in_execution) = original_in_execution;
                        return;
                    case 2:
                        goto zend_vm_enter;
                        break;
                    case 3:
                        execute_data = EG(current_execute_data);
                        break;
                    default:
                        break;
                }
            }
        }
        zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
    }

The macro definitions: vim Zend/zend_execute.c +1772

    #define ZEND_VM_NEXT_OPCODE() \
        CHECK_SYMBOL_TABLES() \
        ZEND_VM_INC_OPCODE(); \
        ZEND_VM_CONTINUE()

    #define ZEND_VM_CONTINUE()         return 0
    #define ZEND_VM_RETURN()           return 1
    #define ZEND_VM_ENTER()            return 2
    #define ZEND_VM_LEAVE()            return 3

The while is an infinite loop that keeps executing handler functions. Apart from a few special cases, most handlers end by calling ZEND_VM_NEXT_OPCODE(), which ends in ZEND_VM_CONTINUE(), that is, return 0, so the loop carries on.

Note: yield (coroutines) is one such exception. It returns 1, which makes the loop return immediately. We will analyze yield separately when there is a chance.
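
The interplay between the loop and the handlers' return codes can be sketched in a few lines of plain C. This is a toy under stated assumptions: the program is just an array of handler pointers, only the codes 0 (continue) and 1 (return) are modelled, and all loop_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy dispatch loop; every loop_* name is hypothetical. */
enum { LOOP_CONTINUE = 0, LOOP_RETURN = 1 };

struct loop_vm;
typedef int (*loop_handler_t)(struct loop_vm *vm);

struct loop_vm {
    const loop_handler_t *opline;  /* points at the op to execute next */
    int acc;                       /* a single register, for demonstration */
};

/* A regular handler: do the work, advance, ask the loop to continue. */
static int loop_inc(struct loop_vm *vm)
{
    vm->acc++;
    vm->opline++;            /* plays the role of ZEND_VM_INC_OPCODE() */
    return LOOP_CONTINUE;    /* plays the role of ZEND_VM_CONTINUE()   */
}

/* A terminating handler: ask the loop to stop. */
static int loop_ret(struct loop_vm *vm)
{
    (void)vm;
    return LOOP_RETURN;      /* plays the role of ZEND_VM_RETURN()     */
}

/* Run handlers until one of them returns a value greater than 0. */
static int loop_execute(const loop_handler_t *ops)
{
    struct loop_vm vm = { ops, 0 };

    while (1) {
        assert(*vm.opline != NULL);
        if ((*vm.opline)(&vm) > 0) {
            return vm.acc;
        }
    }
}
```

Running loop_execute on the program { loop_inc, loop_inc, loop_inc, loop_ret } executes three increments and then leaves the loop when the last handler returns 1.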

I hope that after reading the above you have a detailed picture of how the PHP Zend engine processes code. Based on these principles, let's now talk briefly about PHP optimization.

5. PHP Optimization Considerations

5.1 echo output

    <?php
    $foo = 'foo';
    $bar = 'bar';
    echo $foo . $bar;

The opcodes as shown by VLD:

    number of ops:  5
    compiled vars:  !0 = $foo, !1 = $bar
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   ASSIGN                                                   !0, 'foo'
       3     1        ASSIGN                                                   !1, 'bar'
       4     2        CONCAT                                                   !0, !1
             3        ECHO
       5     4      > RETURN                                                   1

    branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
    path #1: 0,

ZEND_CONCAT joins the values of $foo and $bar, stores the result in a temporary variable, and echo then outputs it. This involves allocating memory for the temporary variable, freeing it after use, and calling the concatenation function to do the joining.

If you write it this way:

    <?php
    $foo = 'foo';
    $bar = 'bar';
    echo $foo, $bar;

The corresponding opcodes:

    number of ops:  5
    compiled vars:  !0 = $foo, !1 = $bar
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   ASSIGN                                                   !0, 'foo'
       3     1        ASSIGN                                                   !1, 'bar'
       4     2        ECHO                                                     !0
             3        ECHO                                                     !1
       5     4      > RETURN                                                   1

    branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
    path #1: 0,

No memory has to be allocated and no concatenation function has to run, so this form is more efficient. If you want to understand the concatenation itself, use what this article has covered to find the handler for the ZEND_CONCAT opcode yourself; it does quite a lot of work.
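
The difference between the two forms can be illustrated outside PHP. The sketch below is an analogy in plain C, not the engine's implementation: echo_sink is a hypothetical stand-in for the output layer, and temp_allocs simply counts the temporary strings that had to be allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy sketch; echo_* names are hypothetical. */
struct echo_sink {
    char buf[64];
    size_t len;
    int temp_allocs;   /* how many temporary strings were allocated */
};

/* Append a string to the sink. */
static void echo_write(struct echo_sink *s, const char *str)
{
    size_t n = strlen(str);
    assert(s->len + n < sizeof s->buf);
    memcpy(s->buf + s->len, str, n + 1);
    s->len += n;
}

/* Like `echo $a . $b;`: build a temporary, write it, free it. */
static void echo_concat(struct echo_sink *s, const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    char *tmp = malloc(la + lb + 1);

    if (tmp == NULL) {
        return;
    }
    s->temp_allocs++;
    memcpy(tmp, a, la);
    memcpy(tmp + la, b, lb + 1);
    echo_write(s, tmp);
    free(tmp);
}

/* Like `echo $a, $b;`: write each operand directly. */
static void echo_list(struct echo_sink *s, const char *a, const char *b)
{
    echo_write(s, a);
    echo_write(s, b);
}
```

Both functions leave the same text in the sink, but only echo_concat pays for an allocation, a copy of both operands, and a free.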

5.2 define() and const

The const keyword was introduced in PHP 5.3. It is quite different from define() and is closer in meaning to #define in C.

define() is a function call, with the overhead of a function call.

const is a keyword that is compiled directly into an opcode; it is determined at compile time and does not need to be allocated dynamically during execution.

The value of a const is fixed and cannot be changed at run time, hence the similarity to C's #define: it is defined at compile time, and its value is limited to scalar types.

Let's look at the code and compare the opcodes.

define example:

    <?php
    define('FOO', 'foo');
    echo FOO;

define opcodes:

    number of ops:  6
    compiled vars:  none
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   SEND_VAL                                                 'FOO'
             1        SEND_VAL                                                 'foo'
             2        DO_FCALL                                      2          'define'
       3     3        FETCH_CONSTANT                                   ~       'FOO'
             4        ECHO                                                     ~
       4     5      > RETURN                                                   1

const example:

    <?php
    const FOO = 'foo';
    echo FOO;

const opcodes:

    number of ops:  4
    compiled vars:  none
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   DECLARE_CONST                                            'FOO', 'foo'
       3     1        FETCH_CONSTANT                                   ~0      'FOO'
             2        ECHO                                                     ~0
       4     3      > RETURN                                                   1

5.3 The cost of dynamic functions

    <?php
    function foo() {}
    foo();

The corresponding opcodes:

    number of ops:  3
    compiled vars:  none
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   NOP
       3     1        DO_FCALL                                      0          'foo'
       4     2      > RETURN                                                   1

Code for a dynamic call:

    <?php
    function foo() {}
    $a = 'foo';
    $a();

Opcodes:

    number of ops:  5
    compiled vars:  !0 = $a
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   NOP
       3     1        ASSIGN                                                   !0, 'foo'
       4     2        INIT_FCALL_BY_NAME                                       !0
             3        DO_FCALL_BY_NAME                              0
       5     4      > RETURN                                                   1

You can run vim Zend/zend_vm_def.h +2630 to see what INIT_FCALL_BY_NAME does; the code is too long to list here. Dynamic features are convenient, but they always cost some performance, so weigh the pros and cons before using them.
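
As a rough analogy in plain C (not a description of how the Zend engine resolves calls), compare a call whose target is written in the code with a call whose target has to be looked up by name at run time. All call_* names are hypothetical, and the lookup is a linear scan purely to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy sketch; call_* names are hypothetical. */
typedef int (*call_fn_t)(void);

static int call_foo(void)
{
    return 42;
}

struct call_entry {
    const char *name;
    call_fn_t fn;
};

static const struct call_entry call_table[] = {
    { "foo", call_foo },
};

/* Static call: the target is written in the code. */
static int call_static(void)
{
    return call_foo();
}

/* Dynamic call, step 1: resolve the name, which is only known at run time. */
static call_fn_t call_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof call_table / sizeof call_table[0]; i++) {
        if (strcmp(call_table[i].name, name) == 0) {
            return call_table[i].fn;
        }
    }
    return NULL;
}

/* Dynamic call, step 2: invoke whatever the lookup returned. */
static int call_dynamic(const char *name)
{
    call_fn_t fn = call_lookup(name);

    assert(fn != NULL);
    return fn();
}
```

Both call_static() and call_dynamic("foo") return the same value, but the dynamic path does extra work on every call and can fail at run time if the name does not exist.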

5.4 The cost of deferred class declaration

Again, let's look at the code first:

    <?php
    class Bar {}
    class Foo extends Bar {}

The corresponding opcodes:

    number of ops:  4
    compiled vars:  none
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   NOP
       3     1        NOP
             2        NOP
       4     3      > RETURN

Now swap the order of the declarations:

    <?php
    class Foo extends Bar {}
    class Bar {}

The corresponding opcodes:

    number of ops:  4
    compiled vars:  none
    line     #* E I O op                           fetch          ext  return  operands
    -------------------------------------------------------------------------------------
       2     0  E >   FETCH_CLASS                                   0  :0      'Bar'
             1        DECLARE_INHERITED_CLASS                                  '%00foo%2fusers%2fqisen%2ftmp%2fvld.php0x103d58020', 'foo'
       3     2        NOP
       4     3      > RETURN                                                   1

In a strongly typed, compiled language the second version would be a compile error. PHP, being a dynamic language, instead defers the class declaration to run time, and if you are not careful this is an easy landmine to step on.

So, now that we understand how the Zend VM works, we should take care to use dynamic features sparingly and avoid them entirely when they are not needed.
