Understanding the Zend VM engine through PHP syntactic sugar

## 1.

Let's start with a piece of syntactic sugar available since PHP 5.3. We usually write a ternary like this:

<?php
$a = 0;
$b = $a ? $a : 1;

With the syntactic sugar, it can be shortened to:

<?php
$a = 0;
$b = $a ?: 1;

Both versions give $b = 1, and the second is more concise. Still, syntactic sugar is best used sparingly, especially forms that are easy to confuse with one another, such as the `??` operator added in PHP 7:

<?php
$b = $a ?? 1;

This is equivalent to:

<?php
$b = isset($a) ? $a : 1;

Do you find `?:` and `??` easy to confuse? If so, I suggest not using them at all: readable, maintainable code matters more.
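The difference between the two operators can be sketched in a few lines of C. This is only a model of the decision rule, not engine code: a PHP variable is represented as an `int` pointer where `NULL` stands for "unset or null", and the names `elvis`, `coalesce` and `demo_operators` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: NULL means "unset or null", anything else is a value. */

/* $a ?: $fallback  keeps $a only when it is truthy. */
int elvis(const int *a, int fallback)
{
    return (a != NULL && *a != 0) ? *a : fallback;
}

/* $a ?? $fallback  keeps $a whenever it is set, even if it is falsy. */
int coalesce(const int *a, int fallback)
{
    return (a != NULL) ? *a : fallback;
}

/* Usage: the operators disagree exactly when $a is set but falsy. */
void demo_operators(void)
{
    int zero = 0;
    assert(elvis(&zero, 1) == 1);     /* 0 is falsy, so the fallback wins */
    assert(coalesce(&zero, 1) == 0);  /* 0 is set, so it is kept          */
    assert(coalesce(NULL, 1) == 1);   /* unset, so the fallback wins      */
}
```

In real PHP, `?:` on an undefined variable also raises a notice, which this model does not show.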

Syntactic sugar is not the focus of this article. We only use it as a starting point for discussing how the Zend VM parses and executes code.

## 2.

The analysis is based on the PHP source branch remotes/origin/php-5.6.14. For how to view opcodes with VLD, see my earlier article on the subject. Take this script:

<?php
$a = 0;
$b = $a ?: 1;

The corresponding opcodes are as follows:

number of ops:  5
compiled vars:  !0 = $a, !1 = $b
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                       !0, 0
   3     1        JMP_SET_VAR                          $1      !0
         2        QM_ASSIGN_VAR                        $1      1
         3        ASSIGN                                       !1, $1
   4     4      > RETURN                                       1

branch: #  0; line:     2-    4; sop:     0; eop:     4; out1:  -2
path #1: 0,

vim Zend/zend_language_parser.y +834

~~~.bash
|   expr '?' ':' { zend_do_jmp_set(&$1, &$2, &$3 TSRMLS_CC); }
    expr     { zend_do_jmp_set_else(&$$, &$5, &$2, &$3 TSRMLS_CC); }
~~~

If you like, you can redefine the `?:` syntactic sugar yourself at this point. The grammar follows BNF rules and is processed by Bison; if you are interested, search for the related material to learn more.

The VLD opcodes show that zend_do_jmp_set_else was executed. Its code is in Zend/zend_compile.c:

~~~.c
void zend_do_jmp_set_else(znode *result, const znode *false_value, const znode *jmp_token, const znode *colon_token TSRMLS_DC)
{
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    SET_NODE(opline->result, colon_token);
    if (colon_token->op_type == IS_TMP_VAR) {
        if (false_value->op_type == IS_VAR || false_value->op_type == IS_CV) {
            CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].opcode = ZEND_JMP_SET_VAR;
            CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].result_type = IS_VAR;
            opline->opcode = ZEND_QM_ASSIGN_VAR;
            opline->result_type = IS_VAR;
        } else {
            opline->opcode = ZEND_QM_ASSIGN;
        }
    } else {
        opline->opcode = ZEND_QM_ASSIGN_VAR;
    }
    opline->extended_value = 0;
    SET_NODE(opline->op1, false_value);
    SET_UNUSED(opline->op2);

    GET_NODE(result, opline->result);

    CG(active_op_array)->opcodes[jmp_token->u.op.opline_num].op2.opline_num = get_next_op_number(CG(active_op_array));

    DEC_BPC(CG(active_op_array));
}
~~~

## 3.

Focus on two opcodes, ZEND_JMP_SET_VAR and ZEND_QM_ASSIGN_VAR. How do we read on from here? First, some background on PHP opcodes.

PHP 5.6 has 167 opcodes, which means it can perform 167 different operations; see the official documentation for the full list.
Internally, PHP uses the _zend_op structure to represent an opcode: vim Zend/zend_compile.h +111

struct _zend_op {
    opcode_handler_t handler;
    znode_op op1;
    znode_op op2;
    znode_op result;
    ulong extended_value;
    uint lineno;
    zend_uchar opcode;
    zend_uchar op1_type;
    zend_uchar op2_type;
    zend_uchar result_type;
};

PHP 7.0 is slightly different. The main change is that uint32_t is used in place of uint, so the field width is stated explicitly on 64-bit systems.

Think of an opcode as a calculator: it accepts only two operands (op1, op2), performs one operation (the handler, for example subtraction), and returns a result to you, sometimes with a little extra information (extended_value).

The Zend VM works the same way for every opcode. Each one has a handler (a function pointer) that holds the address of its handler function. This is a C function containing the code that executes the opcode; it takes op1 and op2 as parameters and, when it is done, returns a result (result) and sometimes attaches a piece of extra information (extended_value).
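The calculator analogy can be sketched in plain C. This is a toy illustration under simplifying assumptions, not the engine's data structures: operands are plain `long` values rather than `znode_op`, and the names `toy_op`, `toy_sub_handler` and `toy_execute` are invented for this example.

```c
#include <assert.h>

/* Toy stand-ins for the engine types; every name here is hypothetical. */
typedef struct toy_op toy_op;
typedef int (*toy_handler_t)(toy_op *op);

struct toy_op {
    toy_handler_t handler;         /* function that performs the operation */
    long op1;                      /* first operand                        */
    long op2;                      /* second operand                       */
    long result;                   /* filled in by the handler             */
    unsigned long extended_value;  /* optional extra information           */
};

/* Handler for a toy "subtract" opcode: result = op1 - op2. */
int toy_sub_handler(toy_op *op)
{
    op->result = op->op1 - op->op2;
    return 0;  /* 0 means "carry on with the next opcode" */
}

/* Executing an opcode is just a call through its function pointer. */
long toy_execute(toy_op *op)
{
    assert(op->handler != 0);
    op->handler(op);
    return op->result;
}
```

With `toy_op op = { toy_sub_handler, 7, 2, 0, 0 };`, calling `toy_execute(&op)` runs the subtraction handler and leaves 5 in `op.result`.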

Take ZEND_JMP_SET_VAR from our example: vim Zend/zend_vm_def.h +4995

ZEND_VM_HANDLER(158, ZEND_JMP_SET_VAR, CONST|TMP|VAR|CV, ANY)
{
    USE_OPLINE
    zend_free_op free_op1;
    zval *value, *ret;

    SAVE_OPLINE();
    value = GET_OP1_ZVAL_PTR(BP_VAR_R);

    if (i_zend_is_true(value)) {
        if (OP1_TYPE == IS_VAR || OP1_TYPE == IS_CV) {
            Z_ADDREF_P(value);
            EX_T(opline->result.var).var.ptr = value;
            EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
        } else {
            ALLOC_ZVAL(ret);
            INIT_PZVAL_COPY(ret, value);
            EX_T(opline->result.var).var.ptr = ret;
            EX_T(opline->result.var).var.ptr_ptr = &EX_T(opline->result.var).var.ptr;
            if (!IS_OP1_TMP_FREE()) {
                zval_copy_ctor(EX_T(opline->result.var).var.ptr);
            }
        }
        FREE_OP1_IF_VAR();
#if DEBUG_ZEND>=2
        printf("Conditional jmp to %d\n", opline->op2.opline_num);
#endif
        ZEND_VM_JMP(opline->op2.jmp_addr);
    }

    FREE_OP1();
    CHECK_EXCEPTION();
    ZEND_VM_NEXT_OPCODE();
}

i_zend_is_true tests whether the operand is true, so ZEND_JMP_SET_VAR is a conditional assignment; that much should be clear from the code. What deserves more attention is how this handler definition becomes compiled code.
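To make the conditional assignment concrete, here is a toy interpreter that runs the same five-op sequence VLD printed for `$b = $a ?: 1`. It is a sketch under heavy simplification: values are plain integers stored in numbered slots, a nonzero test stands in for `i_zend_is_true`, and all names (`toy_op`, `run_elvis_program`, the `OP_` and `SLOT_` constants) are hypothetical.

```c
#include <assert.h>

enum { OP_ASSIGN, OP_JMP_SET, OP_QM_ASSIGN, OP_RETURN };
enum { SLOT_A, SLOT_B, SLOT_TMP, SLOT_COUNT };

typedef struct {
    int opcode;
    int result;   /* slot written by the op                         */
    int operand;  /* a literal, or a slot index when is_slot is set */
    int is_slot;
    int jump_to;  /* only used by OP_JMP_SET                        */
} toy_op;

/* Runs the program for `$a = a_value; $b = $a ?: 1;` and returns $b. */
int run_elvis_program(int a_value)
{
    const toy_op program[] = {
        { OP_ASSIGN,    SLOT_A,   a_value,  0, 0 },  /* 0: $a = a_value         */
        { OP_JMP_SET,   SLOT_TMP, SLOT_A,   1, 3 },  /* 1: truthy? keep, goto 3 */
        { OP_QM_ASSIGN, SLOT_TMP, 1,        0, 0 },  /* 2: otherwise tmp = 1    */
        { OP_ASSIGN,    SLOT_B,   SLOT_TMP, 1, 0 },  /* 3: $b = tmp             */
        { OP_RETURN,    0,        0,        0, 0 },  /* 4: done                 */
    };
    const int count = (int)(sizeof program / sizeof program[0]);
    int slots[SLOT_COUNT] = { 0 };
    int pc = 0;

    for (;;) {
        assert(pc >= 0 && pc < count);
        const toy_op *op = &program[pc];
        int value = op->is_slot ? slots[op->operand] : op->operand;

        switch (op->opcode) {
        case OP_JMP_SET:
            if (value != 0) {              /* stands in for i_zend_is_true */
                slots[op->result] = value;
                pc = op->jump_to;          /* skip the fallback assignment */
                continue;
            }
            break;                         /* falsy: go on to the next op  */
        case OP_ASSIGN:
        case OP_QM_ASSIGN:
            slots[op->result] = value;
            break;
        case OP_RETURN:
            return slots[SLOT_B];
        }
        pc++;
    }
}
```

When `$a` is 0 the jump is not taken and op 2 supplies the fallback, so `run_elvis_program(0)` yields 1; with a truthy `$a` op 2 is skipped entirely.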

Note that `zend_vm_def.h` is not a C header that gets compiled directly; it is better described as a template. The header that is actually compiled is `zend_vm_execute.h` (a file of more than 45,000 lines). It is not written by hand but generated by the PHP script `zend_vm_gen.php`, which parses `zend_vm_def.h`. (An amusing chicken-and-egg question: without PHP, where would the script run?) My guess is that this is a later addition and that early PHP versions did not work this way.

From the ZEND_JMP_SET_VAR code above, a separate handler function is generated for each operand type in `CONST|TMP|VAR|CV`. The generated handlers are functionally equivalent:

static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_TMP_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
static int ZEND_FASTCALL ZEND_JMP_SET_VAR_SPEC_CV_HANDLER(ZEND_OPCODE_HANDLER_ARGS)

The purpose is to fix the handler at compile time and so improve run-time performance. Choosing by operand type at run time would also work, but it would be slower. This approach does sometimes generate seemingly useless code, but there is no need to worry: the C compiler optimizes it further.
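The idea of specialization can be sketched as follows. This is a simplified illustration with only two operand types, and none of the names (`toy_op`, `is_true_generic`, `is_true_spec_const`, `is_true_spec_cv`, `toy_bind_handler`) come from the PHP source. The generic handler tests the operand type on every execution; the specialized ones have that test removed, and the right one is bound once when the opcode is emitted.

```c
#include <assert.h>

enum { TYPE_CONST, TYPE_CV, TYPE_COUNT };

typedef struct toy_op toy_op;
typedef int (*toy_handler_t)(const toy_op *op, const int *cvs);

struct toy_op {
    toy_handler_t handler;
    int op1_type;
    int op1;  /* a literal for TYPE_CONST, a slot index for TYPE_CV */
};

/* Generic version: inspects the operand type on every execution. */
int is_true_generic(const toy_op *op, const int *cvs)
{
    int value = (op->op1_type == TYPE_CONST) ? op->op1 : cvs[op->op1];
    return value != 0;
}

/* Specialized versions: the type test is already decided. */
int is_true_spec_const(const toy_op *op, const int *cvs)
{
    (void)cvs;
    return op->op1 != 0;
}

int is_true_spec_cv(const toy_op *op, const int *cvs)
{
    return cvs[op->op1] != 0;
}

static const toy_handler_t spec_table[TYPE_COUNT] = {
    is_true_spec_const,
    is_true_spec_cv,
};

/* Done once, when the opcode is emitted: bind the matching handler. */
void toy_bind_handler(toy_op *op)
{
    assert(op->op1_type >= 0 && op->op1_type < TYPE_COUNT);
    op->handler = spec_table[op->op1_type];
}
```

After `toy_bind_handler`, every later execution calls straight through `op->handler` with no per-execution branching on the operand type.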

zend_vm_gen.php also accepts some options. They are explained in detail in the README file `Zend/README.ZEND_VM` in the PHP source.

## 4.

By now we know how opcodes and handlers correspond. But there is still the overall process to consider: once the syntax has been parsed, how are all the opcodes strung together?

I won't go into the details of parsing. After parsing there is one large array holding all the opcodes (calling it a linked list may be more accurate). From the code above we can see that each handler, when it finishes, calls ZEND_VM_NEXT_OPCODE() to fetch the next opcode and continue executing until the final exit. The loop is here, vim Zend/zend_vm_execute.h +337:

~~~.c
ZEND_API void execute_ex(zend_execute_data *execute_data TSRMLS_DC)
{
    DCL_OPLINE
    zend_bool original_in_execution;

    original_in_execution = EG(in_execution);
    EG(in_execution) = 1;

    if (0) {
zend_vm_enter:
        execute_data = i_create_execute_data_from_op_array(EG(active_op_array), 1 TSRMLS_CC);
    }

    LOAD_REGS();
    LOAD_OPLINE();

    while (1) {
        int ret;
#ifdef ZEND_WIN32
        if (EG(timed_out)) {
            zend_timeout(0);
        }
#endif

        if ((ret = OPLINE->handler(execute_data TSRMLS_CC)) > 0) {
            switch (ret) {
                case 1:
                    EG(in_execution) = original_in_execution;
                    return;
                case 2:
                    goto zend_vm_enter;
                    break;
                case 3:
                    execute_data = EG(current_execute_data);
                    break;
                default:
                    break;
            }
        }

    }
    zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
}
~~~

The macro definitions, vim Zend/zend_execute.c +1772

#define ZEND_VM_NEXT_OPCODE() \
    CHECK_SYMBOL_TABLES() \
    ZEND_VM_INC_OPCODE(); \
    ZEND_VM_CONTINUE()

#define ZEND_VM_CONTINUE()         return 0
#define ZEND_VM_RETURN()           return 1
#define ZEND_VM_ENTER()            return 2
#define ZEND_VM_LEAVE()            return 3

The while loop is an infinite loop that executes handler functions. Apart from a few special cases, most handlers end by calling ZEND_VM_NEXT_OPCODE() -> ZEND_VM_CONTINUE(), which returns 0 so that the loop continues.

> Note: yield in coroutines, for example, is an exception. It returns 1, which makes the loop return directly. We will analyze yield separately when there is a chance.
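The return-code protocol of the loop can be imitated with a small dispatch loop. This is a sketch only: it models just the codes 0 (continue) and 1 (return), the "program" is a bare array of function pointers, and the names `toy_vm`, `op_increment`, `op_return` and `toy_run` are made up for this example.

```c
#include <assert.h>

typedef struct toy_vm toy_vm;
typedef int (*toy_handler_t)(toy_vm *vm);

struct toy_vm {
    const toy_handler_t *opline;  /* points at the current handler */
    int counter;
};

#define TOY_CONTINUE 0  /* mirrors ZEND_VM_CONTINUE(): return 0 */
#define TOY_RETURN   1  /* mirrors ZEND_VM_RETURN():   return 1 */

/* An ordinary handler: do the work, advance, ask the loop to continue. */
int op_increment(toy_vm *vm)
{
    vm->counter++;
    vm->opline++;        /* plays the role of ZEND_VM_INC_OPCODE() */
    return TOY_CONTINUE;
}

/* The last handler: tells the loop to stop. */
int op_return(toy_vm *vm)
{
    (void)vm;
    return TOY_RETURN;
}

/* The main loop: call the current handler and act on its return code. */
int toy_run(const toy_handler_t *program)
{
    toy_vm vm = { program, 0 };

    while (1) {
        assert(*vm.opline != 0);
        int ret = (*vm.opline)(&vm);
        if (ret > 0) {
            switch (ret) {
            case TOY_RETURN:
                return vm.counter;
            default:
                break;
            }
        }
    }
}
```

Given the program `{ op_increment, op_increment, op_increment, op_return }`, the loop calls four handlers and `toy_run` hands back a counter of 3.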

I hope that, having read this far, you have a detailed picture of how the Zend engine parses and executes PHP. Building on these principles, let's now briefly discuss PHP optimization.

## 5. PHP Optimization Considerations

### 5.1 echo output

<?php
$foo = 'foo';
$bar = 'bar';
echo $foo . $bar;

View the opcodes with VLD:

number of ops:  5
compiled vars:  !0 = $foo, !1 = $bar
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                       !0, 'foo'
   3     1        ASSIGN                                       !1, 'bar'
   4     2        CONCAT                               ~2      !0, !1
         3        ECHO                                         ~2
   5     4      > RETURN                                       1

branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
path #1: 0,

ZEND_CONCAT joins the values of $foo and $bar, stores the result in a temporary variable, and echo then outputs it. This involves allocating memory for the temporary variable, freeing it after use, and calling the concatenation function to do the joining.

If you write it this way instead:

<?php
$foo = 'foo';
$bar = 'bar';
echo $foo, $bar;

The corresponding opcodes are:

number of ops:  5
compiled vars:  !0 = $foo, !1 = $bar
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   ASSIGN                                       !0, 'foo'
   3     1        ASSIGN                                       !1, 'bar'
   4     2        ECHO                                         !0
         3        ECHO                                         !1
   5     4      > RETURN                                       1

branch: #  0; line:     2-    5; sop:     0; eop:     4; out1:  -2
path #1: 0,

No memory has to be allocated and no concatenation function has to run, so isn't this more efficient? If you want to understand the concatenation process, use what this article has covered to find the handler for the ZEND_CONCAT opcode yourself. It does quite a lot of work.
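The difference between the two forms can be imitated in C. The following is an analogy rather than engine code: output goes into a fixed-size buffer, the temporary is a heap string, and the names `toy_output`, `toy_echo`, `echo_concat` and `echo_list` are invented for this sketch. A counter records how many temporaries each form creates.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   data[64];
    size_t len;
    int    temp_allocations;  /* counts heap temporaries created */
} toy_output;

/* Appends one string to the output buffer, like a single ECHO. */
void toy_echo(toy_output *out, const char *s)
{
    size_t n = strlen(s);
    assert(out->len + n < sizeof out->data);
    memcpy(out->data + out->len, s, n + 1);
    out->len += n;
}

/* echo $a . $b;  builds a temporary string first, then emits it once. */
void echo_concat(toy_output *out, const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    char *tmp = malloc(la + lb + 1);
    if (tmp == NULL) {
        abort();
    }
    out->temp_allocations++;
    memcpy(tmp, a, la);
    memcpy(tmp + la, b, lb + 1);
    toy_echo(out, tmp);
    free(tmp);
}

/* echo $a, $b;  two ECHO operations, no temporary needed. */
void echo_list(toy_output *out, const char *a, const char *b)
{
    toy_echo(out, a);
    toy_echo(out, b);
}
```

Both functions leave the same bytes in the buffer; only `echo_concat` pays for an allocate, copy and free cycle on the way.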

### 5.2 define() and const

The const keyword was introduced in PHP 5.3. It is very different from define() and is close in meaning to `#define` in C.

* define() is a function call, with the overhead of a function call.
* const is a keyword that generates an opcode directly. It is resolved at compile time and needs no dynamic allocation at run time.

A const value is fixed and cannot be changed at run time, which is why it resembles C's #define: it is determined at compile time, and its value is limited to scalar types.

Let's look at the code and compare the opcodes.

define example:

<?php
define('FOO', 'foo');
echo FOO;

define opcodes:

number of ops:  6
compiled vars:  none
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   SEND_VAL                                     'FOO'
         1        SEND_VAL                                     'foo'
         2        DO_FCALL                        2            'define'
   3     3        FETCH_CONSTANT                       ~1      'FOO'
         4        ECHO                                         ~1
   4     5      > RETURN                                       1

const example:

<?php
const FOO = 'foo';
echo FOO;

const opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   DECLARE_CONST                                'FOO', 'foo'
   3     1        FETCH_CONSTANT                       ~0      'FOO'
         2        ECHO                                         ~0
   4     3      > RETURN                                       1

### 5.3 The cost of dynamic functions

<?php
function foo() {}
foo();

The corresponding opcodes:

number of ops:  3
compiled vars:  none
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        DO_FCALL                        0            'foo'
   4     2      > RETURN                                       1

Code for a dynamic call:

<?php
function foo() {}
$a = 'foo';
$a();

Opcodes:

number of ops:  5
compiled vars:  !0 = $a
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        ASSIGN                                       !0, 'foo'
   4     2        INIT_FCALL_BY_NAME                           !0
         3        DO_FCALL_BY_NAME                0
   5     4      > RETURN                                       1

You can run vim Zend/zend_vm_def.h +2630 to see what INIT_FCALL_BY_NAME does; the code is too long to list here. Dynamic features are convenient, but they inevitably cost performance, so weigh the pros and cons before using them.
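The extra work behind a call by name can be sketched like this. It is a rough model under stated assumptions: the engine keeps functions in a hash table, while this example uses a one-entry array with a linear search, and the names `toy_entry`, `function_table`, `call_static` and `call_by_name` are hypothetical. What it keeps from the real thing is the shape of the cost: the name is normalized to lowercase and looked up at run time before the call can happen.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_fn)(void);

int foo(void) { return 42; }

typedef struct {
    const char *name;
    toy_fn fn;
} toy_entry;

static const toy_entry function_table[] = { { "foo", foo } };

/* Direct call: the target is known when the code is compiled. */
int call_static(void)
{
    return foo();
}

/* Dynamic call: lowercase the name, search the table, then call.
 * *found is set to 1 on success and 0 when no such function exists. */
int call_by_name(const char *name, int *found)
{
    char lowered[32];
    size_t n = strlen(name);

    assert(found != NULL);
    *found = 0;
    if (n >= sizeof lowered) {
        return 0;
    }
    for (size_t i = 0; i <= n; i++) {  /* copies the terminator too */
        lowered[i] = (char)tolower((unsigned char)name[i]);
    }
    for (size_t i = 0; i < sizeof function_table / sizeof function_table[0]; i++) {
        if (strcmp(function_table[i].name, lowered) == 0) {
            *found = 1;
            return function_table[i].fn();
        }
    }
    return 0;
}
```

`call_by_name("Foo", &found)` reaches the same function as `call_static()`, but only after a string copy and a table search on every call.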

### 5.4 The cost of deferred class declaration

Again, look at the code first:

<?php
class Bar {}
class Foo extends Bar {}

The corresponding opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   NOP
   3     1        NOP
         2        NOP
   4     3      > RETURN                                       1

Swap the declaration order:

<?php
class Foo extends Bar {}
class Bar {}

The corresponding opcodes:

number of ops:  4
compiled vars:  none
line     #* E I O op                       fetch  ext  return  operands
-------------------------------------------------------------------------------------
   2     0  E >   FETCH_CLASS                     0    :0      'Bar'
         1        DECLARE_INHERITED_CLASS                      '%00foo%2fusers%2fqisen%2ftmp%2fvld.php0x103d58020', 'foo'
   3     2        NOP
   4     3      > RETURN                                       1

In a strongly typed language the second version would be a compile error, but PHP, being a dynamic language, defers the class declaration to run time. If you are not careful, this is an easy trap to fall into.

So, having understood how the Zend VM works, we should use dynamic features sparingly and avoid them whenever they are not necessary.