PHP Zend Engine: Anatomy of Hello World (II)


Objective

This time I use Hello World to walk through how the Zend virtual machine executes a script. The PHP version of Hello World:

    <?php
    echo 'Hello world';
    ?>

The lexical analysis phase described in the previous article turns the script above into a token sequence: T_OPEN_TAG, T_ECHO, T_CONSTANT_ENCAPSED_STRING, ';', T_CLOSE_TAG. But how is this token sequence processed when the Zend virtual machine runs the script?

Tracing the execution path

We again start from the command line. The do_cli function in $PHPSRC/sapi/cli/php_cli.c receives the command-line arguments (php -f helloworld.php means: execute the file helloworld.php).

Following php_execute_script, which is defined in $PHPSRC/main/main.c, we reach the call to zend_execute_scripts(). Inside the definition of zend_execute_scripts we find:

    EG(active_op_array) = zend_compile_file(file_handle, type TSRMLS_CC);
    zend_execute(EG(active_op_array) TSRMLS_CC);

First, zend_compile_file turns the file into opcodes, an intermediate code (this step performs the lexical and syntax analysis). Then zend_execute runs the generated intermediate code (this is the runtime phase).

This is much like compiling C: the source is first compiled into assembly and then into machine code. Opcodes play a role similar to the assembly produced while compiling a C program.

This also suggests an optimization. Every time a PHP file is run it has to go through lexical and syntax analysis to produce its opcodes, yet as long as the script file does not change, the generated opcodes do not change either. To reduce script execution time, the opcodes of a script can therefore be cached (for example, in shared memory).
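To make the caching idea concrete, here is a minimal sketch in C. It is not how any real opcode cache is implemented: cache_op_array, cache_compile and cache_get are hypothetical names, the "compile" step is a stub, and the table lives in ordinary process memory rather than shared memory. It only shows the core decision: recompile when the script's modification time changes, otherwise reuse the stored opcodes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a compiled op_array; not Zend's real type. */
typedef struct {
    int n_ops;
} cache_op_array;

/* One cache slot: script path, the mtime it was compiled at, the result. */
typedef struct {
    char           path[256];
    long           mtime;
    cache_op_array ops;
    int            used;
} cache_entry;

#define CACHE_SLOTS 16
static cache_entry cache_table[CACHE_SLOTS];
static int cache_compile_count = 0;   /* how often we really compiled */

/* Stub for the expensive lexing + parsing step. */
cache_op_array cache_compile(const char *path)
{
    cache_op_array ops;
    (void)path;
    cache_compile_count++;
    ops.n_ops = 2;                    /* e.g. ECHO followed by RETURN */
    return ops;
}

/* Return the opcodes for path, compiling only when there is no entry
 * yet or the cached entry was built from an older version of the file. */
cache_op_array cache_get(const char *path, long mtime)
{
    int i, free_slot = -1;

    assert(path != NULL);
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (!cache_table[i].used) {
            if (free_slot < 0) free_slot = i;
            continue;
        }
        if (strcmp(cache_table[i].path, path) == 0) {
            if (cache_table[i].mtime != mtime) {      /* script changed */
                cache_table[i].ops = cache_compile(path);
                cache_table[i].mtime = mtime;
            }
            return cache_table[i].ops;
        }
    }
    /* No room, or path too long to store: compile without caching. */
    if (free_slot < 0 || strlen(path) >= sizeof cache_table[0].path) {
        return cache_compile(path);
    }
    strcpy(cache_table[free_slot].path, path);
    cache_table[free_slot].mtime = mtime;
    cache_table[free_slot].ops = cache_compile(path);
    cache_table[free_slot].used = 1;
    return cache_table[free_slot].ops;
}
```

Calling cache_get("helloworld.php", 100) twice runs the stub compiler once; calling it again with a different mtime runs it a second time.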

Here is a flowchart first; with it in hand, we can follow what Zend does:

Let's first look at how the opcodes are compiled.

Lexical and syntax analysis -> opcode

From the previous section we know that the script file is compiled into opcodes through zend_compile_file (in practice compile_file(), defined at line 555 of Zend/zend_language_scanner.c), and that the actual compilation is done by calling zendparse.

PHP's syntax parser is generated with bison. Once bison is installed, run the following in the $PHPSRC/Zend directory:

    bison -o zend_language_parser.c zend_language_parser.y

This generates the parser zend_language_parser.c in the Zend directory. The zendparse mentioned above is simply the yyparse function of this parser.

We set the generated parser aside and follow the Hello World example through the bison grammar file (I have removed the rules we do not care about):

    start:
            top_statement_list  { zend_do_end_compilation(TSRMLS_C); }
    ;

    top_statement_list:
            top_statement_list  { zend_do_extended_info(TSRMLS_C); } top_statement { HANDLE_INTERACTIVE(); }
        |   /* empty */
    ;

    top_statement:
            statement  { zend_verify_namespace(TSRMLS_C); }
    ;

    statement:
            unticked_statement  { DO_TICKS(); }
        |   T_STRING ':'  { zend_do_label(&$1 TSRMLS_CC); }
    ;

    unticked_statement:
        |   T_ECHO echo_expr_list ';'
    ;

    echo_expr_list:
            echo_expr_list ',' expr  { zend_do_echo(&$3 TSRMLS_CC); }
        |   expr                     { zend_do_echo(&$1 TSRMLS_CC); }
    ;

    expr:
            r_variable               { $$ = $1; }
        |   expr_without_variable    { $$ = $1; }
    ;

    expr_without_variable:
        |   scalar                   { $$ = $1; }
    ;

    scalar:
        |   common_scalar            { $$ = $1; }
    ;

    common_scalar:
        |   T_CONSTANT_ENCAPSED_STRING  { $$ = $1; }
    ;

Reading the grammar top-down from start: a PHP script corresponds to a top_statement_list, which breaks down into individual statements. echo 'Hello world' turns out to be an unticked_statement (look at the echo_expr_list rule and you will also see that the syntax supports echo 'Hello', 'World'). The derivation finally reaches T_CONSTANT_ENCAPSED_STRING, which ends the parsing of this line. We skip the compiler-theory details of the parsing phase, such as how backtracking is done, and focus on the Zend engine itself.

The code in the "{}" block after a rule is the action that runs when that rule is matched; you can see that an echo statement leads to a call to zend_do_echo. Inside the action blocks we see $$, $1, $2, $3 and so on. These stand for the rule's return value, its first component, its second component, and so on. The return value and the components all have type YYSTYPE, which is defined at line 43: #define YYSTYPE znode. The definition of znode is in zend_compile.h:

Note the zend_op structure that appears there. Tracing it shows that this is the opcode structure that each statement is finally compiled into.

The structure of an opcode is very similar to an assembly instruction: one operator and two operands.
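As a rough picture of "one operator and two operands", here is a simplified sketch in C. It does not reproduce Zend's real structure: mini_op, opnd, opnd_kind and mini_make_echo are hypothetical names, and an operand is reduced to a kind plus an optional string. The value 40 for ECHO is the opcode number used later in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Operand kinds; illustrative names, not Zend's constants. */
typedef enum { OPND_UNUSED, OPND_CONST, OPND_VAR } opnd_kind;

/* A simplified operand: its kind plus, for constants, a string value. */
typedef struct {
    opnd_kind   kind;
    const char *str;        /* only meaningful when kind == OPND_CONST */
} opnd;

struct mini_op;
typedef int (*mini_handler)(const struct mini_op *op);

/* A simplified opcode: handler, result, two operands, operator number. */
typedef struct mini_op {
    mini_handler  handler;  /* function that executes this opcode */
    opnd          result;   /* where the result goes, if any */
    opnd          op1;      /* first operand */
    opnd          op2;      /* second operand */
    unsigned char opcode;   /* the operator, e.g. 40 for ECHO */
} mini_op;

/* Build the one-operand ECHO opcode for a constant string. */
mini_op mini_make_echo(const char *s)
{
    mini_op op;

    assert(s != NULL);
    op.handler     = NULL;          /* bound to a handler later */
    op.result.kind = OPND_UNUSED;   /* echo produces no result */
    op.result.str  = NULL;
    op.op1.kind    = OPND_CONST;
    op.op1.str     = s;
    op.op2.kind    = OPND_UNUSED;   /* echo has no second operand */
    op.op2.str     = NULL;
    op.opcode      = 40;
    return op;
}
```

mini_make_echo("Hello world") yields an opcode whose op1 is the constant string and whose result and op2 are both unused, which matches the shape of the ECHO line that VLD prints.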

In the Zend engine, the most important part of each opcode is its handler; in a moment we will see how Zend assigns it. Before going further, let's pause and look at what the opcodes for the Hello World example actually are.

Install the VLD extension, then run php -dvld.active=1 helloworld.php to see the list of opcodes this PHP file is compiled into:

You can see that the opcode for echo is ECHO; it has no return value and only one operand, "Hello world".

So after parsing, Zend compiles each statement into opcodes and puts them into an op_array (in effect, a list of opcodes).

Looking back, let's see what zend_do_echo does:

It first calls get_next_op to create a new opcode at the end of the current op_array, then sets its opcode type to ZEND_ECHO, then sets its first operand, op1, and marks the second operand, op2, as unused.
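Those three steps can be sketched in C as follows. This is an illustration under simplifying assumptions, not Zend's source: emit_op_array, emit_next_op and emit_echo are hypothetical names, the op_array is a plain growable array, and an operand is just a kind plus a string.

```c
#include <assert.h>
#include <stdlib.h>

#define EMIT_ECHO 40             /* opcode number of ECHO used in this article */

typedef enum { EMIT_UNUSED, EMIT_CONST } emit_kind;

typedef struct {
    emit_kind   kind;
    const char *str;
} emit_operand;

typedef struct {
    unsigned char opcode;
    emit_operand  op1, op2;
} emit_op;

/* Growable opcode list standing in for an op_array. */
typedef struct {
    emit_op *ops;
    size_t   last;               /* number of opcodes in use */
    size_t   size;               /* allocated capacity */
} emit_op_array;

/* In the spirit of get_next_op: reserve and return the next slot at the end. */
emit_op *emit_next_op(emit_op_array *arr)
{
    if (arr->last == arr->size) {
        size_t   new_size = arr->size ? arr->size * 2 : 4;
        emit_op *grown = realloc(arr->ops, new_size * sizeof *grown);
        assert(grown != NULL);
        arr->ops  = grown;
        arr->size = new_size;
    }
    return &arr->ops[arr->last++];
}

/* The three steps described for zend_do_echo. */
void emit_echo(emit_op_array *arr, const char *s)
{
    emit_op *opline = emit_next_op(arr);   /* 1. new opcode at the end */

    opline->opcode   = EMIT_ECHO;          /* 2. set the opcode type */
    opline->op1.kind = EMIT_CONST;         /* 3. set op1 ... */
    opline->op1.str  = s;
    opline->op2.kind = EMIT_UNUSED;        /*    ... and mark op2 unused */
    opline->op2.str  = NULL;
}
```

Compiling echo 'Hello', 'World' would, in this sketch, amount to two emit_echo calls on the same array, one per expression in the echo_expr_list rule.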

After all these steps we have an op_array in which every opcode carries its own type. Next we look at how each opcode is bound to its handler.

zend_vm_def.h defines the handler for ZEND_ECHO. Note the number 40 there; we will need it shortly. Because echo's operand can be of several kinds (constant, variable and so on), there is a different handler for each kind.

zend_vm_execute.h defines the handlers for all opcodes. We only care about the echo-related ones; note this code:

    void zend_init_opcodes_handlers(void)
    {
        static const opcode_handler_t labels[] = {   /* line 40913 */
            /* ... */
            ZEND_ECHO_SPEC_CONST_HANDLER,            /* line 41914 */
            ZEND_ECHO_SPEC_CONST_HANDLER,
            ZEND_ECHO_SPEC_CONST_HANDLER,
            ZEND_ECHO_SPEC_CONST_HANDLER,
            ZEND_ECHO_SPEC_CONST_HANDLER
            /* ... */
        };
        /* ... */
    }

Keep labels and these line numbers in mind for a moment.

Now look at the calculation in the final return statement of the function that looks up a handler. The opcode number of ECHO is 40, as noted above, so (assuming the types of both operands op1 and op2 decode to 0) the corresponding handler is:

    zend_opcode_handlers[40*25 + 0*5 + 0] = zend_opcode_handlers[1000] = labels[1000] = ZEND_ECHO_SPEC_CONST_HANDLER

(Why labels[1000]? Because line 41914 - line 40913 - 1 = 1000.)
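The index arithmetic can be checked with a few lines of C. This is a sketch of the calculation, not a copy of Zend's lookup function. It assumes each opcode owns a block of 25 slots, one per combination of five operand kinds for op1 and five for op2; the kind codes 0 to 4 and the name spec_handler_index are made up for illustration. The only values taken from the article are the opcode number 40 and the factor 25.

```c
#include <assert.h>

/* Assumed operand-kind codes after decoding; illustrative values only. */
enum { K_CONST = 0, K_TMP = 1, K_VAR = 2, K_UNUSED = 3, K_CV = 4, K_COUNT = 5 };

/* Index into the handler table for (opcode, kind of op1, kind of op2).
 * Each opcode gets K_COUNT * K_COUNT = 25 consecutive slots; within that
 * block, op1 selects a row of 5 and op2 selects the slot inside the row. */
int spec_handler_index(int opcode, int op1_kind, int op2_kind)
{
    assert(op1_kind >= 0 && op1_kind < K_COUNT);
    assert(op2_kind >= 0 && op2_kind < K_COUNT);
    return opcode * (K_COUNT * K_COUNT) + op1_kind * K_COUNT + op2_kind;
}
```

With this layout, spec_handler_index(40, K_CONST, k) covers indexes 1000 to 1004 as k runs over the five op2 kinds. That is consistent with the five consecutive ZEND_ECHO_SPEC_CONST_HANDLER entries in labels: echo ignores its second operand, so all five slots can share one handler.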

Virtual machine: executing the opcodes

We have already explained how zend_compile_file compiles a script into a list of opcodes:

    EG(active_op_array) = zend_compile_file(file_handle, type TSRMLS_CC);
    zend_execute(EG(active_op_array) TSRMLS_CC);

After this, the Zend engine executes the returned opcodes with zend_execute.

We eventually locate zend_execute at line 337 of Zend/zend_vm_execute.h:

As you can see, when the virtual machine runs it loops over the current opcode list, calls the handler of each opcode in turn, and decides what to do next (for example, entering a function call) based on the handler's return value.
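The shape of such a loop can be sketched in C as below. It is a toy model, not the code from zend_vm_execute.h: vm_op, vm_state, vm_execute and the two handlers are hypothetical names, output is collected in a buffer instead of being printed, and the return-value convention (0 to continue, nonzero to stop) is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct vm_op    vm_op;
typedef struct vm_state vm_state;

/* A handler executes one opcode; it returns 0 to continue, nonzero to stop. */
typedef int (*vm_handler)(vm_state *st);

struct vm_op {
    vm_handler  handler;
    const char *arg;        /* single constant operand, enough for this sketch */
};

struct vm_state {
    const vm_op *opline;    /* opcode currently being executed */
    char        *out;       /* collected output, always NUL-terminated */
    size_t       out_size;
};

/* ECHO: append the operand to the output, then move to the next opcode. */
int vm_echo_handler(vm_state *st)
{
    size_t used = strlen(st->out);

    strncat(st->out, st->opline->arg, st->out_size - used - 1);
    st->opline++;
    return 0;
}

/* RETURN: tell the loop to stop. */
int vm_return_handler(vm_state *st)
{
    (void)st;
    return 1;
}

/* The execute loop: keep calling the current opcode's handler
 * until one of them asks to stop. */
void vm_execute(const vm_op *ops, char *out, size_t out_size)
{
    vm_state st;

    assert(ops != NULL && out != NULL && out_size > 0);
    out[0]      = '\0';
    st.opline   = ops;
    st.out      = out;
    st.out_size = out_size;
    for (;;) {
        if (st.opline->handler(&st) != 0) {
            return;
        }
    }
}
```

Running vm_execute on a two-entry list, an ECHO of "Hello world" followed by a RETURN, leaves "Hello world" in the output buffer. Note that the loop itself never inspects the opcode number; all the work happens through the handler pointer stored in each opcode.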

In this article we only care about what is relevant to Hello World. We know that the handler for ECHO is ZEND_ECHO_SPEC_CONST_HANDLER, and tracing it to the end you will find that the output goes through zend_write, which is set as follows:

    zend_write = (zend_write_func_t) utility_functions->write_function;

Here utility_functions holds a set of basic handlers, and each SAPI layer supplies its own function pointers. In command-line mode, for example, the call finally ends up in sapi_cli_single_write:

From the source we see that the final write operation is a call to write/fwrite on the standard output stream (that is, the terminal screen).
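The function-pointer indirection behind zend_write can be sketched in C like this. The names out_write, out_utility_functions, out_startup, out_cli_write and out_echo are hypothetical and only mirror the roles described above; the sketch assumes a writer that takes a buffer and a length and returns the number of bytes written.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Signature of an output function: returns the number of bytes written. */
typedef size_t (*out_write_func)(const char *str, size_t len);

/* Table that an embedding layer hands to the engine at startup. */
typedef struct {
    out_write_func write_function;
} out_utility_functions;

/* Engine-side pointer that every echo ends up calling. */
static out_write_func out_write = NULL;

/* Engine startup: adopt whatever writer the embedding layer supplied. */
void out_startup(const out_utility_functions *uf)
{
    assert(uf != NULL && uf->write_function != NULL);
    out_write = uf->write_function;
}

/* A command-line style writer: send the bytes to standard output. */
size_t out_cli_write(const char *str, size_t len)
{
    return fwrite(str, 1, len, stdout);
}

/* What an echo boils down to: write through the installed pointer. */
size_t out_echo(const char *s)
{
    assert(out_write != NULL);
    return out_write(s, strlen(s));
}
```

A command-line front end would pass a table containing out_cli_write to out_startup; a web server front end could pass a writer that sends the bytes to the HTTP response instead. The engine code in out_echo stays the same in both cases, which is the point of routing output through a pointer.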

Conclusion

Finally, putting the preceding steps together, the complete flowchart is:
