Essential for source code analysis: using VLD to view opcode information

Source: Internet
Author: User

The official description of VLD (Vulcan Logic Dumper) is as follows:

The Vulcan Logic Dumper hooks into the Zend engine and dumps all the opcodes (execution units) of a script. It can be used to see what is going on in the Zend engine.

 

The previous article, on how the PHP interpreter engine executes code, mentioned at the end how VLD works. The extension uses the request initialization hook (PHP_RINIT_FUNCTION) that PHP provides to extension modules. Each time a request arrives, it points the default compile function pointer zend_compile_file and the execute function pointer zend_execute at its own vld_compile_file and vld_execute functions, which wrap the original functions. Because the original compile function returns an op_array pointer, the new compile function can intercept that pointer and output the related opcode information.
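The hooking pattern described above can be sketched in plain C. This is a minimal sketch with simplified stand-ins, not the real Zend API: toy_op_array, toy_compile_file, dumper_compile_file and dumper_request_init are hypothetical names chosen for illustration, and the real zend_compile_file has a different signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the engine's op array; NOT the real zend_op_array. */
typedef struct {
    const char *filename;
    size_t      op_count;
} toy_op_array;

/* Toy "engine" compiler: pretends every script compiles to three ops. */
static toy_op_array *toy_default_compile(const char *filename)
{
    static toy_op_array result;
    result.filename = filename;
    result.op_count = 3;
    return &result;
}

/* Global compile hook, playing the role of zend_compile_file (hypothetical). */
toy_op_array *(*toy_compile_file)(const char *filename) = toy_default_compile;

/* The original compile function, saved when the hook is installed. */
static toy_op_array *(*saved_compile_file)(const char *filename) = NULL;

/* Number of op arrays the wrapper has intercepted so far. */
static size_t dumped_arrays = 0;

/* Wrapper: call the original, then inspect the op array it returned. */
static toy_op_array *dumper_compile_file(const char *filename)
{
    assert(saved_compile_file != NULL);
    toy_op_array *ops = saved_compile_file(filename);
    if (ops != NULL) {
        fprintf(stderr, "dump: %s has %zu ops\n", ops->filename, ops->op_count);
        dumped_arrays++;
    }
    return ops;
}

/* Request-init hook: swap the global pointer to the wrapper (only once). */
void dumper_request_init(void)
{
    if (toy_compile_file != dumper_compile_file) {
        saved_compile_file = toy_compile_file;
        toy_compile_file = dumper_compile_file;
    }
}

size_t dumper_dumped_count(void)
{
    return dumped_arrays;
}
```

After dumper_request_init() has run, any caller that compiles through toy_compile_file gets the same op array as before, but the wrapper has already seen it. The caller does not need to know a hook is installed.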

I will not introduce the installation of the PHP extension module here. There are a lot of related information on the network.

Let's take a look at what this extension does once it is installed. Below is a very simple PHP script, test.php:

<?php
$a = "Hello World";
echo $a;
?>

 

Run the script on the command line:

php -dvld.active=1 test.php

 

The output content of VLD is as follows:

 

To see more detail, raise the verbosity level:

php -dvld.active=1 -dvld.verbosity=3 test.php

 

 

Here is a brief description of the output content:

The code contains three kinds of ops:

1: ASSIGN // #define ZEND_ASSIGN 38

2: ECHO // #define ZEND_ECHO 40

3: RETURN // #define ZEND_RETURN 62

 

The first op is ASSIGN. Its handler assigns the value of op2 to op1, which corresponds to the code $a = "Hello World". So op2 is "Hello World" and op1 should be $a, but the output actually shows !0. This is because $a is a compiled variable and !0 stands for $a, as can be seen in the line just above the op list in the output:

compiled vars:  !0 = $a

This optimization avoids looking up $a in the variable symbol table every time it is used, so it acts as a kind of cache. After this op is executed, the value of !0 is "Hello World".
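The idea behind compiled variables can be illustrated with a small C sketch. It assumes a simplified model in which the compiler assigns each variable name a slot number once and the executor then reads and writes by slot. The names toy_cv_table, toy_cv_slot, toy_frame, toy_assign and toy_read are hypothetical and do not come from the Zend source.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_VARS 8

/* Compile-time table: slot number -> variable name (like "!0 = $a"). */
typedef struct {
    const char *names[TOY_MAX_VARS];
    int         count;
} toy_cv_table;

/* Compile time: find the slot for a name, allocating one on first sight.
 * The name comparison happens here, once per occurrence in the source,
 * not on every execution. Returns -1 if the table is full. */
int toy_cv_slot(toy_cv_table *t, const char *name)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            return i;
        }
    }
    if (t->count == TOY_MAX_VARS) {
        return -1;
    }
    t->names[t->count] = name;
    return t->count++;
}

/* Run-time storage: values indexed directly by slot number. */
typedef struct {
    const char *values[TOY_MAX_VARS];
} toy_frame;

/* Run time: an op's operand is just a slot index, so no lookup by name. */
void toy_assign(toy_frame *f, int slot, const char *value)
{
    assert(slot >= 0 && slot < TOY_MAX_VARS);
    f->values[slot] = value;
}

const char *toy_read(const toy_frame *f, int slot)
{
    assert(slot >= 0 && slot < TOY_MAX_VARS);
    return f->values[slot];
}
```

In this model, the ASSIGN op for test.php would carry slot 0 as op1, and the ECHO op would read slot 0 back without ever comparing the string "a" at run time.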

 

The second op is ECHO. Its handler sends the content of op1 to standard output, which corresponds to the code echo $a, so "Hello World" is printed to the terminal.

 

The third op is RETURN, which is added automatically at the end of every PHP file. Its handler returns the constant value in op1.

 

In this way, we can see exactly which opcodes a piece of PHP code compiles to. VLD is a really good analysis tool.

 

 

Some may ask: how do you know which handler executes each op? Can VLD output this information? Unfortunately, it cannot. In the default call mode, the handler for each op is a function, and the op intercepted by VLD contains the handler pointer, but the function name cannot be recovered from that pointer, because C lacks the reflection features of some higher-level languages. So to find the handler for each op, we need another approach. So far I have found only two ways to get this information, described below.

 

Method 1:

As described in the previous article on how PHP code is executed, op handlers are defined in {phpsrc}/Zend/zend_vm_execute.h. This is a huge C source file generated by a PHP script, and it contains every handler function definition as well as the algorithm that maps an op to its handler. The zend_init_opcodes_handlers function initializes a static const opcode_handler_t labels[] array, which is the handler table. The table has nearly 4000 entries, each of which is a handler function pointer, although many are null pointers and some pointers are repeated. Suppose we had a handler_names array parallel to labels, in which each entry holds the name of the function that the corresponding labels entry points to. We could then use the existing op-to-handler mapping algorithm to look up an op's handler function name in handler_names. It is not as easy as it sounds, though: how do we correctly generate a handler_names array with 4000 entries? The answer is in {phpsrc}/Zend/zend_vm_gen.php, the PHP script that generates {phpsrc}/Zend/zend_vm_execute.h. Find the part that generates the labels array and add code that generates handler_names in the same way. Interested readers can try generating the handler_names array, compiling it into the VLD extension, and printing the handler function name of each op alongside the op list.
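The parallel-array idea in Method 1 can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the table has four entries instead of thousands, the index formula opcode * TOY_KINDS + op1_kind is a made-up stand-in for the engine's real mapping, and every toy_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_handler_t)(void);

static int toy_assign_cv(void)  { return 0; }
static int toy_echo_const(void) { return 0; }
static int toy_echo_cv(void)    { return 0; }

enum { TOY_CONST = 0, TOY_CV = 1, TOY_KINDS = 2 };     /* operand kinds */
enum { TOY_ASSIGN = 0, TOY_ECHO = 1, TOY_OPCODES = 2 }; /* opcodes */

/* Handler table, standing in for labels[]; some entries may be NULL. */
static const toy_handler_t toy_labels[TOY_OPCODES * TOY_KINDS] = {
    NULL,            /* ASSIGN to a constant: no handler */
    toy_assign_cv,
    toy_echo_const,
    toy_echo_cv,
};

/* Parallel name table: entry i names the function stored in toy_labels[i]. */
static const char *const toy_handler_names[TOY_OPCODES * TOY_KINDS] = {
    NULL,
    "toy_assign_cv",
    "toy_echo_const",
    "toy_echo_cv",
};

/* Toy op-to-index mapping, shared by both lookups (hypothetical formula). */
static size_t toy_index(int opcode, int op1_kind)
{
    assert(opcode >= 0 && opcode < TOY_OPCODES);
    assert(op1_kind >= 0 && op1_kind < TOY_KINDS);
    return (size_t)opcode * TOY_KINDS + (size_t)op1_kind;
}

toy_handler_t toy_get_handler(int opcode, int op1_kind)
{
    return toy_labels[toy_index(opcode, op1_kind)];
}

const char *toy_get_handler_name(int opcode, int op1_kind)
{
    return toy_handler_names[toy_index(opcode, op1_kind)];
}
```

Because both tables are indexed by the same function, the name lookup stays correct as long as the two arrays are generated together, which is why the article suggests emitting handler_names from the same generator code that emits labels.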

 

Method 2:

This is the method I use most often, and it is more convenient. The solution is again in {phpsrc}/Zend/zend_vm_gen.php, the file that generates the handler for each op. If every handler function prints its own name when it runs, you know which handlers are called. This is not difficult: around line 380 of zend_vm_gen.php you can see PHP code similar to the following:

if (0 && strpos($code, '{') === 0) {

...

}

 

The code inside this condition inserts an output statement at the start of each handler. However, because the condition is never true, that code never runs. Change the condition in the if to true, and then output the function name inside the braces. The specific code is as follows:

if (1) {
    $name = $name . ($spec ? "_SPEC" : "") . $prefix[$op1] . $prefix[$op2] . "_HANDLER";
    $code = "{\n\tfprintf(stderr, \"$name\\n\");\n" . substr($code, 1);
}

I will not explain how this code works in detail. After modifying zend_vm_gen.php, run it from the command line to generate a new zend_vm_execute.h (zend_vm_opcodes.h is generated at the same time). Open zend_vm_execute.h and you will see that many functions now begin with an extra statement like this:

fprintf(stderr, "ZEND_***\n");

In this way, each function prints its name to standard error when it is executed. The remaining work is to recompile Zend/zend_execute.lo and relink sapi/cli/php. If you do not know how to do these steps individually, you can take the brute-force route and rebuild and reinstall all of PHP. Note that the modified PHP must never be used in a production environment, because it prints a large amount of unneeded information; install a separate PHP build just for testing. In addition, this method also prints the names of some functions that are not handlers invoked directly for an op. A handler may call another function, in which case both the handler's name and the called function's name are printed, so the number of names printed can exceed the number of ops.

Let's use Method 2 to view the op handler names for test.php. Running test.php with the modified PHP gives the following output:

 

ZEND_ASSIGN_SPEC_CV_CONST_HANDLER
ZEND_ECHO_SPEC_CV_HANDLER
Hello WorldZEND_RETURN_SPEC_CONST_HANDLER
zend_leave_helper_SPEC_HANDLER

 

We can see that four function names are output. ZEND_ASSIGN_SPEC_CV_CONST_HANDLER is the handler of ASSIGN, ZEND_ECHO_SPEC_CV_HANDLER is the handler of ECHO, and ZEND_RETURN_SPEC_CONST_HANDLER is the handler of RETURN. The RETURN handler in turn calls zend_leave_helper_SPEC_HANDLER, which is why four names are printed for three ops. The "Hello World" in front of the third name is the script's own output, which echo writes without a trailing newline. Knowing these function names, we can find their definitions in zend_vm_execute.h and see how each op is executed.
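The effect of Method 2, including the extra name printed when a handler calls a helper, can be reproduced with a toy executor in C. This is a sketch only: the toy_ functions are hypothetical stand-ins rather than generated Zend handlers, and the standard __func__ identifier is used in place of a name string pasted in by a generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of function names printed so far. */
int toy_trace_count = 0;

/* Print the enclosing function's name to stderr, like the injected fprintf. */
#define TOY_TRACE() \
    do { fprintf(stderr, "%s\n", __func__); toy_trace_count++; } while (0)

typedef int (*toy_handler_t)(void);

/* Helper that a handler calls; it is traced too, although it is not an op. */
int toy_leave_helper(void)
{
    TOY_TRACE();
    return 1;               /* nonzero: stop executing */
}

int toy_assign_handler(void)
{
    TOY_TRACE();
    return 0;               /* zero: continue with the next op */
}

int toy_echo_handler(void)
{
    TOY_TRACE();
    fputs("Hello World", stdout);   /* no trailing newline, as with echo */
    return 0;
}

int toy_return_handler(void)
{
    TOY_TRACE();
    return toy_leave_helper();
}

/* Run handlers in order until one returns nonzero.
 * Returns the number of ops that were executed. */
int toy_execute(const toy_handler_t *ops, int n)
{
    assert(ops != NULL);
    int executed = 0;
    for (int i = 0; i < n; i++) {
        executed++;
        if (ops[i]() != 0) {
            break;
        }
    }
    return executed;
}
```

Running the three handlers toy_assign_handler, toy_echo_handler and toy_return_handler through toy_execute executes three ops but leaves toy_trace_count at four, mirroring the four names printed for test.php.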

 
