Deep understanding of PHP principles: opcodes

Source: Internet
Author: User

An opcode is the intermediate language that a PHP script is compiled into, like Java's bytecode or .NET's MSIL.

For example, suppose you write the following PHP code:

<?php

echo "Hello World";

$a = 1 + 1;

echo $a;

?>

PHP (to be exact, its language engine, Zend) executes this code in the following 4 steps:

    1. Scanning (lexing): converts the PHP code into language fragments (tokens)
    2. Parsing: converts the tokens into simple, meaningful expressions
    3. Compilation: compiles the expressions into opcodes
    4. Execution: executes the opcodes sequentially, one at a time, thereby carrying out the PHP script

Some caches available today, such as APC, let PHP cache its opcodes, so that the first 3 steps do not have to be repeated on every request. This can greatly improve PHP's execution speed.

So what is lexing?

Anyone who has studied compiler principles should be familiar with the lexical analysis step. Lex is a table-driven tool for lexical analysis.

zend/zend_language_scanner.c, following the rules in zend/zend_language_scanner.l (a Lex file), performs lexical analysis on the input PHP code and splits it into "words" (tokens), one at a time.

Starting with PHP 4.2, a function called token_get_all is available; it scans a piece of PHP code into tokens.
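As a minimal sketch of how token_get_all can be called, the snippet below tokenizes the script from the start of the article and prints each token with its symbolic name. The variable names and the nowdoc delimiter are illustrative choices, and the numeric token IDs behind the names differ between PHP versions, which is why the sketch prints names via token_name instead.

```php
<?php
// The script from the start of the article, held in a nowdoc so that it is
// treated as plain text here rather than executed.
$source = <<<'CODE'
<?php

echo "Hello World";

$a = 1 + 1;

echo $a;

?>
CODE;

$tokens = token_get_all($source);

foreach ($tokens as $token) {
    if (is_array($token)) {
        // Array tokens carry a numeric ID (element 0) and the matched text (element 1).
        printf("%-28s %s\n", token_name($token[0]), var_export($token[1], true));
    } else {
        // Single-character tokens such as ';' come back as plain strings.
        printf("%-28s %s\n", '(character)', var_export($token, true));
    }
}
```

Printing the names rather than the raw IDs keeps the output readable on any PHP version.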

If we use this function to process the PHP code we mentioned at the beginning, we will get the following result:

Array
(
    [0] => Array
        (
            [0] => 367
            [1] => <?php
        )
    [1] => Array
        (
            [0] => 316
            [1] => echo
        )
    [2] => Array
        (
            [0] => 370
            [1] =>
        )
    [3] => Array
        (
            [0] => 315
            [1] => "Hello World"
        )
    [4] => ;
    [5] => Array
        (
            [0] => 370
            [1] =>
        )
    [6] => =
    [7] => Array
        (
            [0] => 370
            [1] =>
        )
    [8] => Array
        (
            [0] => 305
            [1] => 1
        )
    [9] => Array
        (
            [0] => 370
            [1] =>
        )
    [10] => +
    [11] => Array
        (
            [0] => 370
            [1] =>
        )
    [12] => Array
        (
            [0] => 305
            [1] => 1
        )
    [13] => ;
    [14] => Array
        (
            [0] => 370
            [1] =>
        )
    [15] => Array
        (
            [0] => 316
            [1] => echo
        )
    [16] => Array
        (
            [0] => 370
            [1] =>
        )
    [17] => ;
)

Analyzing the returned result, we find that the strings, characters, and whitespace in the source are returned as-is, and that every character of the source code appears in its original order. Single characters such as ;, =, and + come back as plain strings. Other elements, such as tags, keywords, and literals, are converted into a two-part array: the token ID (the code that Zend uses internally for that token, such as T_ECHO or T_STRING) and the original content from the source.
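The claim that every character of the source survives tokenization in order can be checked directly. The following sketch, which uses a one-line variant of the example script as an assumption for brevity, concatenates the text of every token and compares it with the input.

```php
<?php
// One-line variant of the article's script (an illustrative simplification).
$source = '<?php echo "Hello World"; $a = 1 + 1; echo $a;';

$rebuilt = '';
foreach (token_get_all($source) as $token) {
    // Array tokens keep their original text in element 1;
    // plain-string tokens are the text themselves.
    $rebuilt .= is_array($token) ? $token[1] : $token;
}

// True when the concatenated token text is identical to the input.
var_dump($rebuilt === $source);
```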

Next comes the parsing stage. Parsing first discards the extra whitespace in the tokens array, and then converts the remaining tokens into simple expressions, one by one:

    1. echo a constant string
    2. add two numbers together
    3. store the result of the prior expression to a variable
    4. echo a variable
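The whitespace-discarding part of this stage can be imitated in userland code. The sketch below is only an illustration built on token_get_all, not how the Zend parser is implemented; the variable name $significant and the one-line script are assumptions.

```php
<?php
// One-line variant of the article's script (an illustrative simplification).
$source = '<?php echo "Hello World"; $a = 1 + 1; echo $a;';

$significant = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token) && $token[0] === T_WHITESPACE) {
        continue; // drop whitespace-only tokens
    }
    // Keep only the text of each remaining token.
    $significant[] = is_array($token) ? $token[1] : $token;
}

// Show the surviving tokens in order, separated by " | ".
echo implode(' | ', $significant), "\n";
```

What remains is the sequence of tokens from which the four expressions above are built.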

Then comes the compilation stage, which compiles the tokens into a series of op_array entries. Each op_array entry (a single opcode) contains the following 5 parts:

    1. Opcode number: identifies the type of operation the entry performs, such as add or echo
    2. Result: stores the result of the opcode
    3. Operand 1: the first operand of the opcode
    4. Operand 2: the second operand of the opcode
    5. Extended value: an integer used to distinguish overloaded operators
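To make the five parts concrete, here is a small sketch that models one opcode as a PHP array. The helper makeOp and its key names are hypothetical and chosen for illustration; inside the engine the real structure is a C struct, not a PHP array.

```php
<?php
// Hypothetical helper: builds one opcode record with the five parts listed above.
function makeOp(string $opcode, ?string $result, $op1, $op2 = null, int $extendedValue = 0): array
{
    return [
        'opcode'         => $opcode,        // type of operation, e.g. ADD or ECHO
        'result'         => $result,        // where the result is stored, if anywhere
        'op1'            => $op1,           // first operand
        'op2'            => $op2,           // second operand
        'extended_value' => $extendedValue, // integer that distinguishes overloaded operators
    ];
}

// "1 + 1" with its result stored in temporary ~0.
$add = makeOp('ADD', '~0', 1, 1);
print_r($add);
```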

For example, our PHP code will be compiled into:

* ZEND_ECHO 'Hello World'

* ZEND_ADD ~0 1 1

* ZEND_ASSIGN !0 ~0

* ZEND_ECHO !0
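The execution step can be pictured with a toy interpreter that runs these four opcodes in order. This is a sketch under simplifying assumptions: the [type, value] operand encoding, the runOps function, and the two slot arrays are all invented for illustration and do not mirror the Zend executor.

```php
<?php
// Toy encoding of the four opcodes. Operands are [type, value] pairs:
// 'const' is a literal, 'tmp' is a ~N slot, 'cv' is a !N slot.
$ops = [
    ['op' => 'ECHO',   'result' => null,       'op1' => ['const', 'Hello World'], 'op2' => null],
    ['op' => 'ADD',    'result' => ['tmp', 0], 'op1' => ['const', 1],             'op2' => ['const', 1]],
    ['op' => 'ASSIGN', 'result' => null,       'op1' => ['cv', 0],                'op2' => ['tmp', 0]],
    ['op' => 'ECHO',   'result' => null,       'op1' => ['cv', 0],                'op2' => null],
];

// Runs the opcodes one at a time and returns everything they echoed.
function runOps(array $ops): string
{
    $tmp = []; // temporary slots (~0, ~1, ...)
    $cv  = []; // compiled-variable slots (!0, !1, ...)
    $out = '';

    // Resolves an operand to its current value.
    $read = function (array $operand) use (&$tmp, &$cv) {
        [$type, $value] = $operand;
        if ($type === 'const') {
            return $value;
        }
        return $type === 'tmp' ? $tmp[$value] : $cv[$value];
    };

    foreach ($ops as $op) {
        switch ($op['op']) {
            case 'ECHO':
                $out .= $read($op['op1']);
                break;
            case 'ADD':
                $tmp[$op['result'][1]] = $read($op['op1']) + $read($op['op2']);
                break;
            case 'ASSIGN':
                $cv[$op['op1'][1]] = $read($op['op2']);
                break;
        }
    }
    return $out;
}

echo runOps($ops), "\n";
```

Note that the variable name $a never appears in the toy program: the ASSIGN and the final ECHO refer to it only as slot 0.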

You might ask: where did our $a go?

To answer that, we need to look at operands. Each operand consists of the following two parts:

a) op_type: one of IS_CONST, IS_TMP_VAR, IS_VAR, IS_UNUSED, or IS_CV

b) u: a union that holds the value (const) or lvalue (var) of this operand, stored in a different form depending on op_type.

As for the var types, each one is different.

IS_TMP_VAR: as the name implies, this is a temporary variable. It saves the result of one op_array entry so that a following entry can use it. The u of this operand holds a handle (an integer) that points into the variable table. Operands of this kind are generally written with a leading ~; for example, ~0 denotes the temporary variable at position 0 of the variable table.

IS_VAR: this is a variable in our usual sense. Such operands are written with a leading $.

IS_CV: this denotes the cache mechanism used by the compiler in ZE2.1/PHP5.1 and later. A variable of this type holds the address of the variable it refers to. The first time a variable is referenced, it is cached as a CV (compiled variable), and later references to it no longer need to look it up in the active symbol table. CV variables are written with a leading !.

So it seems that our $a was optimized into !0.
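One way to picture how a name like $a turns into a slot number is a table that hands out the next free index the first time a name is seen and returns the same index afterwards. The sketch below illustrates only that idea; the cvSlot helper and the second variable b are hypothetical and are not taken from the engine.

```php
<?php
// Hypothetical compile-time helper: gives each variable name a slot number
// the first time it is seen and reuses that slot on later references.
function cvSlot(array &$slots, string $name): int
{
    if (!isset($slots[$name])) {
        $slots[$name] = count($slots); // next free slot
    }
    return $slots[$name];
}

$slots  = [];
$first  = cvSlot($slots, 'a'); // first reference to $a
$second = cvSlot($slots, 'b'); // a second, unrelated variable
$again  = cvSlot($slots, 'a'); // later reference to $a reuses its slot

printf("\$a => !%d, \$b => !%d, \$a again => !%d\n", $first, $second, $again);
```

Once the slot number is known, later references can index straight into the slot table instead of searching a symbol table by name.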
