From the previous articles, we learned that executing a PHP file on the server involves two major phases:
- The PHP program starts up and completes its basic preparation: it initializes PHP and the Zend engine and loads the registered extension modules.
- After initialization, the script file is read, and the Zend engine performs lexical and syntax analysis on it, compiles it into opcodes, and executes them. If an opcode cache such as APC is installed, the compilation step may be skipped, and the opcodes are read directly from the cache and executed.
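The cache shortcut in the second phase can be sketched as follows. This is a minimal Python illustration, not APC's implementation: `OpcodeCache`, `compile_script`, and `load` are hypothetical names, and the "compiler" simply treats each non-empty line as one instruction.

```python
class OpcodeCache:
    """Hypothetical opcode cache: maps a script path to its compiled form."""

    def __init__(self):
        self.entries = {}
        self.compile_count = 0  # how many times the compile step actually ran

    def compile_script(self, source):
        """Stand-in for lexing, parsing, and opcode generation."""
        self.compile_count += 1
        # Assumption for illustration: each non-empty line is one instruction.
        return [line.strip() for line in source.splitlines() if line.strip()]

    def load(self, path, source):
        """Return cached opcodes if present, otherwise compile and store them."""
        if path not in self.entries:
            self.entries[path] = self.compile_script(source)
        return self.entries[path]


cache = OpcodeCache()
first = cache.load("index.php", "echo 1;\necho 2;")
second = cache.load("index.php", "echo 1;\necho 2;")  # served from the cache
```

The second call never reaches `compile_script`, which is the saving an opcode cache is meant to provide.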
In the second phase, lexical analysis, syntax analysis, intermediate code generation, and intermediate code execution are collectively referred to as the Zend virtual machine. Unlike compiled languages such as Java and C#, PHP has no manual compilation step: a script runs without being compiled first, which is why we call PHP an interpreted language. Java has the Java Virtual Machine, which runs one language on many platforms; C# has the .NET virtual machine, which runs many languages on one platform; PHP likewise has its own Zend virtual machine. They are essentially the same thing: abstract computers. Each virtual machine defines another language at a lower level than the source language, with its own instruction set and its own memory management system. Each ultimately translates a higher-level language into a lower-level one and provides supporting services such as memory management and garbage collection, which relieves programmers of low-level details and leaves more time and effort for business logic.
In terms of abstraction, the Zend virtual machine sits at a higher level than the Java virtual machine. Higher here does not mean more powerful or more efficient; it only means that the Zend virtual machine is further from the real machine. In recent years, programming languages have kept evolving toward greater abstraction and further from the machine, without any fundamental change.
In this part we start from the origins of virtual machines, then describe the implementation of the Zend virtual machine and its key data structures, interspersed with an example of implementing a piece of syntax and a description of source code encryption and decryption.
Wikipedia defines a virtual machine as follows: in computer architecture, a virtual machine is a special kind of software that creates an environment between the computer platform and the end user, and the end user operates software within the environment it creates. In computer science, a virtual machine is a software implementation of a computer that can run programs like a real machine.
A virtual machine is an abstract computer with its own instruction set and its own memory management system. Languages implemented on such a virtual machine are more concise and easier to learn than languages at a lower level of abstraction.
How is a PHP file parsed? What intermediate code is generated, and how does it correspond to the actual PHP code? How is that intermediate code executed, and what intermediate data appears during execution? Can the virtual machine as a whole be optimized, and if so, how?
Zend Virtual Machine Architecture
Abstracting the implementation of the Zend virtual machine at the conceptual level, we can divide its architecture into three parts: the interpretation layer, the execution engine, and the intermediate data layer.
Zend Virtual Machine Architecture diagram
When a piece of PHP code enters the Zend virtual machine, it is processed in two steps: compilation and execution. For an interpreted language this is an innovative design, but the current implementation does not take it all the way. The two steps always run back to back within a single execution; unlike a compiled language such as Java, PHP does not produce an intermediate file that stores the compiled result. Compiling the script again on every run costs a great deal of performance. Caching solutions such as APC and eAccelerator exist, but they do not change the nature of the design: the two steps still cannot be separated and developed independently.
Interpretation Layer
The interpretation layer is where the Zend virtual machine performs compilation. It consists of three parts: lexical analysis, syntax analysis, and compilation into intermediate code. Lexical analysis takes the PHP source file to be executed, strips whitespace and comments, splits the text into tokens, and handles the program's hierarchical structure.
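The lexical step can be illustrated with a small tokenizer. This is a minimal Python sketch, not the scanner the Zend engine actually uses: the token names and the tiny set of recognized patterns are assumptions chosen for illustration.

```python
import re

# Hypothetical token kinds; real PHP has many more.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*|#[^\n]*|/\*.*?\*/"),
    ("WHITESPACE", r"\s+"),
    ("VARIABLE", r"\$[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"\d+"),
    ("STRING", r"'[^']*'"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[=+\-*/;()]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(source):
    """Split source into (kind, text) pairs, dropping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup not in ("WHITESPACE", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("$a = 1 + 2; // add\necho $a;")
```

The comment and all whitespace disappear from the result; only the tokens that matter to the grammar are passed on.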
Syntax analysis accepts the token sequence and performs actions according to the defined grammar rules. The Zend virtual machine uses bison, with the grammar described in Backus-Naur Form (BNF). Compilation then generates intermediate code from the result of syntax analysis, following the opcodes defined by the Zend virtual machine. In PHP 5.3.1 the Zend virtual machine supports 135 instructions (see the Zend/zend_vm_opcodes.h file). Whether the code is a simple output statement or a complex recursive program, the Zend virtual machine eventually translates all the PHP code we write into sequences of these 135 instructions, which the execution engine then executes in order.
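To make the path from grammar rules to instructions concrete, here is a minimal Python sketch for arithmetic expressions only. It uses a hand-written recursive-descent parser in place of a bison-generated one, and the grammar, the instruction names `ADD`, `SUB`, `MUL`, and the `~N` temporaries are all assumptions for illustration.

```python
import re


def compile_expr(source):
    """Compile an arithmetic expression into three-address instructions.

    Grammar in BNF (hypothetical, far smaller than PHP's real grammar):
        expr   ::= term   | expr "+" term | expr "-" term
        term   ::= factor | term "*" factor
        factor ::= NUMBER | "(" expr ")"
    """
    tokens = re.findall(r"\d+|[-+*()]", source)
    opcodes = []
    pos = 0
    temp_count = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def emit(name, op1, op2):
        nonlocal temp_count
        result = f"~{temp_count}"  # temporary that receives the result
        temp_count += 1
        opcodes.append((name, op1, op2, result))
        return result

    def factor():
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

    def term():
        left = factor()
        while peek() == "*":
            advance()
            left = emit("MUL", left, factor())
        return left

    def expr():
        left = term()
        while peek() in ("+", "-"):
            name = "ADD" if advance() == "+" else "SUB"
            left = emit(name, left, term())
        return left

    expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return opcodes


opcodes = compile_expr("1 + 2 * (3 - 4)")
```

Each grammar rule becomes one function, and each operator emits one instruction whose result lands in a fresh temporary that later instructions can reference.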
Intermediate Data Layer
When the Zend virtual machine executes a piece of PHP code, it needs memory to store many things: the intermediate code, PHP's built-in function list, the list of user-defined functions, PHP's built-in classes, user-defined classes, constants, objects created by the program, arguments passed to functions or methods, return values, local variables, and the intermediate results of operations. We refer to all the places where this data is stored as the intermediate data layer.
If PHP is attached to an Apache 2 server as a module, some data in the intermediate data layer, such as PHP's built-in function list, may be shared by multiple threads. Considering only a single process: when the process is created, PHP's built-in function list, class list, constant list, and so on are loaded. When the interpretation layer compiles PHP code, the user-defined functions, classes, and constants are added to those same lists; they differ from the built-in entries only in the values assigned to some fields of their structures.
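The idea of one shared table holding both built-in and user-defined entries can be sketched like this. It is a simplified Python model, not Zend's data structure: the type tags, the dictionary fields, and both helper functions are hypothetical.

```python
# Hypothetical type tags: both kinds of function live in the same table
# and differ only in the values of some fields.
INTERNAL_FUNCTION = 1
USER_FUNCTION = 2


def make_function_table():
    """Table as it might look right after startup: built-ins only."""
    return {
        "strlen": {"type": INTERNAL_FUNCTION, "handler": len, "opcodes": None},
        "strtoupper": {"type": INTERNAL_FUNCTION, "handler": str.upper, "opcodes": None},
    }


def declare_user_function(table, name, opcodes):
    """What the compile step might do on seeing a function declaration."""
    key = name.lower()  # PHP function names are case-insensitive
    if key in table:
        raise ValueError(f"Cannot redeclare {name}()")
    table[key] = {"type": USER_FUNCTION, "handler": None, "opcodes": opcodes}


function_table = make_function_table()
declare_user_function(function_table, "sayHello", [("ECHO", "hello"), ("RETURN", None)])
```

A built-in entry carries a native handler and no opcodes, while a user-defined entry carries compiled opcodes and no native handler.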
When the execution engine executes the generated intermediate code, a new execution data structure (zend_execute_data) is pushed onto the Zend virtual machine's stack. It holds a snapshot of the currently active symbol table, some local variables, and so on.
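A simplified model of this per-call execution data might look like the following Python sketch. The field names are loosely modeled on zend_execute_data, but the classes and their layout are assumptions; the real structure has more members.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExecuteData:
    """Hypothetical, simplified stand-in for zend_execute_data."""
    op_array: list                                     # instructions being run
    opline: int = 0                                    # current instruction index
    symbol_table: dict = field(default_factory=dict)   # local variables
    prev_execute_data: Optional["ExecuteData"] = None  # the caller's frame


class VMStack:
    """Keeps track of the frame that is currently executing."""

    def __init__(self):
        self.current = None

    def push(self, op_array):
        self.current = ExecuteData(op_array=op_array, prev_execute_data=self.current)
        return self.current

    def pop(self):
        finished = self.current
        self.current = finished.prev_execute_data
        return finished


vm = VMStack()
main_frame = vm.push(["ASSIGN", "DO_FCALL", "RETURN"])
main_frame.symbol_table["a"] = 1
callee = vm.push(["ECHO", "RETURN"])  # a function call creates a new frame
callee.symbol_table["a"] = 99         # separate scope from the caller's $a
vm.pop()                              # returning restores the caller's frame
```

Because each frame owns its symbol table, the callee's assignment leaves the caller's variable untouched.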
Execution Engine
The execution engine of the Zend virtual machine is a very simple implementation: it follows the intermediate code sequence (EX(opline)) and calls the corresponding handler step by step. The execution engine has no variable that, like a PC register, holds the next instruction. When the Zend virtual machine reaches an instruction and that instruction has finished all of its work, the instruction itself moves on to the next one by advancing the sequence pointer one position, so that the next instruction is executed. This repeats until a return statement is finally executed. In essence, this is a chain of nested function calls.
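The handler-driven stepping described above can be sketched in Python as follows. All names here are hypothetical, and the outer loop that merely calls the current handler is an assumption of this sketch; the point it shows is that each handler, not the loop, advances `opline`.

```python
class Frame:
    """Hypothetical, simplified execution state."""

    def __init__(self, op_array):
        self.op_array = op_array  # list of (handler, op1, op2, result)
        self.opline = 0           # position of the current instruction
        self.temps = {}           # temporaries such as "~0"
        self.output = []


def value_of(frame, operand):
    """Resolve an operand: strings name temporaries, anything else is a literal."""
    return frame.temps[operand] if isinstance(operand, str) else operand


def handle_mul(frame, op1, op2, result):
    frame.temps[result] = value_of(frame, op1) * value_of(frame, op2)
    frame.opline += 1  # the handler itself moves to the next instruction
    return 0


def handle_add(frame, op1, op2, result):
    frame.temps[result] = value_of(frame, op1) + value_of(frame, op2)
    frame.opline += 1
    return 0


def handle_echo(frame, op1, op2, result):
    frame.output.append(str(value_of(frame, op1)))
    frame.opline += 1
    return 0


def handle_return(frame, op1, op2, result):
    return 1  # non-zero tells the loop to stop


def execute(frame):
    """Call the current handler repeatedly; never touch opline here."""
    while True:
        handler, op1, op2, result = frame.op_array[frame.opline]
        if handler(frame, op1, op2, result) != 0:
            return


# Instructions for: echo 1 + 2 * 3;
frame = Frame([
    (handle_mul, 2, 3, "~0"),
    (handle_add, 1, "~0", "~1"),
    (handle_echo, "~1", None, None),
    (handle_return, None, None, None),
])
execute(frame)
```

Removing the `frame.opline += 1` line from any handler would make `execute` run that instruction forever, which shows that sequencing is the handlers' responsibility.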
Returning to the questions posed at the beginning: PHP turns a PHP file into intermediate code (opcodes) in three steps: lexical analysis, syntax analysis, and intermediate code generation. The generated intermediate code does not correspond one-to-one with the actual PHP code. It is generated from the user's PHP code according to PHP's grammar rules and some internal conventions, and it relies on certain global variables to pass data and to link instructions together. Execution proceeds through the intermediate code in order, step by step, relying on those global variables. When a function call or jump occurs, execution moves to a different offset, but it eventually returns to the point where it left off.