PHP autoload mechanism details
(1) Overview of the autoload Mechanism
When developing a system in object-oriented PHP, we usually store the implementation of each class in its own file. This makes classes easy to reuse and simplifies later maintenance, and it is one of the basic ideas of OO design. Before PHP 5, using a class meant including its file directly with include/require. The following is an example:
/* Person.class.php */
<?php
class Person {
    public $name, $age;

    function __construct($name, $age)
    {
        $this->name = $name;
        $this->age = $age;
    }
}
?>
/* no_autoload.php */
<?php
require_once("Person.class.php");
$person = new Person("Altair", 6);
var_dump($person);
?>
In this example, no_autoload.php needs the Person class, so it uses require_once to include the class file and can then instantiate a Person object directly.
However, as a project grows, this approach brings hidden problems. If a PHP file uses many other classes, it needs many require/include statements, which makes it easy to omit a needed file or include an unnecessary one. If a large number of files each depend on other classes, making sure every file includes the right class files becomes a nightmare.
PHP 5 provides a solution to this problem: class autoloading. The autoload mechanism lets a PHP program include a class file automatically at the moment the class is first used, rather than including all class files up front. This mechanism is also called lazy loading.
The following is an example of using the autoload mechanism to load the Person class:
/* autoload.php */
<?php
function __autoload($classname) {
    require_once($classname . ".class.php");
}
$person = new Person("Altair", 6);
var_dump($person);
?>
When PHP 5 uses a class that has not been loaded yet, it automatically runs the __autoload() function, in which we can load the required class. In this simple example, we append the extension ".class.php" to the class name to form the class file name and then load it with require_once. The example shows that an autoloader must do at least three things: first, determine the class file name from the class name; second, determine the disk path of the class file (in our example, the simplest case, the class files sit in the same folder as the PHP file that uses them); third, load the class from the disk file. The third step is the simplest: include/require is all it takes. The first and second steps require agreeing, during development, on how class names map to disk files. Only then can the corresponding file be found from the class name.
Therefore, when many class files need to be included, we only need to settle on the mapping rules and then, inside __autoload(), map the class name to the actual disk file to get lazy loading. This also shows that the most important part of implementing __autoload() is the mapping rule between class names and disk files.
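The three steps can be separated out in code. The following is a minimal sketch, assuming the ".class.php" convention from the example; the helper names classToPath and loadClassFile and the directory /srv/app/classes are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical helper: steps 1 and 2, mapping a class name to a disk path.
function classToPath(string $classname, string $baseDir): string
{
    // Step 1: derive the file name from the class name.
    $fileName = $classname . '.class.php';
    // Step 2: decide where on disk that file lives.
    return rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . $fileName;
}

// Hypothetical helper: step 3, loading the file if the mapping points at one.
function loadClassFile(string $classname, string $baseDir): bool
{
    $path = classToPath($classname, $baseDir);
    if (!is_file($path)) {
        return false; // the mapping rule produced no usable file
    }
    require_once $path;
    return true;
}

// Prints the path the rule produces for Person under an assumed base directory.
echo classToPath('Person', '/srv/app/classes'), PHP_EOL;
```

Keeping the mapping in its own function makes the rule easy to inspect and change without touching the code that performs the include.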
But a problem arises. If a system uses many third-party class libraries, those libraries may have been written by different developers with different rules for mapping class names to disk files. To autoload all of them, __autoload() would have to implement every one of those rules, so it could become very complex or even impossible to write. At best it ends up bloated, which hurts both future maintenance and system efficiency. Is there really no simpler, clearer solution? Of course there is. Before looking at it, let's see how the autoload mechanism is implemented inside PHP.
(2) Implementation of PHP's autoload Mechanism
PHP executes a file in two separate steps. First, the file is compiled into a sequence of bytecode instructions commonly called opcodes (in practice, into a structure called zend_op_array). Second, a virtual machine executes these opcodes. All PHP behavior is implemented by opcodes. So to study how autoload is implemented, we compile autoload.php into opcodes and then examine what PHP does with them:
/* The compiled opcode listing of autoload.php, generated with OPDUMP, a tool developed by the author.
 * It can be downloaded from the website http://www.phpinternals.com/.
 */
 1: <?php
 2: // require_once("Person.php");
 3:
 4: function __autoload($classname) {
        0  NOP
        0  RECV               1
 5:     if (!class_exists($classname)) {
        1  SEND_VAR           !0
        2  DO_FCALL           'class_exists' [extval:1]
        3  BOOL_NOT           $0 => RES[~1]
        4  JMPZ               ~1, ->8
 6:         require_once($classname . ".class.php");
        5  CONCAT             !0, '.class.php' => RES[~2]
        6  INCLUDE_OR_EVAL    ~2, REQUIRE_ONCE
 7:     }
        7  JMP                ->8
 8: }
        8  RETURN             null
 9:
10: $p = new Person('fred', 35);
        1  FETCH_CLASS        'person' => RES[:0]
        2  NEW                :0 => RES[$1]
        3  SEND_VAL           'fred'
        4  SEND_VAL           35
        5  DO_FCALL_BY_NAME   [extval:2]
        6  ASSIGN             !0, $1
11:
12: var_dump($p);
        7  SEND_VAR           !0
        8  DO_FCALL           'var_dump' [extval:1]
13: ?>
Line 10 of autoload.php instantiates an object of class Person, so the autoload mechanism must show up in the opcodes compiled from that line. Those opcodes show that instantiating Person first executes the FETCH_CLASS instruction, so we start our exploration with how PHP handles FETCH_CLASS.
By checking the PHP source code (PHP 5.3alpha2 is used), you can find the following call sequence:
ZEND_VM_HANDLER(109, ZEND_FETCH_CLASS, ...) (zend_vm_def.h, line 1864)
=> zend_fetch_class (zend_execute_API.c, line 1434)
=> zend_lookup_class_ex (zend_execute_API.c, line 964)
=> zend_call_function(&fcall_info, &fcall_cache) (zend_execute_API.c, line 1040)
Before calling the last step, let's take a look at the key parameters during the call:
/* Set the autoload_function variable to "__autoload" */
fcall_info.function_name = &autoload_function; // Ooops, finally found "__autoload"
...
fcall_cache.function_handler = EG(autoload_func); // autoload_func!
zend_call_function is one of the most important functions in the Zend Engine. Its main job is to execute user-defined PHP functions or PHP library functions. It takes two important pointer parameters, fcall_info and fcall_cache, which point to the structures zend_fcall_info and zend_fcall_info_cache respectively. Its main workflow is as follows: if fcall_cache.function_handler is NULL, look up the function named by fcall_info.function_name and execute it if it exists; if fcall_cache.function_handler is not NULL, execute the function it points to directly.
Now we know what PHP always does when instantiating an object (and, in fact, also when implementing an interface, using a class constant or a static class variable, or calling a static method): it first checks whether the class (or interface) exists, and if not, tries to load it through the autoload mechanism. The main execution flow of the autoload mechanism is:
(1) Check whether the executor's global function pointer autoload_func is NULL.
(2) If autoload_func == NULL, check whether an __autoload() function is defined. If not, report an error and exit.
(3) If __autoload() is defined, execute it to load the class and return the loading result.
(4) If autoload_func is not NULL, call the function it points to directly to load the class. Note that in this case PHP does not check whether __autoload() is defined.
So PHP actually provides two ways to implement an autoloader. One, described earlier, is the user-defined __autoload() function, usually written in PHP source code. The other is to write a function and point the autoload_func pointer at it, which is usually done in C inside a PHP extension. If both an __autoload() function and autoload_func are in place (that is, autoload_func points at some function), only the function behind autoload_func is executed.
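The decision flow can be modeled in a few lines of PHP. This is only a sketch of the logic described above, not the engine's C code: $autoloadFunc stands in for the executor's autoload_func pointer and $userAutoload for a user-defined __autoload(), and the function name modelAutoload is hypothetical. A stand-in callable is used because __autoload() itself was deprecated in PHP 7.2 and removed in PHP 8.0.

```php
<?php
// Model only: mirrors steps (1) to (4), with callables standing in for
// the engine's autoload_func pointer and the user's __autoload().
function modelAutoload(?callable $autoloadFunc, ?callable $userAutoload, string $class): string
{
    if ($autoloadFunc === null) {
        if ($userAutoload === null) {
            return "error: class $class not found";
        }
        $userAutoload($class);
        return '__autoload() ran';
    }
    // autoload_func is set, so __autoload() is never consulted.
    $autoloadFunc($class);
    return 'autoload_func ran';
}

$noop = function (string $class): void {
    // A real loader would include the class file here.
};

echo modelAutoload(null, null, 'Person'), PHP_EOL;
echo modelAutoload(null, $noop, 'Person'), PHP_EOL;
echo modelAutoload($noop, $noop, 'Person'), PHP_EOL;
```

The third call shows the precedence rule: with both in place, only the function behind autoload_func runs.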
(3) Implementation of the SPL autoload Mechanism
SPL stands for Standard PHP Library. It is an extension library introduced in PHP 5 whose main features include an autoload implementation and various Iterator interfaces and classes. SPL implements autoloading by pointing the function pointer autoload_func at its own loader functions. There are two of them, spl_autoload and spl_autoload_call, and pointing autoload_func at one or the other yields different autoloading behavior.
spl_autoload is SPL's default autoload function, and it is fairly simple. It takes two parameters. The first, $class_name, is the class name. The second, $file_extensions, is optional and gives the class file extensions; several extensions can be listed, separated by commas. If it is omitted, the default extensions .inc and .php are used. spl_autoload first converts $class_name to lowercase and then searches every include path for $class_name.inc or $class_name.php (when $file_extensions is not given), loading the file if it finds one. You can call spl_autoload("Person", ".class.php") by hand to load the Person class. In effect it works like require/include, except that it can try several extensions.
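A manual call might look like the sketch below. The scratch directory and the class name DemoPerson are made up for the example; the file name is lowercase because spl_autoload lowercases the class name before searching.

```php
<?php
// Scratch directory and class are hypothetical, created only for this demo.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . DIRECTORY_SEPARATOR . 'demoperson.class.php',
    '<?php class DemoPerson { public $name = "Altair"; }'
);

// spl_autoload() searches the include path, so the directory is added to it.
set_include_path($dir . PATH_SEPARATOR . get_include_path());

// Try ".inc" first, then ".class.php"; the list is comma-separated, no spaces.
spl_autoload('DemoPerson', '.inc,.class.php');

// Second argument false: report the current state without triggering autoload.
var_dump(class_exists('DemoPerson', false));
```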
How do we make spl_autoload take effect automatically, that is, point autoload_func at spl_autoload? The answer is the spl_autoload_register function. Calling spl_autoload_register() with no arguments the first time it is used in a script points autoload_func at spl_autoload.
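As a minimal sketch, assuming class files follow a lowercase ".class.php" naming scheme, the default loader can be switched on like this. The directory and the class DemoAnimal are hypothetical; spl_autoload_extensions sets the extensions the default loader tries.

```php
<?php
// Hypothetical scratch directory holding one lowercase-named class file.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_reg_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . DIRECTORY_SEPARATOR . 'demoanimal.class.php',
    '<?php class DemoAnimal { public $legs = 4; }'
);
set_include_path($dir . PATH_SEPARATOR . get_include_path());

spl_autoload_extensions('.class.php'); // extensions the default loader tries
spl_autoload_register();               // no argument: registers spl_autoload()

$animal = new DemoAnimal();            // first use triggers the default loader
var_dump($animal->legs);
```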
From the description above, spl_autoload is fairly simple, and since it is implemented inside the SPL extension we cannot extend it. What if we want a more flexible autoloading mechanism? This is where spl_autoload_call comes in.
Let's see what is special about spl_autoload_call. The SPL module has a global variable, autoload_functions, which is really a HashTable but can simply be thought of as a linked list. Each element of the list is a function pointer to a function capable of loading classes. The implementation of spl_autoload_call is very simple: it runs the functions in the list in order, and after each one checks whether the required class has now been loaded. If so, it returns immediately without running the remaining functions. If every function has run and the class is still not loaded, spl_autoload_call simply returns without reporting an error to the user. So using the autoload mechanism does not guarantee that a class will actually be loaded; that depends on how the autoload functions are written.
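This behavior can be observed from PHP code. The sketch below registers three loaders that record when they are called; the class names are made up, and eval() stands in for a require of a real class file so the example stays self-contained.

```php
<?php
$calls = [];

// First loader: records the call but does not know the class.
spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = "first:$class";
});

// Second loader: defines the class (eval stands in for require here).
spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = "second:$class";
    if ($class === 'ChainDemo') {
        eval('class ChainDemo {}');
    }
});

// Third loader: not reached for ChainDemo, because the second one succeeds.
spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = "third:$class";
});

new ChainDemo();
print_r($calls);

// A class no loader can provide: all three run, and class_exists() returns false.
var_dump(class_exists('NoSuchClassAnywhere'));
```

For ChainDemo only the first two loaders appear in $calls; for the unknown class all three are tried before the lookup gives up.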
Who maintains the autoload_functions list? The spl_autoload_register function mentioned above. It registers user-defined autoload functions in the list and points autoload_func at spl_autoload_call (there is one exception, which is left for the reader to work out). Registered functions can be removed from the list with spl_autoload_unregister.
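Registration and removal can be checked with spl_autoload_functions(), which returns the currently registered loaders. A minimal sketch, with a hypothetical do-nothing loader:

```php
<?php
// Hypothetical loader that does nothing; a real one would require a file.
$loader = function (string $class): void {
};

// Older PHP versions may return false when no queue is active, hence "?: []".
$isRegistered = function () use ($loader): bool {
    return in_array($loader, spl_autoload_functions() ?: [], true);
};

spl_autoload_register($loader);
$afterRegister = $isRegistered();

spl_autoload_unregister($loader);
$afterUnregister = $isRegistered();

var_dump($afterRegister, $afterUnregister);
```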
As noted in the previous section, when autoload_func is not NULL, __autoload() is no longer run automatically. Now that autoload_func points at spl_autoload_call, what if we still want __autoload() to take part? Call spl_autoload_register('__autoload') to register it in the autoload_functions list.
Returning to the question at the end of section 1, we now have a solution: write one autoload function per class library, following that library's own naming rules, and register each with spl_autoload_register in the SPL autoload queue. That way there is no need to maintain one very complex __autoload function.
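As an illustration, suppose two libraries use different naming rules. Both libraries, their directory layout, and their class names below are invented for this sketch: library A keeps "<Class>.class.php" files in one flat directory, while library B maps underscores in the class name to subdirectories.

```php
<?php
// Build two hypothetical libraries in a scratch directory.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'libs_' . uniqid();
mkdir("$base/liba", 0777, true);
mkdir("$base/libb/Util", 0777, true);
file_put_contents("$base/liba/LibaPerson.class.php", '<?php class LibaPerson {}');
file_put_contents("$base/libb/Util/Str.php", '<?php class Libb_Util_Str {}');

// Rule for library A: "<Class>.class.php" in one flat directory.
spl_autoload_register(function (string $class) use ($base): void {
    $file = "$base/liba/$class.class.php";
    if (is_file($file)) {
        require_once $file;
    }
});

// Rule for library B: "Libb_Util_Str" maps to "libb/Util/Str.php".
spl_autoload_register(function (string $class) use ($base): void {
    if (strpos($class, 'Libb_') !== 0) {
        return; // not a library B class; leave it to the other loaders
    }
    $relative = str_replace('_', '/', substr($class, strlen('Libb_')));
    $file = "$base/libb/$relative.php";
    if (is_file($file)) {
        require_once $file;
    }
});

var_dump(class_exists('LibaPerson'), class_exists('Libb_Util_Str'));
```

Each loader knows only its own rule and returns quietly for classes it does not recognize, so adding a third library means registering a third small function rather than editing a shared one.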
(4) autoload efficiency problems and countermeasures
On first meeting the autoload mechanism, many people assume it will reduce system efficiency, and some even advise against it for that reason. Having seen how autoload is implemented, we know that the mechanism itself does not hurt efficiency and may even improve it, because classes that are never needed are never loaded.
So why do people find that autoload slows things down? The cost comes from the autoload function the user writes. If it cannot map a class name to the actual disk file efficiently (the actual file on disk, not just the file name), the system has to test whether many candidate files exist, searching each directory in the include path. Checking whether a file exists requires disk I/O, and disk I/O is slow. That is the real culprit behind slow autoloading.
Therefore, a system design should define a clear rule for mapping class names to actual disk files. The simpler and clearer the rule, the more efficient the autoload mechanism.
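The difference can be made concrete by counting filesystem checks. The sketch below contrasts a loader that probes several directories with one that uses a precomputed class map. A class map is one possible way to get a direct mapping, not something the text above prescribes, and all names and directories here are hypothetical.

```php
<?php
// Probing lookup: tries every directory, one filesystem check per candidate.
function probeForClass(string $class, array $dirs, int &$probes): ?string
{
    foreach ($dirs as $dir) {
        $probes++;
        $candidate = $dir . DIRECTORY_SEPARATOR . $class . '.class.php';
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Map lookup: a single array access, with no filesystem check at all.
function lookupClass(string $class, array $classMap): ?string
{
    return $classMap[$class] ?? null;
}

// Scratch layout: three directories, with the class file in the last one.
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'map_' . uniqid();
$dirs = ["$root/a", "$root/b", "$root/c"];
foreach ($dirs as $d) {
    mkdir($d, 0777, true);
}
$target = $dirs[2] . DIRECTORY_SEPARATOR . 'MapDemo.class.php';
file_put_contents($target, '<?php class MapDemo {}');

$probes = 0;
$found  = probeForClass('MapDemo', $dirs, $probes);
$mapped = lookupClass('MapDemo', ['MapDemo' => $target]);

echo "probes needed: $probes", PHP_EOL; // prints "probes needed: 3"
```

Both lookups arrive at the same path, but the probing version pays one disk check per directory searched, which is the cost the paragraph above warns about.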
Conclusion: the autoload mechanism is not inherently inefficient. Only misuse of autoload and poorly designed autoload functions reduce efficiency.