An in-depth look at PHP's autoload and spl_autoload automatic loading mechanisms
Type: Reprint. Date: 2013-06-05
This article gives a detailed analysis of PHP's autoload and spl_autoload automatic loading mechanisms, for readers who need a reference.
The PHP autoload mechanism in detail
(1) AutoLoad mechanism overview
When developing a system with PHP's OO model, it is customary to store the implementation of each class in a separate file, which makes the classes easy to reuse and simplifies future maintenance. This is also one of the basic ideas of OO design. Before PHP5, if you needed to use a class, you simply included it directly with include/require. The following is a practical example:
<?php
/* Person.class.php */
class Person {
    public $name, $age;

    function __construct($name, $age)
    {
        $this->name = $name;
        $this->age = $age;
    }
}
?>
<?php
/* no_autoload.php */
require_once("Person.class.php");

$person = new Person("Altair", 6);
var_dump($person);
?>
In this example, the no_autoload.php file needs the Person class, so it uses require_once to include it, and can then instantiate an object of the Person class directly.
As the project grows, however, this approach has hidden problems: if a PHP file needs many other classes, it needs a lot of require/include statements, and it becomes easy to omit a needed class file or include an unnecessary one. If a large number of files each use other classes, making sure every file includes the correct class files is a nightmare.
PHP5 provides a solution to this problem: the automatic loading (autoload) mechanism for classes. The autoload mechanism lets a PHP program include a class file only at the moment the class is used, rather than including all class files up front. This is also known as lazy loading.
The following is an example of using the autoload mechanism to load the Person class:
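<?php
/* autoload.php */
// Note: __autoload() was deprecated in PHP 7.2 and removed in PHP 8.0.
function __autoload($classname) {
    // Map the class name to its file, e.g. Person -> Person.class.php
    require_once($classname . ".class.php");
}

$person = new Person("Altair", 6);
var_dump($person);
?>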
Usually, when PHP5 uses a class and finds that the class has not been loaded, it automatically runs the __autoload() function, in which we can load the class we need. In our simple example, we append the extension ".class.php" directly to the class name to form the class file name, and then load it with require_once. From this example we can see that autoload must do at least three things: first, determine the class file name from the class name; second, determine the disk path of the class file (our example is the simplest case, where the class files and the PHP program that uses them are in the same folder); third, load the class from the disk file into the system. The third step is the simplest and only needs include/require. To implement the first and second steps, you must agree at development time on how class names map to disk files, so that the corresponding disk file can be found from the class name.
Therefore, when there are a large number of class files to include, we only need to settle on the corresponding rule and then, in the __autoload() function, map the class name to its actual disk file to achieve lazy loading. From this we can also see that the most important part of implementing __autoload() is implementing the mapping rule between class names and actual disk files.
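The mapping rule described above can be sketched as a small function. This is a minimal sketch, assuming the convention that class Foo lives in Foo.class.php under a single base directory. The names class_to_path and $baseDir are hypothetical and not from the original article. Because __autoload() is unavailable from PHP 8.0 on, the sketch registers the loader with spl_autoload_register(), which section (3) covers.

```php
<?php
// Hypothetical mapping rule: class "Person" -> "<baseDir>/Person.class.php".
function class_to_path($className, $baseDir)
{
    return rtrim($baseDir, '/\\') . DIRECTORY_SEPARATOR . $className . '.class.php';
}

// Demo setup: write a class file into a temporary directory.
$baseDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'autoload_demo_' . uniqid();
mkdir($baseDir);
file_put_contents(
    class_to_path('Person', $baseDir),
    '<?php class Person { public $name; function __construct($name) { $this->name = $name; } }'
);

// The loader performs the three steps: class name -> file name,
// file name -> disk path, then load the file.
spl_autoload_register(function ($className) use ($baseDir) {
    $path = class_to_path($className, $baseDir);
    if (is_file($path)) {
        require_once $path;
    }
});

$person = new Person('Altair'); // first use of Person triggers the loader
```

Keeping the rule in one small function means the loader itself stays trivial, and the rule can be changed in a single place.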
The problem now is this: a system may need to use many other class libraries, written by different developers, whose rules for mapping class names to actual disk files all differ. To autoload the files of every library, you would have to implement all of these mapping rules inside the __autoload() function, which can make it very complex or even impossible to implement. In the end the __autoload() function becomes bloated, and even if it can be written, it has a significant negative impact on future maintenance and on system efficiency. Is there really no simpler and clearer solution? Of course there is. Before looking at it, let's first see how the autoload mechanism is implemented in PHP.
(2) Implementation of PHP's autoload mechanism
We know that executing a PHP file is divided into two separate steps: first, the PHP file is compiled into a byte sequence commonly called opcodes (actually, into a byte array called zend_op_array); second, a virtual machine executes these opcodes. All of PHP's behavior is implemented by these opcodes. Therefore, to study how autoload is implemented in PHP, we compile the autoload.php file into opcodes and then use them to study what PHP does in the process:
/* Opcode listing of the compiled autoload.php, produced with the opdump tool developed by the author */
<?php
// require_once ("person.php");
function __autoload ($classname) {
        0  NOP
        0  RECV                1
    if (!class_exists ($classname)) {
        1  SEND_VAR            !0
        2  DO_FCALL            'class_exists' [extval:1]
        3  BOOL_NOT            =>RES[~1]
        4  JMPZ                ~1, ->8
        require_once ($classname . ".class.php");
        5  CONCAT              !0, '.class.php' =>RES[~2]
        6  INCLUDE_OR_EVAL     ~2, REQUIRE_ONCE
    }
        7  JMP                 ->8
}
        8  RETURN              NULL
$p = new Person ('Fred', 35);
        1  FETCH_CLASS         'Person' =>RES[:0]
        2  NEW                 :0 =>RES[$1]
        3  SEND_VAL            'Fred'
        4  SEND_VAL            35
        5  DO_FCALL_BY_NAME    [extval:2]
        6  ASSIGN              !0, $1
var_dump ($p);
        7  SEND_VAR            !0
        8  DO_FCALL            'var_dump' [extval:1]
?>
Line 10 of autoload.php is where we instantiate an object of the class Person, so the autoload mechanism must show up in the opcodes compiled for that line. From the opcodes generated for line 10 above, we know that the FETCH_CLASS instruction is executed first when instantiating a Person object. We therefore start our exploration from how PHP handles the FETCH_CLASS instruction.
By reading the PHP source code (I am using PHP 5.3alpha2), we can find the following call sequence:
ZEND_VM_HANDLER(109, ZEND_FETCH_CLASS, ...) (zend_vm_def.h, line 1864)
=> zend_fetch_class (zend_execute_API.c, line 1434)
=> zend_lookup_class_ex (zend_execute_API.c, line 964)
=> zend_call_function(&fcall_info, &fcall_cache) (zend_execute_API.c, line 1040)
Before the final step, let's take a look at the key parameters at the time of the call:
/* Set the value of the autoload_function variable to "__autoload" */
fcall_info.function_name = &autoload_function;     // Ooops, finally found "__autoload"
...
fcall_cache.function_handler = EG(autoload_func);  // autoload_func!
zend_call_function is one of the most important functions in the Zend engine. Its main job is to execute functions defined by the user in a PHP program, or PHP's own library functions. zend_call_function takes two important pointer parameters, fcall_info and fcall_cache, which point to two important structures: a zend_fcall_info and a zend_fcall_info_cache. Its main workflow is as follows: if the fcall_cache.function_handler pointer is NULL, it tries to find the function named by fcall_info.function_name and, if that function exists, executes it; if fcall_cache.function_handler is not NULL, the function it points to is executed directly.
Now we know that when PHP instantiates an object (and in fact also when it implements an interface, uses a class constant or a static variable of a class, or calls a static method of a class), it first checks whether the class (or interface) exists in the system, and if not, attempts to load it through the autoload mechanism. The main steps of the autoload mechanism are:
(1) Check whether the executor global function pointer autoload_func is NULL.
(2) If autoload_func == NULL, check whether a __autoload() function is defined in the system; if not, report an error and exit.
(3) If the __autoload() function is defined, execute __autoload() to try to load the class, and return the result.
(4) If autoload_func is not NULL, directly execute the function that autoload_func points to in order to load the class. Note that in this case PHP does not check whether __autoload() is defined.
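The four steps above can be modeled in userland PHP. This is only a sketch of the decision logic, not the engine's C code: the function name lookup_class_model is hypothetical, its $autoloadFunc parameter stands in for the engine's autoload_func pointer, and eval() stands in for requiring a class file.

```php
<?php
// Userland model of steps (1)-(4). $autoloadFunc plays the role of the
// engine's autoload_func pointer (null means "nothing registered").
function lookup_class_model($className, $autoloadFunc = null)
{
    if (class_exists($className, false)) {      // already loaded, nothing to do
        return true;
    }
    if ($autoloadFunc === null) {               // steps (1) and (2)
        if (!function_exists('__autoload')) {
            return false;                       // the engine would report an error here
        }
        $autoloadFunc = '__autoload';           // step (3): fall back to __autoload()
    }
    call_user_func($autoloadFunc, $className);  // step (3) or step (4)
    return class_exists($className, false);
}

// A stand-in loader; a real one would require a class file instead of eval().
$loader = function ($className) {
    if ($className === 'LazyThing') {
        eval('class LazyThing {}');
    }
};

$found   = lookup_class_model('LazyThing', $loader);
$missing = lookup_class_model('NoSuchClassInThisDemo', $loader);
```

Note how the branch for a non-null $autoloadFunc never looks at __autoload(), which mirrors step (4).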
So the truth is finally out: PHP provides two ways to implement automatic loading. One is the user-defined __autoload() function we have already discussed, which is usually implemented in the PHP source program. The other is to write a function and point the autoload_func pointer at it, which is usually done in a PHP extension written in C. If both __autoload() and autoload_func are in place (that is, autoload_func has been pointed at some function), only the autoload_func function is executed.
(3) Implementation of the SPL autoload mechanism
SPL is short for Standard PHP Library. It is an extension library introduced in PHP5 whose main features include an implementation of the autoload mechanism and various iterator interfaces and classes. SPL implements autoloading by pointing the function pointer autoload_func at its own functions that perform automatic loading. SPL has two such functions, spl_autoload and spl_autoload_call, and it implements different autoloading behavior by pointing autoload_func at one or the other.
spl_autoload is the default autoload function implemented by SPL, and its functionality is relatively simple. It accepts two parameters: the first, $class_name, is the class name; the second, $file_extensions, is optional and gives the class file extensions. Multiple extensions can be specified in $file_extensions, separated by commas; if it is not specified, the default extensions .inc and .php are used. spl_autoload first converts $class_name to lowercase and then searches all include paths for $class_name.inc or $class_name.php (if $file_extensions is not specified); if a file is found, it loads the class file. You can call spl_autoload("Person", ".class.php") manually to load the Person class. In effect it is much like require/include, except that it can try multiple extensions.
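A manual call to spl_autoload can be tried out as follows. This is a minimal sketch: the class name SplPerson and the temporary directory are made up for the demonstration, and the file name is written in lowercase because spl_autoload lowercases the class name before searching.

```php
<?php
// Demo setup: a temporary directory holding one class file.
// spl_autoload() lowercases the class name, so the file name is lowercase.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . DIRECTORY_SEPARATOR . 'splperson.class.php',
    '<?php class SplPerson {}'
);

// spl_autoload() searches the include path, so add the directory to it.
set_include_path($dir . PATH_SEPARATOR . get_include_path());

// Try ".class.php" first, then ".inc"; the extensions are comma-separated.
spl_autoload('SplPerson', '.class.php,.inc');

$loaded = class_exists('SplPerson', false); // false = do not trigger autoloading
```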
How do we make spl_autoload take effect automatically, that is, point autoload_func at spl_autoload? The answer is the spl_autoload_register function. The first time spl_autoload_register() is called in a PHP script without any arguments, autoload_func is pointed at spl_autoload.
From the description above we know that spl_autoload is relatively simple, and since it is implemented inside the SPL extension, we cannot extend it. What if you want your own, more flexible autoloading mechanism? This is where the spl_autoload_call function comes in.
Let's look at what is special about the implementation of spl_autoload_call. Inside the SPL module there is a global variable autoload_functions. It is essentially a HashTable, but we can simply think of it as a linked list in which each element is a pointer to a function that can autoload classes. The implementation of spl_autoload_call itself is simple: it executes each function in the list in order, and after each one checks whether the required class has been loaded. If it has, spl_autoload_call returns immediately without running the remaining functions. If every function in the list has run and the class is still not loaded, spl_autoload_call exits without reporting an error to the user. Therefore, using the autoload mechanism does not guarantee that a class will be loaded correctly; that depends on how your autoload functions are implemented.
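The chain behavior described above can be modeled in a few lines of userland PHP. This is a sketch of the logic only, not SPL's C implementation: autoload_call_model is a hypothetical name, its $loaders array stands in for autoload_functions, and eval() stands in for requiring a class file.

```php
<?php
// Userland model of spl_autoload_call: run each loader in order and stop
// as soon as the class exists. $loaders stands in for autoload_functions.
function autoload_call_model(array $loaders, $className)
{
    foreach ($loaders as $loader) {
        call_user_func($loader, $className);
        if (class_exists($className, false)) {
            return true;   // loaded: the remaining loaders are skipped
        }
    }
    return false;          // nobody could load it; no error is raised here
}

$calls = array();
$loaders = array(
    function ($c) use (&$calls) { $calls[] = 'first'; },                        // cannot load it
    function ($c) use (&$calls) { $calls[] = 'second'; eval("class $c {}"); },  // loads it
    function ($c) use (&$calls) { $calls[] = 'third'; },                        // never reached
);

$loaded = autoload_call_model($loaders, 'ChainDemo');
```

After the call, $calls records only the first two loaders, which shows that the chain stops at the first loader that succeeds.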
So who maintains the autoload function list autoload_functions? It is the spl_autoload_register function mentioned earlier. It registers a user-defined autoload function in the list and points the autoload_func function pointer at spl_autoload_call (note that there is one exception, which is left for the reader to think about). We can also remove a registered function from the autoload_functions list with the spl_autoload_unregister function.
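Registering and unregistering loaders can be observed with spl_autoload_functions(), which returns the currently registered list. A minimal sketch, with two hypothetical loader functions that do nothing:

```php
<?php
// Two placeholder loaders; real ones would map the class name to a file.
function demo_loader_a($className) { /* would map and require a file here */ }
function demo_loader_b($className) { /* a second, independent rule */ }

spl_autoload_register('demo_loader_a');
spl_autoload_register('demo_loader_b');
$afterRegister = spl_autoload_functions();   // contains both loaders

spl_autoload_unregister('demo_loader_a');
$afterUnregister = spl_autoload_functions(); // demo_loader_a is gone
```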
As mentioned in the previous section, when the autoload_func pointer is not NULL, the __autoload() function is no longer executed automatically. Now that autoload_func points to spl_autoload_call, what should we do if we still want __autoload() to take effect? Again, call spl_autoload_register('__autoload') to register it in the autoload_functions list.
Now, returning to the problem at the end of the first section, we have a solution: implement a separate autoload function for each class library according to its own naming rule, and register each of them in the SPL autoload function queue with spl_autoload_register. This way we no longer have to maintain one very complex __autoload function.
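The solution can be sketched with two loaders, one per library. Everything specific here is an assumption for illustration: the directory names liba and libb, the class names Customer and Util_Logger, and the two naming rules are hypothetical.

```php
<?php
// Demo setup: two hypothetical libraries with different naming rules.
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'libs_demo_' . uniqid();
mkdir($root . '/liba', 0777, true);
mkdir($root . '/libb/Util', 0777, true);
file_put_contents($root . '/liba/Customer.class.php', '<?php class Customer {}');
file_put_contents($root . '/libb/Util/Logger.php', '<?php class Util_Logger {}');

// Library A rule: Customer -> liba/Customer.class.php
spl_autoload_register(function ($className) use ($root) {
    $path = $root . '/liba/' . $className . '.class.php';
    if (is_file($path)) {
        require_once $path;
    }
});

// Library B rule: Util_Logger -> libb/Util/Logger.php
// (underscores in the class name become directory separators)
spl_autoload_register(function ($className) use ($root) {
    $path = $root . '/libb/' . str_replace('_', '/', $className) . '.php';
    if (is_file($path)) {
        require_once $path;
    }
});

$customer = new Customer();    // resolved by the first loader
$logger   = new Util_Logger(); // first loader finds nothing, second one loads it
```

Each loader knows only its own rule and quietly does nothing for classes it does not recognize, so the next loader in the queue gets its turn.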
(4) Autoload efficiency problems and countermeasures
When the autoload mechanism comes up, many people's first reaction is that it will reduce the efficiency of the system, and some simply propose avoiding autoload for the sake of efficiency. Once we understand how autoload is implemented, we know that the mechanism itself is not what slows the system down. It may even improve efficiency, because classes that are not needed are never loaded.
So why do so many people have the impression that autoload reduces system efficiency? What actually determines the efficiency of the autoload mechanism is the user-designed autoload function. If it cannot efficiently match a class name to the actual disk file (note that this means the actual disk file, not just the file name), the system has to perform a large number of file-existence checks (searching every directory in the include path). As is well known, checking whether a file exists involves disk I/O, which is slow, and this is the real culprit behind slow autoloading.
Therefore, when designing a system, we need to define a clear mechanism for mapping class names to actual disk files. The simpler and clearer the rule, the more efficient the autoload mechanism.
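One way to keep the mapping cheap is a precomputed class map, so that the loader does a single array lookup instead of probing directories. This is a sketch of that idea rather than something the original article prescribes; the $classMap array and the Invoice class are hypothetical.

```php
<?php
// Demo setup: one class file in a temporary directory.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'classmap_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/Invoice.php', '<?php class Invoice {}');

// Hypothetical class map: class name -> absolute file path, built once.
$classMap = array(
    'Invoice' => $dir . '/Invoice.php',
);

spl_autoload_register(function ($className) use ($classMap) {
    // One array lookup; no include-path search and no file-existence probing.
    if (isset($classMap[$className])) {
        require $classMap[$className];
    }
});

$invoice = new Invoice();
```

The trade-off is that the map has to be regenerated whenever class files are added or moved.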
Conclusion: The autoload mechanism is not inherently inefficient. Only misuse of autoload, or a badly designed autoload function, reduces its efficiency.