I. Overview of the AUTOLOAD mechanism
When developing a system in PHP using object-oriented (OO) design, it is customary to store the implementation of each class in its own file. This makes classes easy to reuse and simplifies future maintenance, and it is one of the basic ideas of OO design. Before PHP5, if you needed to use a class, you simply pulled in its file with include or require.
The following is a practical example:
/* Person.class.php */
<?php
class Person {
    var $name, $age;

    function __construct($name, $age)
    {
        $this->name = $name;
        $this->age = $age;
    }
}
?>
/* no_autoload.php */
<?php
require_once("Person.class.php");

$person = new Person("Altair", 6);
var_dump($person);
In this example, no_autoload.php needs the Person class, so it includes the class file with require_once and can then instantiate Person objects directly.
As a project grows, however, this approach runs into hidden problems. If a PHP file uses many other classes, it needs many require/include statements, which makes it easy to omit one or to include class files that are never used. If a large number of files depend on other classes, making sure that every file includes exactly the right class files becomes a nightmare.
PHP5 provides a solution to this problem: the class autoloading (autoload) mechanism. With autoload, a PHP program includes a class file automatically at the moment the class is first used, instead of including all class files up front. This is known as lazy loading.
The following is an example of using the autoload mechanism to load the Person class:
/* autoload.php */
<?php
function __autoload($classname) {
    // Build the file name from the class name plus the ".class.php" extension.
    require_once($classname . ".class.php");
}

$person = new Person("Altair", 6);
var_dump($person);
?>
When a class is used and PHP5 finds that it has not been loaded yet, it automatically runs the __autoload() function, in which we can load the class that is needed. In our simple example, we append the extension ".class.php" to the class name to form the class file name and then load it with require_once. From this example we can see that autoload has to do at least three things. First, determine the class file name from the class name. Second, determine the disk path where the class file is located (our example is the simplest case: the class files sit in the same folder as the PHP program that uses them). Third, load the class from the disk file into the system. The third step is the simplest and only needs include/require. To accomplish the first two steps, a mapping between class names and disk files must be agreed on during development; only then can we find the corresponding disk file from a class name.
Therefore, when there are a large number of class files to include, we only need to settle on suitable rules and then, in the __autoload() function, map the class name to the actual disk file to achieve lazy loading. From this we can also see that the most important part of implementing __autoload() is implementing the mapping rules between class names and actual disk files.
But here is the problem: a system may need many other class libraries, written by different developers, each with its own rules for mapping class names to disk files. To autoload all of these libraries, every mapping rule has to be implemented inside the single __autoload() function, which can make it very complex or even impossible to write. The end result may be a bloated __autoload() function that, even if it can be implemented, hurts future maintenance and system efficiency. Is there really no simpler, clearer solution? Of course there is. Before we look at it, let's see how the autoload mechanism is implemented inside PHP.
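As a minimal sketch of such a mapping rule, the hypothetical function below turns a class name into a file path. The convention it assumes (underscores in the class name become directory separators, files end in ".class.php", everything lives under one base directory) and the name class_to_path are illustrations only, not something PHP prescribes.

```php
<?php
// Hypothetical mapping rule: Lib_Db_Connection -> <base>/Lib/Db/Connection.class.php
function class_to_path($classname, $basedir)
{
    // Step 1: derive the relative file name from the class name.
    $relative = str_replace('_', DIRECTORY_SEPARATOR, $classname) . '.class.php';
    // Step 2: decide where on disk the file lives.
    return rtrim($basedir, '/\\') . DIRECTORY_SEPARATOR . $relative;
}

// Step 3 would be a plain require_once on the returned path.
echo class_to_path('Lib_Db_Connection', '/srv/app/classes'), "\n";
```

Keeping the rule in one small function makes it easy to see how many disk locations a lookup can touch: here, exactly one.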
II. The implementation of PHP's autoload mechanism
We know that executing a PHP file involves two separate steps. The first compiles the PHP file into a sequence of bytecode commonly called opcodes (it is actually compiled into a byte array called zend_op_array); the second has a virtual machine execute those opcodes. Everything PHP does is implemented by these opcodes. Therefore, to study how autoload is implemented in PHP, we compile the autoload.php file into opcodes and then use them to see what PHP does in the process:
/* Opcode listing of the compiled autoload.php, generated with the OPDUMP tool
 * developed by the author. The tool can be downloaded from http://www.phpinternals.com/.
 */
<?php
// require_once("Person.php");

function __autoload($classname) {
        0  NOP
        0  RECV                1
    if (!class_exists($classname)) {
        1  SEND_VAR            !0
        2  DO_FCALL            'class_exists' [extval:1]
        3  BOOL_NOT            $0 =>RES[~1]
        4  JMPZ                ~1, ->8
        require_once($classname . ".class.php");
        5  CONCAT              !0, '.class.php' =>RES[~2]
        6  INCLUDE_OR_EVAL     ~2, REQUIRE_ONCE
    }
        7  JMP                 ->8
}
        8  RETURN              null

$p = new Person('Fred', 35);
        1  FETCH_CLASS         'Person' =>RES[:0]
        2  NEW                 :0 =>RES[$1]
        3  SEND_VAL            'Fred'
        4  SEND_VAL            35
        5  DO_FCALL_BY_NAME    [extval:2]
        6  ASSIGN              !0, $1

var_dump($p);
        7  SEND_VAR            !0
        8  DO_FCALL            'var_dump' [extval:1]
?>
On line 10 of autoload.php, the statement $p = new Person('Fred', 35); instantiates an object of class Person. The autoload mechanism must therefore show up in the opcodes compiled from that line. From the opcodes generated for line 10 above, we can see that instantiating a Person object first executes the FETCH_CLASS instruction. Our exploration starts with how PHP handles the FETCH_CLASS instruction.
By reading the PHP source code (I am using PHP 5.3 alpha2), we can find the following call sequence:
ZEND_VM_HANDLER(109, ZEND_FETCH_CLASS, ...) (zend_vm_def.h, line 1864)
=> zend_fetch_class() (zend_execute_API.c, line 1434)
=> zend_lookup_class_ex() (zend_execute_API.c, line 964)
=> zend_call_function(&fcall_info, &fcall_cache) (zend_execute_API.c, line 1040)
Before the last call, let's take a look at the key parameters of the call:
/* Set the value of the autoload_function variable to "__autoload" */
fcall_info.function_name = &autoload_function;    // Oops, finally found "__autoload"
...
fcall_cache.function_handler = EG(autoload_func); // autoload_func!
zend_call_function is one of the most important functions in the Zend engine. Its main job is to execute a user-defined function from a PHP program or one of PHP's own library functions. It takes two important pointer parameters, fcall_info and fcall_cache, which point to two important structures: zend_fcall_info and zend_fcall_info_cache. The main workflow of zend_call_function is as follows: if the fcall_cache.function_handler pointer is NULL, it tries to find the function named by fcall_info.function_name and executes it if it exists; if fcall_cache.function_handler is not NULL, it directly executes the function that fcall_cache.function_handler points to.
We now know that when PHP instantiates an object (or, in fact, implements an interface, uses a class constant or a static variable of a class, or calls a static method of a class), it first checks whether the class (or interface) exists in the system and, if it does not, tries to load it through the autoload mechanism. The main steps of the autoload mechanism are:
(1) Check whether the executor global function pointer autoload_func is NULL.
(2) If autoload_func == NULL, check whether an __autoload() function is defined in the system; if not, report an error and exit.
(3) If the __autoload() function is defined, execute __autoload() to try to load the class and return the result of the load.
(4) If autoload_func is not NULL, execute the function that autoload_func points to in order to load the class. Note that in this case PHP does not check whether __autoload() is defined.
The truth is finally out: PHP provides two ways to implement automatic loading. The first, mentioned earlier, is the user-defined __autoload() function, which is usually implemented in PHP source code. The second is to write a function and point the autoload_func pointer at it, which is typically done in a PHP extension written in C. If both are present, that is, __autoload() is defined and autoload_func has also been set (pointing autoload_func at some function), only the function behind autoload_func is executed.
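The four steps above can be sketched as a toy model written in PHP. This is not the engine's C code: the function name model_lookup and both parameters are illustrative stand-ins, with $autoload_func playing the role of EG(autoload_func) and $user_autoload the role of a user-defined __autoload().

```php
<?php
// Toy model of the engine's decision. All names here are illustrative only.
function model_lookup($class, $autoload_func, $user_autoload)
{
    if ($autoload_func === null) {                    // step (1)
        if ($user_autoload === null) {
            return "error: class $class not found";   // step (2)
        }
        return $user_autoload($class);                // step (3)
    }
    return $autoload_func($class);                    // step (4): __autoload is ignored
}

$viaUser = function ($c) { return "__autoload loads $c"; };
$viaPtr  = function ($c) { return "autoload_func loads $c"; };

echo model_lookup('Person', null, null), "\n";        // error: class Person not found
echo model_lookup('Person', null, $viaUser), "\n";    // __autoload loads Person
echo model_lookup('Person', $viaPtr, $viaUser), "\n"; // autoload_func loads Person
```

The last call shows the precedence rule: once the pointer is set, the user-level function is never consulted.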
III. The implementation of the SPL autoload mechanism
SPL is short for Standard PHP Library. It is an extension library introduced in PHP5 whose main features include an implementation of the autoload mechanism and a variety of iterator interfaces and classes. SPL implements autoload by pointing the function pointer autoload_func at its own autoloading function. SPL has two such functions, spl_autoload and spl_autoload_call, and it implements different autoloading behavior by pointing autoload_func at one or the other.
spl_autoload is the default autoload function that SPL implements, and its functionality is relatively simple. It accepts two parameters: the first, $class_name, is the class name; the second, $file_extensions, is optional and gives the extensions of the class file. Multiple extensions can be specified in $file_extensions, separated by commas; if it is omitted, the default extensions .inc and .php are used. spl_autoload first converts $class_name to lowercase and then searches every directory in the include path for $class_name.inc or $class_name.php (when $file_extensions is not specified); if a file is found, it is loaded. You can call spl_autoload("Person", ".class.php") by hand to load the Person class. In effect it is similar to require/include, except that it can try multiple extensions.
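A minimal sketch of calling spl_autoload by hand follows. To stay self-contained it writes a throwaway class file into the system temp directory; the directory name and the one-line Person class are assumptions made for the demonstration. Note that the file name is lowercase, because spl_autoload lowercases the class name before searching.

```php
<?php
// Create a throwaway directory holding person.class.php (lowercase file name).
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . DIRECTORY_SEPARATOR . 'person.class.php',
    '<?php class Person { public $name; function __construct($n) { $this->name = $n; } }'
);

// spl_autoload searches the include path, so add the directory to it.
set_include_path($dir . PATH_SEPARATOR . get_include_path());

// Load the class by hand, telling spl_autoload which extension to try.
spl_autoload('Person', '.class.php');

$p = new Person('Altair');
echo $p->name, "\n";
```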
How do we make spl_autoload run automatically, that is, point autoload_func at spl_autoload? The answer is the spl_autoload_register function. Calling spl_autoload_register() without any arguments for the first time in a PHP script points autoload_func at spl_autoload.
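As a hedged illustration of the argument-less call, the sketch below registers the default loader and lets it resolve a class on first use. The temp directory and the Animal class are made up for the example, and spl_autoload_extensions is used to tell the default loader to try ".class.php".

```php
<?php
// Throwaway directory with a class file named in lowercase, as spl_autoload expects.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'spl_reg_' . uniqid();
mkdir($dir);
file_put_contents($dir . DIRECTORY_SEPARATOR . 'animal.class.php', '<?php class Animal {}');

set_include_path($dir . PATH_SEPARATOR . get_include_path());
spl_autoload_extensions('.class.php');  // extensions tried by the default loader
spl_autoload_register();                // no argument: register spl_autoload itself

$a = new Animal();                      // first use triggers the default loader
echo get_class($a), "\n";
```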
From the description above we know that spl_autoload is relatively simple, and because it is implemented inside the SPL extension we cannot extend it. What if you want a more flexible autoloading mechanism of your own? This is where the spl_autoload_call function makes its entrance.
Let's look at what is special about the implementation of spl_autoload_call. Inside the SPL module there is a global variable autoload_functions. It is essentially a HashTable, but we can simply think of it as a list in which each element is a pointer to a function capable of autoloading classes. The implementation of spl_autoload_call itself is simple: it executes the functions in the list one after another and, after each one, checks whether the required class has been loaded. If it has, spl_autoload_call returns immediately and the remaining functions are not executed. If every function in the list has run and the class is still not loaded, spl_autoload_call returns without reporting an error to the user. Using the autoload mechanism therefore does not guarantee that a class will be loaded correctly; that depends on how your autoload functions are implemented.
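The sequential behavior can be observed from PHP code. The sketch below registers three loaders that record when they are called; the class names Demo_Widget and Demo_Missing are hypothetical, and eval stands in for a require_once of a real class file so that the example needs no files on disk.

```php
<?php
$log = array();

// First loader: records the call but never defines anything.
spl_autoload_register(function ($class) use (&$log) {
    $log[] = "first:$class";
});

// Second loader: defines the hypothetical class Demo_Widget on request.
spl_autoload_register(function ($class) use (&$log) {
    $log[] = "second:$class";
    if ($class === 'Demo_Widget') {
        eval('class Demo_Widget {}');  // stand-in for require_once of a class file
    }
});

// Third loader: only reached when the earlier loaders failed.
spl_autoload_register(function ($class) use (&$log) {
    $log[] = "third:$class";
});

var_dump(class_exists('Demo_Widget'));   // loaders 1 and 2 run, loader 3 is skipped
var_dump(class_exists('Demo_Missing'));  // all three run, and no error is raised
print_r($log);
```

The log shows that the third loader is never asked about Demo_Widget, and that the failed lookup of Demo_Missing ends quietly with class_exists returning false.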
So who maintains the autoload function list autoload_functions? The spl_autoload_register function mentioned earlier. It registers a user-defined autoload function in this list and points the autoload_func function pointer at spl_autoload_call (note that there is one exception, which is left for the reader to work out). We can also remove a registered function from the autoload_functions list with the spl_autoload_unregister function.
As the previous section said, when the autoload_func pointer is non-NULL, the __autoload() function is not executed automatically. Now that autoload_func points to spl_autoload_call, what should we do if we still want __autoload() to take effect? Call spl_autoload_register('__autoload') to register it in the autoload_functions list.
Now we can return to the question raised at the end of the first section, because we have the solution: implement a separate autoload function for each class library according to its own naming rules, and register each of them in the SPL autoload function queue with spl_autoload_register. That way we do not have to maintain one very complex __autoload() function.
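A minimal sketch of this approach, assuming two made-up libraries with different naming rules, might look like the following. The directory layout, the class names Logger and Mail_Sender, and both rules are hypothetical; the files are created in the temp directory only so that the example runs on its own.

```php
<?php
// Build two throwaway "libraries" with different file naming rules (both hypothetical).
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'libs_' . uniqid();
mkdir($base . '/liba', 0777, true);
mkdir($base . '/libb/Mail', 0777, true);
file_put_contents($base . '/liba/Logger.class.php', '<?php class Logger {}');
file_put_contents($base . '/libb/Mail/Sender.php', '<?php class Mail_Sender {}');

// Library A rule: ClassName -> liba/ClassName.class.php
spl_autoload_register(function ($class) use ($base) {
    $file = "$base/liba/$class.class.php";
    if (is_file($file)) {
        require_once $file;
    }
});

// Library B rule: Package_Class -> libb/Package/Class.php
spl_autoload_register(function ($class) use ($base) {
    $file = "$base/libb/" . str_replace('_', '/', $class) . '.php';
    if (is_file($file)) {
        require_once $file;
    }
});

$logger = new Logger();        // resolved by the first loader
$sender = new Mail_Sender();   // first loader finds nothing, second one loads it
echo get_class($logger), ' ', get_class($sender), "\n";
```

Each loader knows only its own library's rule, so adding a third library means registering one more small function rather than editing a shared one.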
IV. Autoload efficiency problems and countermeasures
When autoload comes up, many people's first reaction is that it reduces system efficiency, and some even propose not using autoload at all for the sake of performance. Once we understand how autoload is implemented, we can see that the mechanism itself is not what hurts efficiency. It may even improve efficiency, because classes that are never needed are never loaded.
So why do many people have the impression that autoload slows a system down? What actually determines the efficiency of autoload is the autoload function that the user designs. If it cannot map a class name to the actual disk file efficiently (note that this means the actual file on disk, not just the file name), the system has to perform a large number of file-existence checks, looking in every directory listed in the include path. Checking whether a file exists requires disk I/O, and disk I/O is well known to be slow. This is the real culprit behind inefficient autoloading.
Therefore, when designing a system, we need to define a clear mechanism for mapping class names to actual disk files. The simpler and clearer the rule, the more efficient autoload is. The autoload mechanism is not inherently inefficient; only misuse of autoload and badly designed autoload functions reduce its efficiency.
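One way to keep the mapping cheap, shown here only as a sketch, is an explicit class map: an array from class name to file path, so that resolving a class is a single array lookup instead of a search through several directories. The map contents, the classes/ directory and the helper name resolve_class_file are all hypothetical.

```php
<?php
// Hypothetical class map; a real project would generate or maintain it per library.
$classMap = array(
    'Person' => __DIR__ . '/classes/Person.class.php',
    'Order'  => __DIR__ . '/classes/Order.class.php',
);

// Resolve a class to its file with one array lookup and no directory scanning.
function resolve_class_file($class, array $map)
{
    return isset($map[$class]) ? $map[$class] : null;
}

spl_autoload_register(function ($class) use ($classMap) {
    $file = resolve_class_file($class, $classMap);
    // At most one file check per class, instead of one per include-path directory.
    if ($file !== null && is_file($file)) {
        require_once $file;
    }
});

var_dump(resolve_class_file('Person', $classMap) !== null);  // known class
var_dump(resolve_class_file('Unknown', $classMap));          // not in the map
```

The trade-off is that the map must be kept in sync with the files on disk, whereas a naming rule needs no bookkeeping.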