In any language, functions are the most basic building block. What are the characteristics of PHP functions? How is a function call implemented? How do PHP functions perform, and what advice follows for using them? This article answers these questions by analyzing the underlying principles together with actual performance tests, so that understanding the implementation helps us write better PHP programs. Some common PHP functions are also introduced.
Classification of PHP functions
In PHP, functions fall into two main categories: user functions (user-defined) and internal functions (built-in). The former are the functions and methods that users define in their own programs; the latter are the library functions that PHP itself provides (such as sprintf, array_push, and so on). Users can also write library functions of their own as extensions, which is described later. User functions can be further divided into functions and methods (class methods). This article analyzes and tests these three kinds of functions separately.
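The classification above can be checked from PHP code itself through the Reflection API. The following is a minimal sketch; the function greet is a hypothetical name introduced only for this illustration.

```php
<?php
// Hypothetical user-defined function, used only for this illustration.
function greet($name)
{
    return sprintf('Hello, %s', $name);
}

// ReflectionFunction reports whether a function is internal or user-defined.
$builtin = new ReflectionFunction('sprintf');
$user    = new ReflectionFunction('greet');

$builtinIsInternal = $builtin->isInternal();
$userIsUserDefined = $user->isUserDefined();

var_dump($builtinIsInternal, $userIsUserDefined);
```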
Implementation of PHP functions
How does a PHP function ultimately execute, and what is the process like?
To answer this question, let's take a look at the process through which the PHP code executes.
PHP follows a typical dynamic-language execution process: once it receives a piece of code, it runs lexical analysis, parsing, and other stages to translate the source program into a sequence of instructions (opcodes), and the Zend virtual machine then executes those instructions in order to complete the operation. PHP itself is implemented in C, so what is ultimately called is also a C function; in fact, we can think of PHP as a piece of software developed in C.
It follows that a function in PHP is also translated into opcodes for execution, and each function call actually executes one or more instructions.
Zend describes each function with the following data structures:

    typedef union _zend_function {
        zend_uchar type;    /* MUST be the first element of this struct! */

        struct {
            zend_uchar type;    /* never used */
            char *function_name;
            zend_class_entry *scope;
            zend_uint fn_flags;
            union _zend_function *prototype;
            zend_uint num_args;
            zend_uint required_num_args;
            zend_arg_info *arg_info;
            zend_bool pass_rest_by_reference;
            unsigned char return_reference;
        } common;

        zend_op_array op_array;
        zend_internal_function internal_function;
    } zend_function;

    typedef struct _zend_function_state {
        HashTable *function_symbol_table;
        zend_function *function;
        void *reserved[ZEND_MAX_RESERVED_RESOURCES];
    } zend_function_state;

Here type identifies the kind of function: user function, built-in function, or overloaded function. The common member holds the basic information about the function, including its name, parameter information, and function flags (ordinary function, static method, abstract method, and so on).
Built-in functions
A built-in function is essentially a real C function. After compilation, each built-in function is expanded into a function named zif_xxxx; for example, the familiar count corresponds to zif_count underneath. When Zend executes a call and finds that the target is a built-in function, it simply forwards the call.
Zend provides a series of APIs for built-in functions to call, covering parameter fetching, array manipulation, memory allocation, and so on. A built-in function obtains its parameters through zend_parse_parameters; for parameters such as arrays and strings, Zend makes only a shallow copy, so this is very efficient. It is fair to say that a PHP built-in function is almost as efficient as the corresponding C function, with the forwarding call as the only extra cost.
Built-in functions are loaded dynamically into PHP from shared objects (.so files), and users can write their own shared objects as well; these are what we usually call extensions. Zend provides a range of APIs for extensions to use.
User functions
Compared with built-in functions, user-defined functions written in PHP are executed and implemented on completely different principles. As mentioned earlier, PHP code is translated into opcodes for execution, and user functions are no exception: each function corresponds to a set of opcodes, and this set of instructions is stored in its zend_function. Calling a user function therefore ultimately means executing the corresponding set of opcodes.
Preserving local variables and implementing recursion: we know that function recursion is carried out with a stack, and PHP uses a similar approach. Zend assigns each PHP function an active symbol table (active_sym_table) that records the state of all local variables in the current function. All symbol tables are maintained as a stack: each time a function is called, a new symbol table is allocated and pushed onto the stack, and when the call ends, the current symbol table is popped off. This is how state is preserved and recursion is made possible.
Zend optimizes the maintenance of this stack. A static array of length N is pre-allocated to simulate the stack. This technique of simulating a dynamic data structure with a static array is one we often use in our own programs as well, and it avoids allocating and freeing memory on every call. At the end of a function call, Zend simply clears the data in the symbol table at the top of the stack.
Because the static array has length N, once the function call depth exceeds N the program does not overflow the stack; instead, Zend falls back to allocating and destroying symbol tables, which degrades performance considerably. In Zend, the current value of N is 32. Therefore, when writing PHP programs, it is best to keep the function call depth at 32 or less. Of course, in a web application the depth of the call hierarchy is something the application itself can keep in check.
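The effect of per-call symbol tables can be observed from PHP code: every invocation of a recursive function sees its own local variables. The sketch below is only an illustration of that behavior, not of Zend's internals; the function name factorial and the variable names are chosen for this example.

```php
<?php
function factorial($n)
{
    // $local belongs to this invocation's own symbol table.
    $local = $n;
    if ($n <= 1) {
        return 1;
    }
    // The nested call gets a fresh symbol table of its own.
    $rest = factorial($n - 1);
    // After the nested call returns, $local still holds this call's value.
    return $local * $rest;
}

$result = factorial(5);
echo $result, "\n";
```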
Parameter passing: unlike built-in functions, which call zend_parse_parameters to fetch their parameters, a user function fetches its parameters through instructions. A function with several parameters has a corresponding number of instructions, and each one is implemented as an ordinary variable assignment. The analysis above shows that user functions perform relatively poorly compared with built-in functions, because the symbol table stack must be maintained and the execution of each instruction is itself a C function call; a specific comparison is given later. Therefore, if PHP already provides a built-in function for a task, try not to write your own function to re-implement it.
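As a rough way to see the gap described above, one can time a hand-written PHP function against the built-in that does the same job. This is a minimal sketch, assuming a hypothetical user function my_strrev; the measured times vary by machine and PHP version, so no particular numbers are claimed.

```php
<?php
// Hypothetical hand-written equivalent of the built-in strrev().
function my_strrev($s)
{
    $out = '';
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $out .= $s[$i];
    }
    return $out;
}

$input = str_repeat('abcdefghij', 1000);

// Time the user function.
$start = microtime(true);
$a = my_strrev($input);
$userTime = microtime(true) - $start;

// Time the built-in function.
$start = microtime(true);
$b = strrev($input);
$builtinTime = microtime(true) - $start;

printf("user: %.6fs, built-in: %.6fs\n", $userTime, $builtinTime);
```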
Class method
A class method is executed in the same way as a user function: it is also translated into opcodes that are executed in sequence. Zend implements classes with the data structure zend_class_entry, which holds the basic information related to the class. This entry is already prepared when the PHP code is compiled.
The common member of zend_function has a field called scope, which points to the zend_class_entry of the class that the method belongs to. The object-oriented implementation of PHP is not covered in more detail here; a separate article will be devoted to it. As far as functions are concerned, methods are implemented on exactly the same principle as functions, so in theory their performance is similar. A detailed performance comparison follows later.
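For reference, the three call forms compared in this article look as follows. This is only an illustrative sketch; the names add_plain, Calculator, add, and addStatic are hypothetical.

```php
<?php
// Ordinary user function.
function add_plain($a, $b)
{
    return $a + $b;
}

class Calculator
{
    // Instance (class member) method.
    public function add($a, $b)
    {
        return $a + $b;
    }

    // Static method.
    public static function addStatic($a, $b)
    {
        return $a + $b;
    }
}

$calc = new Calculator();
$viaFunction = add_plain(2, 3);
$viaMethod   = $calc->add(2, 3);
$viaStatic   = Calculator::addStatic(2, 3);
```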
Implementation and introduction of common PHP functions
- count
count is a function we use often; it returns the length of an array.
What is the complexity of count? A common claim is that count traverses the entire array to work out the number of elements, so its complexity is O(n). Is that really the case?
Going back to the implementation of count, the source shows that for an array the call path is zif_count -> php_count_recursive -> zend_hash_num_elements, and all zend_hash_num_elements does is return ht->nNumOfElements. This is clearly an O(1) operation rather than O(n). In fact, an array in PHP is a hash table underneath, and Zend keeps a dedicated field, nNumOfElements, that records the current number of elements, so an ordinary count simply returns this value. We can conclude that count has O(1) complexity, independent of the size of the array.
How does count behave for variables that are not arrays? It returns 0 for a variable that is not set, and 1 for an int, double, string, and so on. Note that this describes older PHP versions: since PHP 7.2 calling count on such a value raises a warning, and PHP 8 throws a TypeError.
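A short sketch of count on arrays of different sizes follows. The array contents are made up for the example; the second argument COUNT_RECURSIVE is the documented mode that does walk nested arrays, in contrast to the default O(1) path described above.

```php
<?php
$small  = range(1, 10);
$large  = range(1, 100000);
$nested = array('a' => array(1, 2), 'b' => array(3));

// Default mode: returns the stored element count, regardless of size.
$smallCount = count($small);
$largeCount = count($large);
$topLevel   = count($nested);

// COUNT_RECURSIVE counts the nested arrays and their elements as well.
$recursive = count($nested, COUNT_RECURSIVE);
```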
- strlen
strlen returns the length of a string. How is it implemented?
We all know that in C, strlen is an O(n) function that walks through the string until it encounters a \0. Is this also true in PHP? The answer is no. A string in PHP is described by a composite structure that includes a pointer to the actual data and the length of the string (similar to string in C++), so strlen returns the stored length directly, which is a constant-time operation.
Also note that when strlen is called on a variable that is not a string, it first casts the variable to a string and then takes the length.
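Because the length is stored alongside the data, an embedded \0 does not end the string. The following small sketch shows this, along with the cast applied to a non-string argument (assuming the default, non-strict type mode); the variable names are illustrative.

```php
<?php
// The NUL byte in the middle is counted like any other byte.
$binary    = "ab\0cd";
$binaryLen = strlen($binary);

// A non-string argument is converted to a string first: 12345 becomes "12345".
$numberLen = strlen(12345);
```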
- isset and array_key_exists
The most common use of these two is to determine whether a key exists in an array, although the former can also be used to determine whether a variable has been set. isset is not a real function but a language construct, so it is much more efficient than the latter. It is recommended as a replacement for array_key_exists, with one caveat: isset returns false when the key exists but its value is null.
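The two checks agree in most cases and differ only when a key holds null. A minimal sketch, using a made-up $config array:

```php
<?php
$config = array('host' => 'localhost', 'port' => null);

$issetHost    = isset($config['host']);               // key present, non-null value
$issetPort    = isset($config['port']);               // key present, but value is null
$existsPort   = array_key_exists('port', $config);    // checks only for the key
$issetMissing = isset($config['user']);               // key absent
```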
- array_push and array[]
Both append an element to the end of an array. One difference is that the former can push several elements at once. The biggest difference is that one is a function and the other is a language construct, so the latter is more efficient. If you only need to append a single element, array[] is recommended.
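Both forms side by side, as a small sketch with illustrative values:

```php
<?php
$a = array(1);

// Language construct: appends exactly one element.
$a[] = 2;

// Function call: can append several elements at once.
array_push($a, 3, 4);

// array_push returns the new number of elements in the array.
$newLength = array_push($a, 5);
```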
- rand and mt_rand
Both generate random numbers. The former uses the rand of the standard libc. The latter uses the Mersenne Twister, a generator with well-known characteristics, and can produce random values about four times faster than the rand() provided by libc. If performance requirements are high, consider using mt_rand in place of the former.
As we all know, rand produces pseudo-random numbers, and in C the seed has to be set explicitly with srand. In PHP, however, rand calls srand for you by default, so you generally do not need to call it yourself.
Note that if you do need to seed the generator in special cases, the calls must be paired: srand goes with rand, and mt_srand goes with mt_rand. Do not mix them, otherwise the seed has no effect.
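The pairing of seed and generator can be sketched as follows: seeding with mt_srand makes the mt_rand sequence repeatable. The seed value 42 is arbitrary and chosen only for this example.

```php
<?php
// Seed, then draw three values.
mt_srand(42);
$first = array(mt_rand(), mt_rand(), mt_rand());

// Re-seeding with the same value replays the same sequence.
mt_srand(42);
$second = array(mt_rand(), mt_rand(), mt_rand());

// mt_rand also accepts an inclusive range.
$bounded = mt_rand(1, 6);
```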
- sort and usort
Both are used for sorting; the difference is that usort lets you specify the sort strategy through a comparison callback, similar to qsort in C and sort in C++.
Both are implemented with a standard quicksort. Unless there is a special requirement, simply call these functions provided by PHP for sorting needs; there is no need to re-implement them yourself, which would be much less efficient. The reason is given in the earlier analysis of user functions versus built-in functions.
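A minimal sketch of both calls follows. The data and the length-based comparison are made up for the example, and the anonymous function assumes PHP 5.3 or later.

```php
<?php
// sort: ascending order with the default comparison.
$numbers = array(5, 3, 10, 1);
sort($numbers);

// usort: the callback defines the ordering, here shortest string first.
$words = array('pear', 'fig', 'banana');
usort($words, function ($x, $y) {
    return strlen($x) - strlen($y);
});
```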
- urlencode and rawurlencode
Both are used for URL encoding: every non-alphanumeric character in the string except -_. is replaced with a percent sign (%) followed by a two-digit hexadecimal number. The only difference between the two is how they treat spaces: urlencode encodes a space as +, while rawurlencode encodes it as %20.
In general, apart from search engines, the usual strategy is to encode spaces as %20, so the latter is used in most cases. Note that the encode and decode functions must be used in matching pairs.
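The difference in space handling, and the need to pair each encoder with its own decoder, can be sketched like this. The input string is made up for the example.

```php
<?php
$raw = 'php functions & more';

$plusEncoded = urlencode($raw);      // spaces become +
$pctEncoded  = rawurlencode($raw);   // spaces become %20

// Each decoder reverses its own encoder.
$roundTripPlus = urldecode($plusEncoded);
$roundTripPct  = rawurldecode($pctEncoded);

// Mismatched pair: rawurldecode does not turn + back into a space.
$mismatched = rawurldecode($plusEncoded);
```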
- The strcmp family
This family includes strcmp, strncmp, strcasecmp, and strncasecmp, which provide the same functionality as the C functions of the same names. There are differences, however: because a PHP string is allowed to contain \0, the underlying comparison uses the memcmp family rather than strcmp, which is theoretically faster.
In addition, because PHP can obtain the length of a string directly, the length is checked first, which makes the comparison much more efficient in many cases.
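A small sketch of the four functions, including a comparison that continues past an embedded \0. The strings are illustrative; only the sign of the strcmp result is relied on, since the exact value differs between PHP versions.

```php
<?php
// The strings differ only after the NUL byte; PHP still compares that part.
$cmpBinary = strcmp("abc\0def", "abc\0xyz");

$cmpEqual  = strcmp('abc', 'abc');              // identical strings
$cmpNoCase = strcasecmp('HELLO', 'hello');      // case-insensitive comparison
$cmpPrefix = strncmp('function', 'funny', 3);   // only the first 3 bytes are compared
```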
- is_int and is_numeric
These two functions are similar in purpose but not identical, so pay attention to their differences when using them.
is_int: determines whether the type of a variable is integer. A PHP variable carries a field that describes its type, so checking the type directly is a strictly O(1) operation.
is_numeric: determines whether a variable is a number or a numeric string. In other words, besides returning true for numeric variables, it also returns true for string variables of a form such as "1234" or "1e4". In that case the string is traversed in order to decide.
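The difference can be seen with a few sample values (chosen for this example):

```php
<?php
$intIsInt        = is_int(1234);          // integer type
$stringIsInt     = is_int('1234');        // a string, even if it looks like a number
$stringIsNumeric = is_numeric('1234');    // numeric string
$expIsNumeric    = is_numeric('1e4');     // exponent notation also counts
$mixedIsNumeric  = is_numeric('12abc');   // trailing letters make it non-numeric
$floatIsNumeric  = is_numeric(12.5);      // numbers of any numeric type
```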
Summary and Suggestions
From the analysis of how functions are implemented and from the performance tests, we draw the following conclusions:
- Function calls in PHP are relatively expensive.
- Function information is stored in one large hash table, and every call looks the function up by name in that table, so the length of the function name has some effect on performance.
- Returning a reference from a function has no practical benefit.
- Built-in PHP functions perform much better than user functions, especially for string operations.
- Class methods, ordinary functions, and static methods are almost identical in efficiency; there is no significant difference.
- Apart from the overhead of the call itself (as measured with an empty function), a built-in function performs about the same as a C function with the same functionality.
- All parameter passing is a reference-counted shallow copy, at very low cost.
- The number of functions defined has an almost negligible effect on performance.
Therefore, here are some suggestions for using PHP functions:
- If a task can be done with a built-in function, use it instead of writing your own PHP function.
- If a feature has high performance requirements, consider implementing it as an extension.
- PHP function calls are expensive, so do not over-encapsulate. If a piece of functionality is called a great many times and takes only one or two lines of code to implement, it is better not to wrap it in a function.
- Do not indulge in design patterns. As described above, excessive encapsulation degrades performance, and the trade-off between the two needs to be weighed. PHP has its own characteristics; do not imitate others blindly or copy the Java model too closely.
- Do not nest function calls too deeply, and use recursion with caution.
- Pseudo-functions (language constructs) perform well and should be preferred when the functionality is the same. For example, use isset in place of array_key_exists.
- Returning a reference from a function makes little sense and has no practical effect, so it is best avoided.
- Class member methods are no less efficient than ordinary functions, so there is no need to worry about a performance loss. Consider using static methods more often, as they are more readable and safer.
- Unless there is a special need, pass parameters by value rather than by reference. Of course, passing by reference can be considered if the parameter is a large array that needs to be modified.
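The last suggestion can be sketched as follows: passing by value leaves the caller's array untouched, while passing by reference lets the function modify it in place. The function names are hypothetical and exist only for this illustration.

```php
<?php
// By value and read-only: the caller's array is never modified.
function sum_values(array $items)
{
    return array_sum($items);
}

// By value with a write: the function works on its own copy.
function append_copy(array $items, $item)
{
    $items[] = $item;
    return $items;
}

// By reference: the caller's array is modified in place.
function append_in_place(array &$items, $item)
{
    $items[] = $item;
}

$data  = array(1, 2, 3);
$total = sum_values($data);
$copy  = append_copy($data, 4);
$afterCopy = $data;            // snapshot: $data is unchanged by append_copy
append_in_place($data, 4);
```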