This article is the third part of a series by Baidu engineers on the implementation principles and performance analysis of PHP functions. It describes how a number of common PHP functions are implemented, then closes with a summary and suggestions.
Common PHP Functions
Count
count is a function we use often; it returns the number of elements in an array.
What is the complexity of count? A common claim is that count traverses the entire array to find the number of elements, so its complexity is O(n). Is that true? Looking at the implementation, the call path for count on an array is zif_count -> php_count_recursive -> zend_hash_num_elements, and zend_hash_num_elements simply does return ht->nNumOfElements. This is an O(1) operation, not an O(n) one. Underneath, a PHP array is a hash table, and Zend's hash table structure has a field, nNumOfElements, that records the current number of elements, so an ordinary count just returns that value. We can conclude that count is O(1) and does not depend on the size of the array.
How does count behave for non-array variables? For an unset variable it returns 0, and for variables such as int, double, and string it returns 1. (This is the PHP 5 behavior analyzed here; PHP 7.2 and later emit a warning for such arguments, and PHP 8 throws a TypeError.)
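As a small illustration of the point above, the sketch below calls count on arrays of very different sizes and on a nested array. The variable names are made up for this example. The COUNT_RECURSIVE mode is shown only for contrast: unlike the plain call, it has to visit nested arrays.

```php
<?php
// Arrays of very different sizes: count() reads the stored element
// counter in both cases instead of walking the elements.
$small = range(1, 10);
$large = range(1, 100000);

$smallCount = count($small);
$largeCount = count($large);

// A nested array: the default mode counts only the top level,
// COUNT_RECURSIVE also counts the elements of the inner arrays.
$nested = ['a' => [1, 2, 3], 'b' => [4, 5]];
$topLevel  = count($nested);                   // 2 outer elements
$recursive = count($nested, COUNT_RECURSIVE);  // 2 outer + 5 inner = 7

echo $smallCount, ' ', $largeCount, ' ', $topLevel, ' ', $recursive, PHP_EOL;
```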
Strlen
strlen returns the length of a string. How is it implemented? We all know that in C strlen is an O(n) function: it walks the string until it meets \0 and then returns the length. Is the same true in PHP? The answer is no. A PHP string is described by a composite structure that holds both a pointer to the data and the string length (similar to string in C++), so strlen returns the stored length directly, which is a constant-time operation. In addition, if strlen is called on a non-string variable, the variable is first converted to a string and its length is then computed.
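One visible consequence of storing the length alongside the data is that an embedded \0 does not end a PHP string. The following sketch shows this, together with the string conversion of a non-string argument. It assumes the default, non-strict typing mode, and the variable names are illustrative only.

```php
<?php
// The length is stored with the string, so the embedded NUL byte is
// just another character and does not terminate the string.
$withNul = "abc\0def";
$lenWithNul = strlen($withNul);   // 7 bytes, including the NUL

// In the default (non-strict) typing mode an integer argument is
// converted to the string "12345" before its length is taken.
$lenOfInt = strlen(12345);        // 5

echo $lenWithNul, ' ', $lenOfInt, PHP_EOL;
```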
Isset and array_key_exists
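The most common use of these two is to check whether a key exists in an array, although isset can also check whether a variable has been set. As mentioned earlier, isset is a language construct rather than a real function, so it is considerably more efficient than array_key_exists. Using it in place of array_key_exists is recommended, with one caveat: isset returns false when the key exists but its value is null, so the two are interchangeable only when null values do not need to be detected.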
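Before swapping one for the other, it helps to see where the two disagree. In the sketch below (the $config array and its keys are hypothetical), a key whose value is null is reported as present by array_key_exists but as not set by isset.

```php
<?php
// Hypothetical configuration array; 'port' exists but holds null.
$config = ['host' => 'localhost', 'port' => null];

$issetHost    = isset($config['host']);             // key present, non-null value
$issetPort    = isset($config['port']);             // key present, but value is null
$existsPort   = array_key_exists('port', $config);  // only checks that the key is present
$issetMissing = isset($config['user']);             // key absent

var_dump($issetHost, $issetPort, $existsPort, $issetMissing);
```

If null is never a meaningful value in the array, isset is the cheaper check; otherwise array_key_exists is still needed.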
Array_push and array []
Both append elements to the end of an array. The former can push several elements in one call, but the main difference is that array_push is a function while array[] is a language construct, which makes the latter more efficient. For an ordinary single-element append, array[] is recommended.
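The two forms side by side, as a minimal sketch with made-up variable names:

```php
<?php
// array_push() is a function call and can append several values at once.
$viaFunction = [1, 2];
array_push($viaFunction, 3, 4);

// The [] syntax is a language construct and appends one value per statement.
$viaSyntax = [1, 2];
$viaSyntax[] = 3;
$viaSyntax[] = 4;

var_dump($viaFunction === $viaSyntax);
```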
Rand and mt_rand
Both generate random numbers. rand uses the standard libc rand, while mt_rand uses the Mersenne Twister, whose known characteristics make it a good random number generator and which produces values on average four times faster than the rand() provided by libc. If performance matters, use mt_rand instead of rand. As we all know, rand generates pseudo-random numbers, and in C you must call srand explicitly to set the seed. In PHP, rand calls srand for you once by default, so an explicit call is usually unnecessary. If you do need to seed explicitly, use the matching pair: srand goes with rand, and mt_srand goes with mt_rand. They must not be mixed, otherwise the seed has no effect.
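The pairing rule can be checked with a short sketch: seeding with mt_srand and then drawing with mt_rand reproduces the same sequence when the same seed is used again. The seed value 42 and the variable names are arbitrary choices for this example.

```php
<?php
// Seed with the function that matches the generator: mt_srand() for mt_rand().
mt_srand(42);
$first = [mt_rand(), mt_rand(), mt_rand()];

// Re-seeding with the same value restarts the same pseudo-random sequence.
mt_srand(42);
$second = [mt_rand(), mt_rand(), mt_rand()];

$reproducible = ($first === $second);

// With two arguments, mt_rand() returns a value in the inclusive range.
$dieRoll = mt_rand(1, 6);

var_dump($reproducible, $dieRoll);
```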
Sort and usort
Both are used for sorting. The difference is that usort lets you specify the comparison policy, similar to qsort in C and sort in C++. Both use a standard quicksort internally. Unless you have special requirements, call the sorting functions PHP provides rather than implementing your own, which would be much slower. For the reason, see the earlier analysis and comparison of user functions and built-in functions.
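A brief sketch of the two calls follows. The $users records are hypothetical data for the example, and the comparison callback follows the same negative / zero / positive convention as a qsort comparator in C.

```php
<?php
// sort() orders plain values with the built-in comparison.
$numbers = [5, 3, 8, 1];
sort($numbers);

// Hypothetical records that need a custom ordering.
$users = [
    ['name' => 'bob',   'age' => 30],
    ['name' => 'alice', 'age' => 25],
    ['name' => 'carol', 'age' => 35],
];

// usort() takes a user-supplied comparison callback.
usort($users, function ($a, $b) {
    if ($a['age'] === $b['age']) {
        return 0;
    }
    return ($a['age'] < $b['age']) ? -1 : 1;
});

// Collect the names in their new order.
$namesByAge = array_column($users, 'name');

print_r($numbers);
print_r($namesByAge);
```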
Urlencode and rawurlencode
Both are used for URL encoding. Every non-alphanumeric character in the string except -_. is replaced with a percent sign (%) followed by two hexadecimal digits. The main difference between the two is the space: urlencode encodes it as +, while rawurlencode encodes it as %20. In general, apart from search engines, our policy is to encode spaces as %20, so the latter is used more often. Note that the encode and decode functions must be used in matching pairs.
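The difference in how the space is handled, and the reason for keeping encode and decode paired, can be seen in a short sketch (the sample string is arbitrary):

```php
<?php
$text = 'a b&c-_.';

$plusStyle    = urlencode($text);     // space becomes "+"
$percentStyle = rawurlencode($text);  // space becomes "%20"

// Matching pairs restore the original string.
$roundTripPlus    = urldecode($plusStyle);
$roundTripPercent = rawurldecode($percentStyle);

// Mismatched pair: rawurldecode() leaves "+" untouched, so the space is lost.
$mismatched = rawurldecode($plusStyle);

echo $plusStyle, PHP_EOL, $percentStyle, PHP_EOL, $mismatched, PHP_EOL;
```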
Strcmp Functions
These functions include strcmp, strncmp, strcasecmp, and strncasecmp, and they do the same job as their C counterparts. There are differences, though. Because PHP strings may contain \0, the underlying implementation uses the memcmp family rather than strcmp, which is theoretically faster. In addition, because PHP knows the string length directly, the length is checked first, which in many cases makes the comparison much more efficient.
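To illustrate the \0 handling, the sketch below compares two strings that differ only after an embedded NUL byte. C's strcmp would stop at the NUL and report them equal, while the PHP function compares the full contents. Only the sign of the result is relied on here, since the exact non-zero value can differ between PHP versions.

```php
<?php
// The strings are identical up to and including the NUL byte.
$a = "abc\0x";
$b = "abc\0y";

// PHP compares all five bytes, so the difference after the NUL is seen.
$cmp = strcmp($a, $b);

// The other members of the family behave like their C namesakes.
$sameIgnoringCase = strcasecmp('HELLO', 'hello');    // case-insensitive comparison
$samePrefix       = strncmp('abcdef', 'abcxyz', 3);  // only the first 3 bytes

var_dump($cmp < 0, $sameIgnoringCase, $samePrefix);
```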
Is_int and is_numeric
These two functions are similar but not identical, so pay attention to the difference when using them.
is_int: determines whether a variable's type is integer. PHP variables carry a field that indicates their type, so the check is made directly against that field and is a strict O(1) operation.
is_numeric: determines whether a variable is a number or a numeric string. That is, it returns true for integer variables, and it also returns true for string variables such as "1234" and "1e4". In the string case it has to traverse the string to decide.
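The following sketch lists a few inputs where the two functions agree and disagree. The array keys are just labels for this example.

```php
<?php
$results = [
    'is_int(1234)'       => is_int(1234),       // integer type
    'is_int("1234")'     => is_int("1234"),     // a string, so not an int
    'is_numeric(1234)'   => is_numeric(1234),   // integer type
    'is_numeric("1234")' => is_numeric("1234"), // numeric string
    'is_numeric("1e4")'  => is_numeric("1e4"),  // exponent notation also counts
    'is_numeric("12ab")' => is_numeric("12ab"), // trailing letters: not numeric
];

var_dump($results);
```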
Summary and Suggestions
Summary:
From the analysis of how functions are implemented and from performance tests, we draw the following conclusions:
1. PHP function calls have relatively high overhead.
2. Function information is stored in one large hash table, and every call looks the function name up in it, so the length of the function name also affects performance.
3. Returning a reference from a function has no practical benefit.
4. Built-in PHP functions perform much better than user functions, especially for string operations.
5. Class methods, ordinary functions, and static methods are almost equally efficient; there is no big difference.
6. Setting aside the overhead of the call itself, built-in functions perform about as well as C functions that do the same thing.
7. All parameter passing uses reference-counted shallow copies, so the cost is very small.
8. The number of functions has a negligible effect on performance.
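Conclusions such as these are easy to check on your own machine. The sketch below is a minimal timing harness, not the benchmark the authors used: bench and my_strrev are hypothetical helpers written for this example, and the iteration count and input size are arbitrary. It compares the built-in strrev with a user-land reimplementation; absolute timings will vary by PHP version and hardware, so no expected numbers are given.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return elapsed seconds.
function bench(callable $fn, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Hypothetical user-land version of the built-in strrev().
function my_strrev($s)
{
    $out = '';
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $out .= $s[$i];
    }
    return $out;
}

$input = str_repeat('abcdefghij', 100);

$builtinTime = bench(function () use ($input) { strrev($input); }, 1000);
$userTime    = bench(function () use ($input) { my_strrev($input); }, 1000);

printf("built-in: %.6fs, user-land: %.6fs\n", $builtinTime, $userTime);
```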
Suggestions:
Based on this, we have the following suggestions for using PHP functions:
1. If a task can be done with a built-in function, use it rather than writing your own PHP function.
2. If a feature has high performance requirements, consider implementing it as an extension.
3. PHP function calls are costly, so do not over-encapsulate. If a piece of logic is called a great many times and takes only one or two lines of code, we recommend not wrapping it in a function.
4. Do not become overly attached to design patterns. As described in the previous article, excessive encapsulation degrades performance, so weigh the trade-off between the two. PHP has its own characteristics, and imitating Java too closely is not advisable.
5. Do not nest function calls too deeply, and be cautious with recursion.
6. Pseudo-functions (language constructs) are fast, so prefer them when they provide the same functionality; for example, use isset in place of array_key_exists.
7. Returning a reference from a function serves no practical purpose, so we recommend not using it.
8. Class member methods are no less efficient than ordinary functions, so there is no need to worry about a performance penalty. We recommend considering static methods, which are better for readability and safety.
9. Unless there is a special requirement, we recommend passing parameters by value rather than by reference. Of course, if the parameter is a large array that needs to be modified, passing by reference is worth considering.
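The last suggestion can be made concrete with a small sketch. The function names appendByValue and appendByReference are invented for this example. Passing by value leaves the caller's array untouched, while passing by reference lets the function modify it in place.

```php
<?php
// By value: the function works on its own copy of the array.
function appendByValue(array $items)
{
    $items[] = 'x';
    return count($items);
}

// By reference: the function modifies the caller's array directly.
function appendByReference(array &$items)
{
    $items[] = 'x';
}

$data = ['a', 'b'];

$countInside = appendByValue($data);  // the copy grew to 3 elements
$afterValue  = $data;                 // the caller still has 2 elements

appendByReference($data);
$afterReference = $data;              // the caller's array now has 3 elements

print_r($afterValue);
print_r($afterReference);
```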