Baidu engineers on the implementation principles and performance analysis of PHP functions (Part 3)


This article, the third in a series by Baidu engineers on the implementation and performance analysis of PHP functions, explains how several common PHP functions are implemented and ends with a summary and recommendations.

Implementation and introduction of common PHP functions


count

count is a function we use often; it returns the length of an array.

What is the complexity of count? A common claim is that count traverses the entire array to obtain the number of elements, so its complexity is O(n). Is that really the case? Looking at the implementation of count in the source code, the call path for counting an array is zif_count -> php_count_recursive -> zend_hash_num_elements, and zend_hash_num_elements simply does return ht->nNumOfElements. This is clearly an O(1) operation, not O(n). Underneath, a PHP array is a hash table, and Zend's hash table has a dedicated field, nNumOfElements, that records the current number of elements, so an ordinary count simply returns this value. We conclude that count has O(1) complexity, independent of the size of the array.

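What does count do for a variable that is not an array? In the PHP versions this article was written for, it returns 0 for an unset variable and 1 for an int, double, string, and so on. (Since PHP 8.0, passing such a value to count throws a TypeError instead.)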
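As a rough illustration of why count does not depend on the array size, the sketch below keeps a running element counter next to the stored data, in the same spirit as the nNumOfElements field. The class TinyTable and its methods are hypothetical names made up for this example; they are not part of PHP or the Zend engine.

```php
<?php
// Hypothetical illustration: a container that updates its element count on
// every insert and removal, so reading the size never walks the data.
final class TinyTable implements Countable
{
    private $data = [];
    private $numOfElements = 0; // plays the role of nNumOfElements

    public function set(string $key, $value): void
    {
        if (!array_key_exists($key, $this->data)) {
            $this->numOfElements++; // only a new key changes the size
        }
        $this->data[$key] = $value;
    }

    public function remove(string $key): void
    {
        if (array_key_exists($key, $this->data)) {
            unset($this->data[$key]);
            $this->numOfElements--;
        }
    }

    public function count(): int
    {
        return $this->numOfElements; // O(1): return the stored counter
    }
}

$table = new TinyTable();
$table->set('a', 1);
$table->set('b', 2);
$table->set('a', 3); // overwrite, size unchanged
$table->remove('b');

$tableSize = count($table);       // count() calls TinyTable::count(), giving 1
$plainSize = count([10, 20, 30]); // 3, read from the array's own counter
```

The counter is adjusted only when a key is added or removed, which is what makes the size query constant time.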


strlen

strlen returns the length of a string. How is it implemented? We all know that in C, strlen is an O(n) function that walks the string sequentially until it encounters \0 and then returns the length. Is this also the case in PHP? The answer is no. A PHP string is described by a composite structure that holds a pointer to the actual data together with the string's length (similar to string in C++), so strlen returns the length directly, which is a constant-time operation. Note also that for a variable that is not a string, strlen first casts the variable to a string and then takes its length.
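A small sketch of the two points above: the stored length means an embedded \0 does not end the string, and a non-string value is measured after conversion to a string. The variable names are only for illustration, and the casts are written out explicitly to make the conversion visible.

```php
<?php
// A PHP string carries its own length, so an embedded NUL byte does not end it.
$withNul = "abc\0def";
$lengthWithNul = strlen($withNul);      // 7: the NUL byte counts as one byte

// Non-string values are converted to a string before the length is taken.
$lengthOfInt = strlen((string) 12345);  // 5: the length of "12345"
$lengthOfFloat = strlen((string) 1.50); // 3: the float 1.5 becomes "1.5"
```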

isset and array_key_exists

The most common use of these two is to determine whether a key exists in an array, although isset can also be used to determine whether a variable has been set. As mentioned earlier, isset is not a real function but a language construct, so it is much more efficient than array_key_exists, and it is recommended as a replacement. Keep in mind that isset returns false when the key exists but its value is null, so the two differ in that case.
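The two checks are not always interchangeable. A short sketch, using a hypothetical $config array, of where they agree and where they differ:

```php
<?php
$config = ['host' => 'localhost', 'port' => null];

$issetHost = isset($config['host']);                // true
$issetPort = isset($config['port']);                // false: the value is null
$existsPort = array_key_exists('port', $config);    // true: the key is present
$issetMissing = isset($config['user']);             // false, and no notice is raised
$existsMissing = array_key_exists('user', $config); // false
```

When null is a meaningful value in the array, array_key_exists is the check that answers "is the key there".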

array_push and $array[]

Both append elements to the end of an array. The former can push several elements at a time, but the biggest difference is that one is a function and the other is a language construct, so the latter is more efficient. For an ordinary single-element append, $array[] is recommended.
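For reference, the two forms side by side; the $queue variable is just an example name:

```php
<?php
$queue = [];

$queue[] = 'a';                            // language construct: append one element
array_push($queue, 'b');                   // function call: same effect for one element
$newLength = array_push($queue, 'c', 'd'); // appends several, returns the new length

// $queue is now ['a', 'b', 'c', 'd'] and $newLength is 4
```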

rand and mt_rand

Both generate random numbers. The former uses the rand of the standard libc, while the latter uses the Mersenne Twister as its random number generator, which can produce random values about four times faster than the rand() provided by libc. If performance requirements are high, consider replacing rand with mt_rand. As we all know, rand produces pseudo-random numbers, and in C you need to specify the seed explicitly with srand. In PHP, rand calls srand for you by default, so in general you do not need to seed it yourself. If you do need to seed in special cases, the calls must be paired: srand goes with rand, and mt_srand goes with mt_rand. They must not be mixed, otherwise the seeding has no effect. (Since PHP 7.1, rand and srand are aliases of mt_rand and mt_srand, so this distinction applies to older versions.)
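A minimal sketch of paired seeding, assuming you want a reproducible sequence (for example in a test). The helper name seededSequence is hypothetical; the point is only that mt_srand is paired with mt_rand.

```php
<?php
// Draw $n numbers from mt_rand after seeding it with mt_srand.
function seededSequence(int $seed, int $n): array
{
    mt_srand($seed); // seed the Mersenne Twister generator
    $values = [];
    for ($i = 0; $i < $n; $i++) {
        $values[] = mt_rand(1, 100);
    }
    return $values;
}

$first = seededSequence(42, 5);
$second = seededSequence(42, 5);
$sameSequence = ($first === $second); // the same seed reproduces the same sequence
```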

sort and usort

Both are used for sorting; the difference is that usort lets you specify the sort strategy through a comparison function, similar to qsort in C and sort in C++. Both are implemented with a standard quicksort. When you need sorting, unless there are special circumstances, simply call these functions provided by PHP; there is no need to implement the sort yourself, which would be much less efficient. The reason is explained in the earlier comparison of user functions and built-in functions.
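A brief example of the two calls. The data and the length-based comparison are arbitrary choices for illustration:

```php
<?php
$numbers = [5, 3, 8, 1];
sort($numbers); // built-in ordering: [1, 3, 5, 8]

$words = ['pear', 'fig', 'banana', 'kiwis'];
// usort takes a user comparison function, much like C's qsort.
usort($words, function (string $a, string $b): int {
    return strlen($a) <=> strlen($b); // order by length, shortest first
});
// $words is now ['fig', 'pear', 'kiwis', 'banana']
```

Both functions sort the array in place and reindex it, so the sorted result is read from the original variable.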

urlencode and rawurlencode

Both are used for URL encoding. All non-alphanumeric characters in the string except -_. are replaced with a percent sign (%) followed by two hexadecimal digits. The only difference is the space: urlencode encodes it as +, and rawurlencode encodes it as %20. In general, apart from search engines, our strategy is to encode spaces as %20, so the latter is used most of the time. Note that the encode and decode functions must be used in matching pairs.
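The difference in space handling, and why the decoder has to match the encoder, can be seen with a short example. The query string is made up for illustration:

```php
<?php
$query = 'php functions & performance';

$plus = urlencode($query);    // 'php+functions+%26+performance'
$raw  = rawurlencode($query); // 'php%20functions%20%26%20performance'

// Decode with the matching function to get the original back.
$plusRoundTrip = urldecode($plus);
$rawRoundTrip  = rawurldecode($raw);

// Mixing them goes wrong: rawurldecode leaves '+' untouched.
$mixed = rawurldecode($plus); // 'php+functions+&+performance'
```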

The strcmp family of functions

This family includes strcmp, strncmp, strcasecmp, and strncasecmp, which do the same job as the C functions of the same names. There are differences, however. Because a PHP string is allowed to contain \0, the underlying comparison uses the memcmp family rather than strcmp, which is theoretically faster. In addition, because PHP can obtain the string length directly, it checks the length first, which is much more efficient in many cases.
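A few calls that show the behavior described above, including a comparison of strings that contain a NUL byte. The sample strings are arbitrary:

```php
<?php
// PHP strings may contain NUL bytes; the comparison looks past them.
$a = "key\0one";
$b = "key\0two";
$differs = strcmp($a, $b) !== 0; // true: C's strcmp would stop at the NUL and report equality
$ordered = strcmp($a, $b) < 0;   // true: 'o' sorts before 't'

$sameIgnoringCase = strcasecmp('HELLO', 'hello') === 0;    // true
$samePrefix = strncmp('prefix-a', 'prefix-b', 6) === 0;    // true: only the first 6 bytes are compared
$samePrefixNoCase = strncasecmp('Hello', 'hELP', 3) === 0; // true: "hel" matches "hel"
```

The results are compared against 0 rather than a specific value, since only the sign of the return value is meaningful across PHP versions.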

is_int and is_numeric

These two functions are similar but not identical, so take care to distinguish them when using them.

is_int: determines whether a variable's type is integer. A PHP variable has a dedicated field that records its type, so checking that field directly is a strictly O(1) operation.

is_numeric: determines whether a variable is a number or a numeric string. Besides returning true for integer variables, it also returns true for string variables of a form such as "1234" or "1e4". In that case it traverses the string to decide.
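The difference is easiest to see on a handful of inputs. The $checks array below is just a convenient way to collect the results:

```php
<?php
$checks = [
    'is_int(1234)'        => is_int(1234),        // true: the type is integer
    'is_int("1234")'      => is_int('1234'),      // false: the type is string
    'is_numeric(1234)'    => is_numeric(1234),    // true
    'is_numeric("1234")'  => is_numeric('1234'),  // true: numeric string
    'is_numeric("1e4")'   => is_numeric('1e4'),   // true: exponent notation
    'is_numeric("12abc")' => is_numeric('12abc'), // false: trailing non-digits
];
```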

Summary and Suggestions


Based on the analysis of how functions are implemented and on performance tests, we draw the following conclusions:

1. Function calls in PHP have a relatively high overhead.

2. Function information is stored in one large hash table, and every call looks the function up in that table by name, so the length of the function name has some impact on performance.

3. Returning a reference from a function has no practical value.

4. Built-in PHP functions perform much better than user functions, especially for string operations.

5. Class methods, ordinary functions, and static methods are almost equally efficient; there is little difference.

6. Excluding the overhead of an empty function call, built-in functions perform essentially the same as C functions with the same functionality.

7. All parameter passing is a reference-counted shallow copy, which costs very little.

8. The number of functions has an almost negligible impact on performance.
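Conclusions like these can be checked on your own PHP version with a small timing harness. The sketch below is one possible way to do it, not the authors' test setup: timeIt and myLength are hypothetical helpers, the iteration count is arbitrary, and the numbers it prints will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: run $fn $iterations times and return elapsed seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($i);
    }
    return microtime(true) - $start;
}

// Hypothetical user-defined wrapper around a built-in function.
function myLength(string $s): int
{
    return strlen($s);
}

$iterations = 100000;
$direct = timeIt(function (int $i): int {
    return strlen('benchmark');   // built-in called directly
}, $iterations);
$wrapped = timeIt(function (int $i): int {
    return myLength('benchmark'); // one extra user-function call per iteration
}, $iterations);

printf("direct: %.4fs, wrapped: %.4fs\n", $direct, $wrapped);
```

Both measurements include the cost of calling the closure, so only the difference between them reflects the extra user-function call.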


Therefore, here are some suggestions for using PHP functions:

1. If a feature can be implemented with built-in functions, use them rather than writing your own PHP functions.

2. If a feature has high performance requirements, consider implementing it as an extension.

3. PHP function calls are expensive, so do not over-encapsulate. If a piece of functionality is called many times and can be implemented in one or two lines of code, it is better not to wrap it in a function.

4. Do not get carried away with design patterns. As described above, excessive encapsulation degrades performance, and the trade-off between the two needs to be weighed. PHP has its own characteristics; do not blindly imitate the Java model.

5. Functions should not be nested too deeply, and recursion should be used with caution.

6. Pseudo-functions perform very well, so prefer them when they provide the same functionality, for example isset instead of array_key_exists.

7. Returning a reference from a function has little meaning and no practical effect, so it is not recommended.

8. Class member methods are no less efficient than ordinary functions, so there is no need to worry about performance loss. Consider using static methods more, as they are more readable and safer.

9. Unless there is a special requirement, pass parameters by value rather than by reference. Passing by reference can of course be considered when the parameter is a large array that needs to be modified.
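To illustrate the last suggestion, here is a small sketch of the two passing styles. The function names are hypothetical. With pass by value, PHP shares the array until it is written to and only then copies it (copy-on-write), which is why passing by value is cheap when the function only reads.

```php
<?php
// Pass by value: the array is shared until written to, then copied.
function appendByValue(array $items): array
{
    $items[] = 'new'; // the write triggers a copy; the caller's array is untouched
    return $items;
}

// Pass by reference: the caller's array itself is modified.
function appendByReference(array &$items): void
{
    $items[] = 'new';
}

$original = ['a', 'b'];
$copy = appendByValue($original);      // $original is still ['a', 'b']
$afterValueCall = $original;           // snapshot for comparison
appendByReference($original);          // $original is now ['a', 'b', 'new']
```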
