PHP in_array performance problems

Source: Internet
Author: User
Tags ibm developerworks
PHP performance keeps improving. However, if you use it carelessly, you can trip over details of PHP's internal implementation. I ran into one such performance problem a few days ago.


Here is what happened. A colleague reported that an interface took five seconds to return, so we reviewed the code together. The "surprise" was a cache read called inside a loop (about 900 iterations) even though the cache key never changed. We moved that call out of the loop and tested again. The response time dropped to 2 seconds. That is more than twice as fast, but obviously still not acceptable!
The code with the performance problem was small. After ruling out I/O problems, I wrote a piece of test code that quickly reproduced the problem.
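The first fix described above, moving a loop-invariant cache read out of the loop, can be sketched as follows. The function read_cache and the key "id_list" are hypothetical stand-ins for the real cache client, which the article does not show.

```php
<?php
// Hypothetical cache reader standing in for the real cache client.
function read_cache($key, &$calls)
{
    $calls++;                      // count how often the cache is hit
    return array("1", "2", "3");   // made-up payload
}

// Before: the same key is read on every iteration.
$calls = 0;
for ($i = 0; $i < 900; $i++) {
    $ids = read_cache("id_list", $calls);
    // ... use $ids ...
}
$callsInsideLoop = $calls;

// After: the key never changes, so read it once before the loop.
$calls = 0;
$ids = read_cache("id_list", $calls);
for ($i = 0; $i < 900; $i++) {
    // ... use $ids ...
}
$callsHoisted = $calls;

echo "inside loop: {$callsInsideLoop}, hoisted: {$callsHoisted}\n";
```

The counter only makes the difference visible: 900 cache reads in the first version against a single read in the second.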

The code is as follows:

<?php
$y = "1800";
$x = array();
for ($j = 0; $j < 2000; $j++) {
    $x[] = "{$j}";
}

for ($i = 0; $i < 3000; $i++) {
    if (in_array($y, $x)) {
        continue;
    }
}
?>



shell$ time /usr/local/php/bin/php test.php

real 0m1.132s
user 0m1.118s
sys 0m0.015s

Yes, we are using numeric strings, because that is what we get from the cache! That is why the test converts the values to strings on purpose (with integers the problem does not occur; you can verify this yourself). The 3000 iterations took about 1 second, and the tiny sys time already tells us that strace will not give us any useful information.
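For readers who want to verify the string-versus-integer claim themselves, here is a minimal timing sketch. The helper time_in_array is a hypothetical name, not part of PHP, and the numbers it prints depend heavily on the PHP version and machine, so no expected output is given.

```php
<?php
// Hypothetical helper: time $rounds in_array() calls, in seconds.
function time_in_array($needle, array $haystack, $rounds)
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        in_array($needle, $haystack);
    }
    return microtime(true) - $start;
}

// Same values, once as numeric strings and once as integers.
$strings = array();
$ints = array();
for ($j = 0; $j < 2000; $j++) {
    $strings[] = "{$j}";
    $ints[] = $j;
}

$stringTime = time_in_array("1800", $strings, 3000);
$intTime = time_in_array(1800, $ints, 3000);

printf("string haystack:  %.3f s\n", $stringTime);
printf("integer haystack: %.3f s\n", $intTime);
```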

shell$ strace -ttt -o xxx /usr/local/php/bin/php test.php
shell$ less xxx



All we can see is that the gap between these two system calls is very large, but not what the process was doing in between. Fortunately, besides strace there is also ltrace (Linux has other tools as well, such as dtrace and ptrace, which are not discussed in this article).

Reference: strace traces the system calls a process makes and the signals it receives, while ltrace traces the library functions a process calls (via IBM developerWorks).

To eliminate interference, we assign $x directly as array("0", "1", "2", ......) so that a large number of malloc calls does not skew the results. Run:

shell$ ltrace -c /usr/local/php/bin/php test.php





We can see that the library function __strtol_internal is called extremely often and accounts for 94% of the time, which is excessive. I then looked up what __strtol_internal does: it turns out to be an alias of strtol, which, simply put, converts a string into an integer. So we can guess that the PHP engine detects that these are numeric strings and converts them to long integers before comparing them, and that this conversion is what eats the time. Let's run it once more:

shell$ ltrace -e "__strtol_internal" /usr/local/php/bin/php test.php



It is easy to capture a large number of these calls. At this point the problem is found: in_array's loose comparison converts both numeric strings to long integers before comparing them, and I had not realized how much that costs.
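The numeric conversion in loose mode can also be observed from PHP itself, without ltrace. The small sketch below uses made-up values: "1e3" and "1000" are different strings, but both are numeric strings that represent the same number.

```php
<?php
$haystack = array("1000", "2000");

// Loose (default): both operands are numeric strings,
// so they are compared as numbers.
$loose = in_array("1e3", $haystack);

// Strict: type and value must match exactly,
// so the two strings are compared as strings.
$strict = in_array("1e3", $haystack, true);

var_dump($loose);   // bool(true)
var_dump($strict);  // bool(false)
```

A match on "1e3" is rarely what the caller wants, so the loose default affects correctness as well as speed.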



Once we know the crux of the problem, there are many ways to solve it. The simplest is to pass true as the third parameter of in_array, which makes the comparison strict and compares types as well. This stops PHP from being clever about type conversion, and the code runs much faster. The code is as follows:

<?php
$y = "1800";
$x = array();
for ($j = 0; $j < 2000; $j++) {
    $x[] = "{$j}";
}

for ($i = 0; $i < 3000; $i++) {
    if (in_array($y, $x, true)) {
        continue;
    }
}
?>

Run the timing again:

shell$ time /usr/local/php/bin/php test.php

real 0m0.267s
user 0m0.247s
sys 0m0.020s



Several times faster! As you can see, the sys time is almost unchanged. When we run ltrace again, we still assign $x directly to eliminate the interference from malloc calls; in the real application the data comes from the cache, so there is no loop like the one in the sample code that allocates memory.
Execute again:

shell$ ltrace -c /usr/local/php/bin/php test.php



__ctype_tolower_loc now takes the most time! I looked up what the library function __ctype_tolower_loc does: simply put, it is used to convert characters to lowercase. Does this mean that in_array compares strings case-insensitively? In fact, this call has little to do with our in_array. To understand the implementation of in_array thoroughly, we should read the PHP source code. I will stop here; you are welcome to get in touch, and please point out anything I got wrong.
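The question about case sensitivity is easy to check directly. In this short sketch with made-up values, in_array treats strings that differ only in case as different, in both loose and strict mode.

```php
<?php
$haystack = array("ABC", "Def");

$lowerLoose = in_array("abc", $haystack);         // differs only in case
$exactLoose = in_array("ABC", $haystack);         // exact match
$lowerStrict = in_array("abc", $haystack, true);  // strict mode

var_dump($lowerLoose);   // bool(false)
var_dump($exactLoose);   // bool(true)
var_dump($lowerStrict);  // bool(false)
```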

------- 3.08.29 split line ----------

In the evening I browsed the PHP 5.4.10 source code, since I was curious about in_array. At line 3 of ./ext/standard/array.c we can see that it calls the php_search_array function. array_search, right below it, calls the same function, only with a different last parameter! After some tracing, it turns out that for a loose in_array comparison the function that finally does the comparing is zendi_smart_strcmp (a "smart" function indeed) in ./Zend/zend_operators.c. The many integer conversions we captured with ltrace come from is_numeric_string_ex.
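Because array_search goes through the same search routine, it shows the same loose-versus-strict behavior from user code. A small sketch with made-up values:

```php
<?php
$haystack = array("1000", "2000");

// Loose: "1e3" and "1000" are compared as numbers, so key 0 is found.
$looseKey = array_search("1e3", $haystack);

// Strict: the strings differ, so nothing is found.
$strictKey = array_search("1e3", $haystack, true);

var_dump($looseKey);   // int(0)
var_dump($strictKey);  // bool(false)
```

Since a match at key 0 is falsy, the result of array_search should be compared against false with ===.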



The is_numeric_string_ex function is defined in ./Zend/zend_operators.h. After a series of checks and conversions it calls strtol at line 3, the system function mentioned earlier in this article, which converts a string into an integer.
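Since the cost comes from comparing the needle against every element, a different technique from the one used in this article is to avoid the linear scan altogether when the same array is searched many times: flip the array once and test keys with isset. This is only a sketch of that alternative, reusing the test data from above.

```php
<?php
$y = "1800";
$x = array();
for ($j = 0; $j < 2000; $j++) {
    $x[] = "{$j}";
}

// Build the lookup table once: the values become array keys.
$lookup = array_flip($x);

$hits = 0;
for ($i = 0; $i < 3000; $i++) {
    // Hash lookup instead of comparing against up to 2000 elements.
    if (isset($lookup[$y])) {
        $hits++;
    }
}

echo $hits, "\n"; // 3000
```

Note that array keys follow their own rules: a decimal integer string such as "1800" is stored as the integer key 1800, while a string such as "01800" stays a string key and does not match it.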
