Optimization of the intersection of PHP arrays
Suppose we run a mobile phone website where users can filter phones by a number of parameters (operating system, screen resolution, camera resolution, and so on). Because there are many parameters and they differ greatly from phone to phone, the parameter table is usually vertical (one row per parameter) rather than horizontal (one column per parameter). To filter on several parameters at once, we therefore usually fetch the matching results for each parameter separately and then intersect them.
Suppose each parameter matches about 1000 unique results (integer IDs). With that as the premise, we generate some simulated data:
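As a rough sketch of that workflow, assume each parameter lookup has already returned a list of matching phone IDs. The variable names and ID values below are made up for illustration; the point is only that the final answer is the intersection of the per-parameter lists.

```php
<?php
// Hypothetical per-parameter results: each array holds the IDs of the
// phones that match one filter (for example, one query per parameter).
$ids_by_os         = array(3, 7, 12, 25, 31);
$ids_by_resolution = array(7, 12, 18, 31, 40);
$ids_by_camera     = array(2, 7, 31, 40);

// A phone satisfies every filter only if its ID appears in every list.
$matches = array_intersect($ids_by_os, $ids_by_resolution, $ids_by_camera);

// array_intersect preserves the keys of the first array, so reindex.
$matches = array_values($matches);

print_r($matches);
```

Only IDs 7 and 31 appear in all three lists, so those are the phones that pass every filter.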
<?php
// Build a list of 1000 unique random IDs in the range 1..10000.
$rand = function() {
    $result = array();
    for ($i = 0; $i < 1000; null) {
        $value = mt_rand(1, 10000);
        if (!isset($result[$value])) {
            $result[$value] = null;
            $i++;
        }
    }
    return array_keys($result);
};
$param_a = $rand();
$param_b = $rand();
?>
Note: if the test data set is too small, the results may be inconsistent. First, let's look at the performance of PHP's built-in array_intersect:
<?php
$time = microtime(true);
$result = array_intersect($param_a, $param_b);
$time = microtime(true) - $time;
echo "array_intersect: {$time}\n";
?>
Now let's look at the performance of a custom intersect function:
<?php
function intersect()
{
    if (func_num_args() < 2) {
        trigger_error('param error', E_USER_ERROR);
    }
    $args = func_get_args();
    foreach ($args as $arg) {
        if (!is_array($arg)) {
            trigger_error('param error', E_USER_ERROR);
        }
    }
    // Merge-style intersection of two sorted lists: advance whichever
    // side holds the smaller value; keep the value when both sides match.
    $intersect = function ($a, $b) {
        $result = array();
        $length_a = count($a);
        $length_b = count($b);
        for ($i = 0, $j = 0; $i < $length_a && $j < $length_b; null) {
            if ($a[$i] < $b[$j]) {
                $i++;
            } else if ($a[$i] > $b[$j]) {
                $j++;
            } else {
                $result[] = $a[$i];
                $i++;
                $j++;
            }
        }
        return $result;
    };
    // Sort every input once, then fold the arrays together pairwise.
    $result = array_shift($args);
    sort($result);
    foreach ($args as $arg) {
        sort($arg);
        $result = $intersect($result, $arg);
    }
    return $result;
}
$time = microtime(true);
$result = intersect($param_a, $param_b);
$time = microtime(true) - $time;
echo "intersect: {$time}\n";
?>
Intuitively, we would certainly think that a built-in function is faster than a custom function, but in this case the result is exactly the opposite:
array_intersect: 0.023918151855469
intersect: 0.0026049613952637
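Timings are only comparable if both approaches return the same IDs, so it may be worth checking that first. The sketch below assumes each input holds unique integers, as in the benchmark; `sorted_intersect` is an illustrative two-array condensation of the merge-style approach, not the function from the article.

```php
<?php
// Illustrative two-array version of the merge-style intersection.
// Assumes each input contains unique integers.
function sorted_intersect(array $a, array $b)
{
    sort($a);
    sort($b);
    $result = array();
    $i = 0;
    $j = 0;
    $count_a = count($a);
    $count_b = count($b);
    while ($i < $count_a && $j < $count_b) {
        if ($a[$i] < $b[$j]) {
            $i++;
        } elseif ($a[$i] > $b[$j]) {
            $j++;
        } else {
            $result[] = $a[$i];
            $i++;
            $j++;
        }
    }
    return $result;
}

// Two sets of 1000 unique IDs drawn from 1..10000.
$pool = range(1, 10000);
shuffle($pool);
$param_a = array_slice($pool, 0, 1000);
shuffle($pool);
$param_b = array_slice($pool, 0, 1000);

// array_intersect keeps the first array's keys and order, so reindex
// and sort its result before comparing.
$expected = array_values(array_intersect($param_a, $param_b));
sort($expected);

$actual = sorted_intersect($param_a, $param_b);

// Prints bool(true) when both approaches agree on this data.
var_dump($actual === $expected);
```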
Note that array_intersect and intersect are not completely equivalent in functionality, as the following example shows:
<?php
$param_a = array(1, 2, 2);
$param_b = array(1, 2, 3);
var_dump(
    array_intersect($param_a, $param_b),
    intersect($param_a, $param_b)
);
?>
array_intersect: 1, 2, 2
intersect: 1, 2
That is, if the first array argument contains duplicate elements, array_intersect returns every duplicate that satisfies the condition rather than just one. Interested readers can swap the order of the arguments and see how the result changes.
Incidentally, when I first wrote the intersect function, it looked roughly like this:
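For readers who would rather not run it themselves, here is a small sketch of the argument-order effect using only the built-in function. The final `array_unique` step is an assumption about what a caller might want when set semantics are needed; it is not part of the original comparison.

```php
<?php
$param_a = array(1, 2, 2);
$param_b = array(1, 2, 3);

// The first argument holds the duplicate: both copies of 2 are kept,
// together with their original keys (0, 1, 2).
$a_first = array_intersect($param_a, $param_b);

// The first argument has no duplicate: each match appears once.
$b_first = array_intersect($param_b, $param_a);

var_dump($a_first, $b_first);

// Assumption: if set semantics are wanted, de-duplicate and reindex.
$as_set = array_values(array_unique($a_first));
var_dump($as_set);
```

array_intersect walks the first array and keeps each element that also occurs in the others, which is why duplicates survive only when they are in the first argument.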
<?php
function intersect()
{
    if (func_num_args() < 2) {
        trigger_error('param error', E_USER_ERROR);
    }
    $args = func_get_args();
    foreach ($args as $arg) {
        if (!is_array($arg)) {
            trigger_error('param error', E_USER_ERROR);
        }
    }
    $result = array();
    // Merge all arrays, then count how often each value occurs.
    $data = array_count_values(
        call_user_func_array('array_merge', $args)
    );
    // Assumes values are unique within each array, so a value present
    // in every array is counted exactly once per array.
    foreach ($data as $value => $count) {
        if ($count === count($args)) {
            $result[] = $value;
        }
    }
    return $result;
}
?>
This code is more concise, but it has a drawback: because it uses array_merge, memory usage grows when the arrays contain a very large number of elements. Conversely, if the arrays are not very large, this approach is feasible.
Reference: Faster array_intersect
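To get a feel for that cost, the following sketch measures how much memory the merged copy takes. The array sizes are arbitrary, and the exact byte count depends on the PHP version and build, so treat the output as indicative only.

```php
<?php
// Two large ID lists (sizes chosen arbitrarily for illustration).
$param_a = range(1, 100000);
$param_b = range(50000, 150000);

$before = memory_get_usage();

// array_merge builds a brand-new array holding every element of both
// inputs, so the merged copy lives alongside the originals.
$merged = array_merge($param_a, $param_b);

$after = memory_get_usage();

echo "merged elements: ", count($merged), "\n";
echo "extra bytes for the merged copy: ", $after - $before, "\n";
```

The merge-style function avoids this extra copy of all inputs, at the price of sorting each array first.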