Common questions
 
If xhprof_enable is called multiple times, which configuration takes effect?
When xhprof_enable is called more than once in a request, only the settings from the first call take effect. After calling xhprof_disable(), you can call xhprof_enable again to apply new settings.
 
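$i = 0;
function good() {
    global $i;
    $i++;
    if ($i < 2) {
        good();
    }
}
function func() {
    good();
}
$start_time = microtime(true);
// First call: only XHPROF_FLAGS_NO_BUILTINS is requested.
xhprof_enable(XHPROF_FLAGS_NO_BUILTINS);
// Second call: profiling is already enabled, so these flags are ignored.
xhprof_enable(XHPROF_FLAGS_MEMORY + XHPROF_FLAGS_CPU + XHPROF_FLAGS_NO_BUILTINS);
// good() also increments the global $i, so this loop body runs 50 times.
for ($i = 0; $i < 100; $i++) {
    func();
}
good();
$rst = xhprof_disable();
var_dump($rst);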
 
The output reads:
 
array(5) {
  ["good==>good@1"]=>
  array(2) {
    ["ct"]=>
    int(1)
    ["wt"]=>
    int(70)
  }
  ["func==>good"]=>
  array(2) {
    ["ct"]=>
    int(50)
    ["wt"]=>
    int(121)
  }
  ["main()==>func"]=>
  array(2) {
    ["ct"]=>
    int(50)
    ["wt"]=>
    int(135)
  }
  ["main()==>good"]=>
  array(2) {
    ["ct"]=>
    int(1)
    ["wt"]=>
    int(0)
  }
  ["main()"]=>
  array(2) {
    ["ct"]=>
    int(1)
    ["wt"]=>
    int(237)
  }
}
 
As you can see, the output contains no CPU or memory information: the flags passed to the second xhprof_enable call were ignored.
 
What do ct, wt, cpu, mu, and pmu mean in the output?
ct is the number of calls.
wt is the wall-clock time the function took to execute. A timestamp is recorded before the call and another after the function returns, and the difference is reported.
cpu is the CPU time consumed while the function executed. It differs from wt in that CPU time stops accumulating whenever the process yields the CPU. The process's CPU usage is obtained through the getrusage system call.
mu is the memory used by the function. Memory usage is recorded before the call and again after the function returns, and the difference is reported. The figure comes from zend_memory_usage.
pmu is the peak memory used by the function. The figure comes from zend_memory_peak_usage.
 
What does good==>good@1 mean in the output?

==> denotes a call relationship: the caller is on the left and the callee on the right. An @ suffix marks a recursive call, and the number after the @ is the recursion depth.
 
How to set xhprof_enable flags to reduce profiling overhead

xhprof_enable accepts three constants that control whether PHP built-in functions are profiled and which metrics are collected.
The three constants are as follows:
 
XHPROF_FLAGS_NO_BUILTINS
When this constant is set, PHP's built-in functions are not profiled. Built-in functions generally perform well, so there is little reason to pay the cost of profiling them. Setting this flag is recommended.
 
XHPROF_FLAGS_CPU
When this constant is set, the CPU time consumed by the process is recorded. Because CPU time is obtained through the getrusage system call, this is expensive: with the flag enabled, performance drops by roughly half. Enable it only if you specifically need CPU time figures.
 
XHPROF_FLAGS_MEMORY
When this constant is set, memory usage is recorded. The figures are obtained through zend_memory_usage and zend_memory_peak_usage, which are not system calls, so the impact on performance is small. Enable it if you need to analyze memory usage.
 
How the profiling works

How is performance data recorded for each function?
Currently xhprof records performance data for three kinds of operation: loading PHP files, executing PHP functions, and executing eval code. Each of these is handled by a corresponding function in the PHP core. When xhprof_enable is called, those default handlers are replaced with xhprof's own. Let's take a look at the relevant code.
 
static void hp_begin(long level, long xhprof_flags)
{
    if (!hp_globals.enabled)
    {
        int hp_profile_flag = 1;

        hp_globals.enabled = 1;
        hp_globals.xhprof_flags = (uint32) xhprof_flags;

        /* Replace zend_compile with our proxy */
        /* This hook handles loading PHP files. */
        /* First save the Zend engine's default handler in _zend_compile_file. */
        _zend_compile_file = zend_compile_file;
        /* Then assign xhprof's handler to zend_compile_file, so that
           xhprof's code runs every time a PHP file is loaded. */
        zend_compile_file = hp_compile_file;

        /* Replace zend_compile_string with our proxy */
        /* This hook handles the execution of eval code. */
        _zend_compile_string = zend_compile_string;
        zend_compile_string = hp_compile_string;

        /* Init the execute pointer */
        /* This hook handles the execution of functions. */
        _zend_execute_ex = zend_execute_ex;
        zend_execute_ex = hp_execute_ex;
        .........
    }
}
 
Now let's look at how hp_compile_file is implemented:
ZEND_DLEXPORT zend_op_array* hp_compile_file(zend_file_handle *file_handle, int type)
{
    const char *filename;
    char *func;
    int len;
    zend_op_array *ret;
    int hp_profile_flag = 1;

    filename = hp_get_base_filename(file_handle->filename);
    len = sizeof("load") - 1 + strlen(filename) + 3;
    func = (char *) emalloc(len);
    snprintf(func, len, "load::%s", filename);

    /* Record the current performance data (CPU, memory, and so on) before the handler runs. */
    BEGIN_PROFILING(&hp_globals.entries, func, hp_profile_flag);
    /* Call the Zend engine's original handler to load the file. */
    ret = _zend_compile_file(file_handle, type);
    if (hp_globals.entries)
    {
        /* Once the file is loaded, record the performance data again so the difference can be computed. */
        END_PROFILING(&hp_globals.entries, hp_profile_flag);
    }

    efree(func);
    return ret;
}
 
What performance optimizations does xhprof itself make?

To keep timing cheap, xhprof uses inline assembly to read the CPU's timestamp counter (the RDTSC instruction). Time in seconds = timestamp counter value / CPU frequency.
This implementation is why xhprof currently works only on the x86 architecture. In addition, because RDTSC values are not synchronized between CPUs, xhprof binds the process to a single CPU.
If SpeedStep technology is enabled, xhprof's RDTSC-based timer will not work properly. SpeedStep is available on some Intel processors. [Note: Apple desktops and laptops typically ship with SpeedStep enabled by default. To use xhprof, you need to disable SpeedStep.]
 
inline uint64 cycle_timer()
{
    uint32 __a, __d;
    uint64 val;
    asm volatile("rdtsc" : "=a" (__a), "=d" (__d));
    (val) = ((uint64) __a) | (((uint64) __d) << 32);
    return val;
}