Exploring the PHP kernel: variables (1) Zval

Variables are containers for data, and we deal with them constantly, whether they hold a number, an array, a string, an object or something else. They are an indispensable foundation of any language. This is the first article in a series exploring variables in the PHP kernel. It introduces the basics of zval and covers the following:
- Basic structure of Zval
- Ways to inspect a zval: debug_zval_dump and xdebug
- How zval works: COW and related principles
This article was written in a hurry, so mistakes are inevitable.
I. Basic structure of zval
Zval is one of the most important data structures in PHP (the other is the hash table). It holds both the value and the type of a PHP variable. It is a struct with the following basic structure:
struct _zval_struct {
    zvalue_value value;       /* value */
    zend_uint refcount__gc;   /* variable ref count */
    zend_uchar type;          /* active type */
    zend_uchar is_ref__gc;    /* if it is a ref variable */
};
typedef struct _zval_struct zval;
Where:
1. zvalue_value value
The actual value of the variable. Concretely, it is a zvalue_value union:
typedef union _zvalue_value {
    long lval;                /* long value */
    double dval;              /* double value */
    struct {                  /* string */
        char *val;
        int len;
    } str;
    HashTable *ht;            /* hash table value, used for array */
    zend_object_value obj;    /* object */
} zvalue_value;
2. zend_uint refcount__gc
This field is a counter that records how many variables point to this zval. (More precisely, it counts symbols. Every symbol lives in a symbol table, and different scopes use different symbol tables, a point discussed later.) When a variable is created, its refcount is 1. A typical assignment such as $a = $b increases the zval's refcount by 1, while unset decreases it by 1. Before PHP 5.3, GC was implemented by reference counting alone: once a zval's refcount drops to 0, the Zend Engine concludes that no variable points to it any more and releases the memory it occupies. Things are not always that simple, however. As we will see later, reference counting alone cannot collect zvals that are part of a circular reference, even after the variable pointing to them has been unset, and this leads to memory leaks.
3. zend_uchar type
This field indicates the actual type of the variable. Anyone starting out with PHP learns that its variables come in four scalar types (bool, int, float, string), two compound types (array, object) and two special types (resource and NULL). Inside Zend, these types correspond to the following macros (code location: phpsrc/Zend/zend.h):
#define IS_NULL 0
#define IS_LONG 1
#define IS_DOUBLE 2
#define IS_BOOL 3
#define IS_ARRAY 4
#define IS_OBJECT 5
#define IS_STRING 6
#define IS_RESOURCE 7
#define IS_CONSTANT 8
#define IS_CONSTANT_ARRAY 9
#define IS_CALLABLE 10
4. is_ref__gc
This field marks whether the variable is a reference. For ordinary variables it is 0; for reference variables it is 1. It affects how zvals are shared and separated, which we discuss later.
As the names suggest, refcount__gc and is_ref__gc are the two key fields that PHP's GC mechanism relies on. Their values can be inspected with debugging tools such as xdebug.
II. Installing and configuring xdebug
Xdebug is an open-source profiling and debugging tool for PHP. Common tools such as var_dump, echo, print and debug_backtrace are enough for everyday debugging, but for more complex debugging and performance testing xdebug is a great help (other tools such as Xhprof are also excellent).
The basic environment of this article:
The basic steps for installing xdebug are as follows (it is simply an extension compiled from source):
1. Download the source package.
Address: http://www.xdebug.org/docs/install
Version downloaded for this article: xdebug-2.6.tar.gz
2. Extract it.
tar xvzf xdebug-2.6.tar.gz
3. Run phpize in the xdebug directory.
4. Run ./configure.
5. Run make && make install.
This produces the extension file xdebug.so (a zend_extension), located in xdebug/modules.
6. Load the xdebug extension in php.ini.
zend_extension=your-xdebug-path/xdebug.so
7. Add the xdebug configuration.
xdebug.profiler_enable = on
xdebug.default_enable = on
xdebug.trace_output_dir="/tmp/xdebug"
xdebug.trace_output_name = trace.%c.%p
xdebug.profiler_output_dir="/tmp/xdebug"
xdebug.profiler_output_name="cachegrind.out.%s"
The meaning of each option is not covered in detail here; see http://www.xdebug.org/docs/all
You can now see the xdebug extension information (via php -m or phpinfo()):
Now you can print zval information in your scripts with xdebug_debug_zval:
$a = array('test');
$a[] = &$a;
xdebug_debug_zval('a');
III. More zval principles
(Note: this part is mainly based on http://derickrethans.nl/collecting-garbage-phps-take-on-variables.html. Its author, Derick Rethans, is an excellent PHP kernel expert who has given many talks around the world. The slides are available as PDF downloads, and http://derickrethans.nl/talks.html lists each of his talks, many of which are worth studying further.)
As already mentioned, PHP stores variables in zval structures. Here we look at zval in more detail.
1. When a variable is created, a zval is created.
$str = "test zval";
xdebug_debug_zval('str');
Output result:
str: (refcount=1, is_ref=0)='test zval'
When the variable is created with $str = "test zval";, a new symbol (str) is inserted into the symbol table of the current scope. Because it is an ordinary variable, a zval container with refcount = 1 and is_ref = 0 is created for it. In other words, it looks like this:
2. When the variable is assigned to another variable, the zval's refcount increases.
$str = "test zval";
$str2 = $str;
xdebug_debug_zval('str');
xdebug_debug_zval('str2');
Output result:
str: (refcount=2, is_ref=0)='test zval'
str2: (refcount=2, is_ref=0)='test zval'
We can also see that str and str2 have the same zval. This is an optimization in PHP: because str and str2 are both ordinary variables, they point to the same zval instead of a separate zval being allocated for str2, which saves some memory. The relationship between str, str2 and the zval is as follows:
3. When unset is used, the corresponding zval refcount value is reduced.
$str = "test zval";
$str3 = $str2 = $str;
xdebug_debug_zval('str');
unset($str2, $str3);
xdebug_debug_zval('str');
Result:
str: (refcount=3, is_ref=0)='test zval'
str: (refcount=1, is_ref=0)='test zval'
Because unset($str2, $str3) removes str2 and str3 from the symbol table, only str points to the zval afterwards, as shown in the figure:
If unset($str) is then executed, the zval's refcount drops to 0 and the zval is freed from memory. This is, of course, the ideal case.
But things are not always so optimistic.
4. Array variables produce zvals much like ordinary variables do, but with some important differences.
Unlike scalar variables, compound variables such as arrays and objects get a zval container for every element as well as one for the variable itself. For example:
$ar = array(
    'id' => 38,
    'name' => 'shine'
);
xdebug_debug_zval('ar');
The structure of zval is as follows:
ar: (refcount=1, is_ref=0)=array ( 'id' => (refcount=1, is_ref=0)=38, 'name' => (refcount=1, is_ref=0)='shine')
As shown in:
As you can see, creating the variable $ar produces three zval containers (marked in red). For each of them, refcount rises and falls by the same rules as for ordinary variables. For example, let us add another element to the array and assign it the value of $ar['name']:
$ar = array(
    'id' => 38,
    'name' => 'shine'
);
$ar['test'] = $ar['name'];
xdebug_debug_zval('ar');
Then the printed zval is:
ar: (refcount=1, is_ref=0)=array ( 'id' => (refcount=1, is_ref=0)=38, 'name' => (refcount=2, is_ref=0)='shine', 'test' => (refcount=2, is_ref=0)='shine')
As with ordinary variables, the two symbols name and test point to the same zval:
Likewise, when an element is removed from the array, the corresponding symbol is deleted from the symbol table and the refcount of its zval decreases. If that refcount drops to 0, the zval is freed from memory:
$ar = array(
    'id' => 38,
    'name' => 'shine'
);
$ar['test'] = $ar['name'];
unset($ar['test'], $ar['name']);
xdebug_debug_zval('ar');
Output result:
ar: (refcount=1, is_ref=0)=array ('id' => (refcount=1, is_ref=0)=38)
5. References complicate the zval rules.
Once references are involved, things get a little more complicated. For example, add a reference to the array itself as one of its elements:
$a = array('one');
$a[] = &$a;
xdebug_debug_zval('a');
Output result:
a: (refcount=2, is_ref=1)=array ( 0 => (refcount=1, is_ref=0)='one', 1 => (refcount=2, is_ref=1)=...)
In the output above, ... means the element points back to the original array, so this is a circular reference. As shown in the figure:
Now we run unset on $a. This deletes the corresponding symbol from the symbol table and decreases the zval's refcount by 1 (it was 2 before), so the zval should now look like this:
(refcount=1, is_ref=1)=array ( 0 => (refcount=1, is_ref=0)='one', 1 => (refcount=1, is_ref=1)=...)
That is, the structure shown below:
And this is where the trouble starts.
After the unset, no variable points to the zval any more, yet it cannot be collected by the GC (meaning the pure reference-counting GC used before PHP 5.3) because its refcount is still greater than 0. These zvals therefore stay in memory until the request ends (see the SAPI lifecycle). Until then the memory they occupy cannot be reused and is simply wasted. In other words, unreleased memory causes a memory leak.
If such a leak happens only once or a handful of times, it does no real harm, but thousands of leaks are a serious problem. This is especially true of long-running scripts (for example daemons, which run in the background without interruption): because the memory cannot be reclaimed, the system eventually reports that no memory is available.
6. Zval separation (copy on write and change on write)
As described above, during an assignment such as $b = $a, PHP saves space by not allocating a separate zval for each of $a and $b; the two share one zval:
So the question is: how to deal with zval sharing when one of the variables changes?
For such code:
$a = "a simple test";
$b = $a;

echo "before write:".PHP_EOL;
xdebug_debug_zval('a');
xdebug_debug_zval('b');

$b = "thss";
echo "after write:".PHP_EOL;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
The printed result is:
before write:
a: (refcount=2, is_ref=0)='a simple test'
b: (refcount=2, is_ref=0)='a simple test'
after write:
a: (refcount=1, is_ref=0)='a simple test'
b: (refcount=1, is_ref=0)='thss'
At first, the symbols a and b point to the same zval (to save memory). When $b is then changed, Zend checks whether the refcount of the zval that b points to is 1. If it is, only one symbol points to the zval and it is modified in place. Otherwise the zval is shared and must be separated first, so that the two variables can change independently without affecting each other. This mechanism is called COW (copy on write). In many scenarios, COW is a fairly efficient strategy.
What about reference variables?
$a = 'test';
$b = &$a;

echo "before change:".PHP_EOL;
xdebug_debug_zval('a');
xdebug_debug_zval('b');

$b = 12;
echo "after change:".PHP_EOL;
xdebug_debug_zval('a');
xdebug_debug_zval('b');

unset($b);
echo "after unset:".PHP_EOL;
xdebug_debug_zval('a');
xdebug_debug_zval('b');
The output result is:
before change:
a: (refcount=2, is_ref=1)='test'
b: (refcount=2, is_ref=1)='test'
after change:
a: (refcount=2, is_ref=1)=12
b: (refcount=2, is_ref=1)=12
after unset:
a: (refcount=1, is_ref=0)=12
As you can see, when the value of $b is changed, Zend checks the zval's is_ref flag to see whether it is a reference variable. If it is, the zval is modified in place; otherwise the zval separation just described is performed. Because $a and $b are reference variables, changing the shared zval indirectly changes the value of $a as well. After unset($b), the variable $b is deleted from the symbol table.
This also makes one thing clear: unset does not destroy the zval; it only deletes the corresponding symbol from the symbol table. With this in mind, many earlier puzzles about references become easier to understand (we will explore PHP references in the next part).
References:
- laruence's in-depth look at variable references and separation: http://www.laruence.com/2008/09/19/520.html
- The main reference for this article: http://derickrethans.nl/collecting-garbage-phps-take-on-variables.html
- http://blog.csdn.net/phpkernel/article/details/5732784
- http://www.jb51.net/article/50080.htm
- http://www.nowamagic.net/librarys/veda/detail/1442