Garbage collection is a dynamic storage management scheme that automatically frees allocated blocks of memory the program no longer needs; the term refers to this process of automatic reclamation. It lets programmers concentrate on business logic instead of worrying about memory allocation. Garbage collection is a standard feature of today's popular languages, including Python, PHP, Eiffel, C#, and Ruby. Although it is widely used now, the idea is not new. It already existed in the Lisp system developed at MIT in the 1960s, but because the technology was immature it remained attractive mostly on paper until Java arrived in the 1990s and brought it into widespread use.
PHP also implements dynamic memory management at the language level, as described in detail in the previous chapters, which frees developers from tedious manual memory management. Building on that support, PHP provides a language-level garbage collection mechanism, so programmers do not have to be overly concerned with memory allocation.
Before PHP 5.3, PHP had only a simple form of garbage collection based on reference counting: when a variable's reference count dropped to 0, PHP destroyed the variable in memory, although this was not called garbage collection at the time. In addition, PHP releases everything a process or thread has allocated at the end of its life cycle, which meant that early PHP did not need to worry much about memory leaks. As PHP matured, its developer base grew, and the range of workloads it carried widened, a more complete garbage collection mechanism was introduced in PHP 5.3. The new mechanism solves the memory leak caused by circular references, which plain reference counting cannot handle. It uses the synchronous algorithm from the paper "Concurrent Cycle Collection in Reference Counted Systems". We will not repeat the algorithm here; the official PHP documentation has an illustrated description under "Collecting Cycles".
As mentioned earlier, reference counting is PHP's main memory management technique. Garbage collection was introduced to break the circular references that reference counting cannot release, and so prevent the memory leaks they cause. The mechanism is built on top of PHP's dynamic memory management. To introduce it, PHP 5.3 made some changes to the basic structure used to store variables, as follows:
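To make the circular-reference problem concrete, here is a minimal sketch of plain reference counting. The demo_node type and every demo_* function are hypothetical stand-ins invented for this illustration, not PHP's zval or any Zend API, and "freeing" is modelled by clearing an alive flag so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted container; not PHP's zval. */
typedef struct demo_node {
    int refcount;
    int alive;                 /* 1 until the node is "freed" */
    struct demo_node *child;   /* at most one outgoing reference */
} demo_node;

void demo_init(demo_node *n) {
    n->refcount = 1;           /* the creator holds the first reference */
    n->alive = 1;
    n->child = NULL;
}

/* Drop one reference; destroy the node when the count reaches 0. */
void demo_release(demo_node *n) {
    if (n == NULL) return;
    assert(n->refcount > 0);
    if (--n->refcount == 0) {
        demo_node *child = n->child;
        n->child = NULL;
        n->alive = 0;          /* stands in for free() */
        demo_release(child);   /* the dead node no longer references its child */
    }
}

void demo_set_child(demo_node *parent, demo_node *child) {
    child->refcount++;
    parent->child = child;
}

/* Returns how many nodes are still alive after the program drops its own references. */
int demo_leaked_after_release(int make_cycle) {
    demo_node a, b;
    demo_init(&a);
    demo_init(&b);
    demo_set_child(&a, &b);            /* a -> b */
    if (make_cycle) {
        demo_set_child(&b, &a);        /* b -> a closes the cycle */
    }
    demo_release(&a);
    demo_release(&b);
    return a.alive + b.alive;
}
```

Without the cycle, both nodes are destroyed. With the cycle, each node keeps the other's count at 1, so neither is ever destroyed; this is the leak that plain reference counting cannot detect.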
    struct _zval_struct {
        /* Variable information */
        zvalue_value value;     /* value */
        zend_uint refcount__gc;
        zend_uchar type;        /* active type */
        zend_uchar is_ref__gc;
    };
Compared with versions before PHP 5.3, the reference count field refcount and the reference flag field is_ref have had __gc appended to their names for the new garbage collection mechanism. Heavy use of macros is a distinctive feature of the PHP source style. These macros act as an interface layer that hides the underlying implementation. Take the ALLOC_ZVAL macro: before PHP 5.3 it called PHP's memory allocation function emalloc directly, and the amount of memory allocated was determined by the size of the variable type. After garbage collection was introduced, ALLOC_ZVAL allocates the new garbage collection unit structure instead: every allocation has the same size, that of the zval_gc_info structure, and once the memory is allocated the structure's garbage collection field is initialized. The code is as follows:
    /* The following macroses override macroses from zend_alloc.h */
    #undef ALLOC_ZVAL
    #define ALLOC_ZVAL(z) \
        do { \
            (z) = (zval*)emalloc(sizeof(zval_gc_info)); \
            GC_ZVAL_INIT(z); \
        } while (0)
zend.h includes zend_gc.h at line 749 (#include "zend_gc.h"), and the macro defined there overrides the ALLOC_ZVAL macro from zend_alloc.h, which is included at line 237. The key change is in the size and layout of the allocated memory: garbage collection data is added to what used to be a plain allocation, and everything is wrapped in the zval_gc_info structure:
    typedef struct _zval_gc_info {
        zval z;
        union {
            gc_root_buffer *buffered;
            struct _zval_gc_info *next;
        } u;
    } zval_gc_info;
Every variable stored in a zval container is now allocated as a zval_gc_info structure. Because the zval is the first member, it starts at the same address as the allocated memory, so a zval_gc_info pointer can be cast to a zval pointer and used as a zval. After the zval field comes a union, u, which contains the buffered field, a pointer to a gc_root_buffer, and the next field, a pointer to a zval_gc_info. The first points to the node's entry in the garbage collector's root buffer; the second points to the next node in a zval_gc_info list. Whether the node is cached as a root or chained into a list, it can be represented here. After allocating memory, ALLOC_ZVAL calls GC_ZVAL_INIT to initialize the zval_gc_info that stands in for the zval, setting the buffered field of member u to NULL. This field holds a value only while the zval is in the garbage collection buffer; otherwise it is always NULL. Because all variables in PHP are stored as zvals, replacing zval with zval_gc_info integrates garbage collection into the existing system.
Garbage collection is enabled by default in PHP 5.3, and it can be disabled through the configuration entry zend.enable_gc. The php.ini file does not contain this entry by default; to disable the feature, add zend.enable_gc=0 or zend.enable_gc=off to php.ini. Besides editing php.ini, you can turn the mechanism on and off at run time by calling gc_enable() and gc_disable(), which has the same effect as changing the configuration entry. PHP also provides gc_collect_cycles(), which forces a collection of cycles even when the root buffer is not yet full. The PHP source contains the operations and fields related to enabling garbage collection. The following code is in zend.c:
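The "first member" layout trick described above can be sketched in isolation. The demo_* types below are simplified, hypothetical stand-ins (the real zval and gc_root_buffer have more fields), and demo_alloc_zval only imitates what the text says ALLOC_ZVAL and GC_ZVAL_INIT do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for zval. */
typedef struct demo_zval {
    long value;
    unsigned int refcount;
} demo_zval;

typedef struct demo_root demo_root;     /* opaque root-buffer entry */

typedef struct demo_zval_gc_info {
    demo_zval z;                        /* must stay the first member */
    union {
        demo_root *buffered;            /* entry in the root buffer, or NULL */
        struct demo_zval_gc_info *next; /* link in a list of nodes */
    } u;
} demo_zval_gc_info;

/* Allocate the larger structure but hand out a pointer to the embedded zval. */
demo_zval *demo_alloc_zval(void) {
    demo_zval_gc_info *info = malloc(sizeof *info);
    if (info == NULL) return NULL;
    info->z.value = 0;
    info->z.refcount = 1;
    info->u.buffered = NULL;            /* not in the root buffer yet */
    return &info->z;
}

/* Recover the GC bookkeeping from a plain zval pointer. */
demo_zval_gc_info *demo_gc_info(demo_zval *z) {
    return (demo_zval_gc_info *)z;      /* valid because z is at offset 0 */
}

int demo_is_buffered(demo_zval *z) {
    return demo_gc_info(z)->u.buffered != NULL;
}
```

Code that only knows about the zval keeps working unchanged, while the collector can reach its own bookkeeping through the same pointer.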
    static ZEND_INI_MH(OnUpdateGCEnabled) /* {{{ */
    {
        OnUpdateBool(entry, new_value, new_value_length, mh_arg1, mh_arg2, mh_arg3, stage);
        if (GC_G(gc_enabled)) {
            gc_init(TSRMLS_C);
        }
        return SUCCESS;
    }
    /* }}} */
    ZEND_INI_BEGIN()
        ZEND_INI_ENTRY("error_reporting", NULL, ZEND_INI_ALL, OnUpdateErrorReporting)
        STD_ZEND_INI_BOOLEAN("zend.enable_gc", "1", ZEND_INI_ALL, OnUpdateGCEnabled, gc_enabled, zend_gc_globals, gc_globals)
    #ifdef ZEND_MULTIBYTE
        STD_ZEND_INI_BOOLEAN("detect_unicode", "1", ZEND_INI_ALL, OnUpdateBool, detect_unicode, zend_compiler_globals, compiler_globals)
    #endif
    ZEND_INI_END()
The handler for zend.enable_gc is ZEND_INI_MH(OnUpdateGCEnabled). If garbage collection is enabled, that is, GC_G(gc_enabled) is true, it calls gc_init to initialize the mechanism. gc_init is at line 121 of Zend/zend_gc.c. It checks whether garbage collection is enabled and, if so, initializes the whole mechanism by calling malloc directly to allocate space for 10,000 gc_root_buffer entries for the root buffer. The value 10000 is hard-coded as the macro GC_ROOT_BUFFER_MAX_ENTRIES; changing it means modifying the source and recompiling PHP. After allocating the memory, gc_init calls gc_reset to reset the global variables the mechanism uses, for example setting the number of GC runs (gc_runs) and the number of garbage items collected (collected) to 0, and pointing the prev and next pointers of the doubly linked list's head node at the head itself. Besides these, the garbage collection globals include other useful fields, some of which are shown here:
    typedef struct _zend_gc_globals {
        zend_bool gc_enabled;          /* whether garbage collection is enabled */
        zend_bool gc_active;           /* whether a collection is in progress */
        gc_root_buffer *buf;           /* preallocated array of buffers, 10000 entries by default */
        gc_root_buffer roots;          /* head of the list of possible roots of cycles */
        gc_root_buffer *unused;        /* list of unused buffers */
        gc_root_buffer *first_unused;  /* pointer to the first unused buffer */
        gc_root_buffer *last_unused;   /* pointer to the last unused buffer, marking the end */
        zval_gc_info *zval_to_free;    /* temporary list of zvals to free */
        zval_gc_info *free_list;       /* temporary variable: start of the list to free */
        zval_gc_info *next_to_free;    /* temporary variable: next zval to free */
        zend_uint gc_runs;             /* number of GC runs */
        zend_uint collected;           /* number of garbage items collected */
        /* ... remaining fields omitted ... */
    } zend_gc_globals;
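The initialization described for gc_init and gc_reset can be sketched roughly as follows. This is a simplified illustration under assumptions: the demo_* names are hypothetical, the globals struct keeps only a few of the fields listed above, and the real functions operate on per-thread globals through the GC_G macro.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_ROOT_BUFFER_MAX_ENTRIES 10000  /* same value the text cites */

typedef struct demo_root {
    struct demo_root *prev;
    struct demo_root *next;
} demo_root;

typedef struct demo_gc_globals {
    int gc_enabled;
    demo_root *buf;           /* preallocated array of entries */
    demo_root roots;          /* sentinel head of the circular list */
    demo_root *unused;        /* recycled entries */
    demo_root *first_unused;  /* next never-used entry in buf */
    demo_root *last_unused;   /* one past the end of buf */
    unsigned int gc_runs;
    unsigned int collected;
} demo_gc_globals;

/* Reset counters and empty the list: the head's prev and next point at itself. */
void demo_gc_reset(demo_gc_globals *g) {
    g->gc_runs = 0;
    g->collected = 0;
    g->roots.next = &g->roots;
    g->roots.prev = &g->roots;
    g->unused = NULL;
    if (g->buf != NULL) {
        g->first_unused = g->buf;
        g->last_unused = g->buf + DEMO_ROOT_BUFFER_MAX_ENTRIES;
    } else {
        g->first_unused = NULL;
        g->last_unused = NULL;
    }
}

/* Allocate the buffer once, and only when collection is enabled.
 * Returns 0 on success, -1 if the allocation fails. */
int demo_gc_init(demo_gc_globals *g) {
    if (g->buf == NULL && g->gc_enabled) {
        g->buf = malloc(sizeof(demo_root) * DEMO_ROOT_BUFFER_MAX_ENTRIES);
        if (g->buf == NULL) return -1;
        demo_gc_reset(g);
    }
    return 0;
}
```

Using a self-referencing head node means insertion and removal never need a special case for an empty list.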
When unset is applied to a variable (which may only decrement its reference count rather than free its memory), PHP deletes the entry for the variable name from the current symbol table's hash table. Once that is done, a destructor is called on the item removed from the symbol table: temporary variables go through zval_dtor, ordinary variables through zval_ptr_dtor.
We will not find unset among PHP's functions, because it is a language construct. Its corresponding opcode is ZEND_UNSET, and the related implementation can be found in Zend/zend_vm_execute.h.
zval_ptr_dtor is not a function either, just a macro that looks like one. In Zend/zend_variables.h this macro expands to the function _zval_ptr_dtor. The function's code, at line 424 of Zend/zend_execute_API.c, is as follows:
    ZEND_API void _zval_ptr_dtor(zval **zval_ptr ZEND_FILE_LINE_DC) /* {{{ */
    {
    #if DEBUG_ZEND>=2
        printf("Reducing refcount for %x (%x): %d->%d\n", *zval_ptr, zval_ptr, Z_REFCOUNT_PP(zval_ptr), Z_REFCOUNT_PP(zval_ptr) - 1);
    #endif
        Z_DELREF_PP(zval_ptr);
        if (Z_REFCOUNT_PP(zval_ptr) == 0) {
            TSRMLS_FETCH();
            if (*zval_ptr != &EG(uninitialized_zval)) {
                GC_REMOVE_ZVAL_FROM_BUFFER(*zval_ptr);
                zval_dtor(*zval_ptr);
                efree_rel(*zval_ptr);
            }
        } else {
            TSRMLS_FETCH();
            if (Z_REFCOUNT_PP(zval_ptr) == 1) {
                Z_UNSET_ISREF_PP(zval_ptr);
            }
            GC_ZVAL_CHECK_POSSIBLE_ROOT(*zval_ptr);
        }
    }
    /* }}} */
The code shows how a zval is destroyed. Depending on the reference count, it does one of two things:
If the variable's reference count is 1, so that it is 0 after the decrement, the variable is destroyed immediately. If the variable is in the root buffer, it is removed from the buffer first.
If the variable's reference count is greater than 1, so that it is still greater than 0 after the decrement, the variable is put into the garbage list as a possible root. If the count is exactly 1 after the decrement, its is_ref flag is also cleared.
The operation that puts a variable into the garbage list is GC_ZVAL_CHECK_POSSIBLE_ROOT. It is also a macro, and its corresponding function is gc_zval_check_possible_root, which only considers arrays and objects for garbage collection. For array and object variables it calls gc_zval_possible_root.
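The two branches can be condensed into a small decision function. This is only a sketch of the control flow in _zval_ptr_dtor: demo_zval, demo_action, and demo_ptr_dtor are hypothetical names, and the actual freeing and root-buffer handling are reduced to a return value.

```c
#include <assert.h>

/* Hypothetical stand-in carrying only the two fields the logic inspects. */
typedef struct demo_zval {
    unsigned int refcount;
    unsigned char is_ref;
} demo_zval;

typedef enum {
    DEMO_FREED,          /* count reached 0: destroy immediately */
    DEMO_POSSIBLE_ROOT   /* count still positive: candidate for a cycle check */
} demo_action;

demo_action demo_ptr_dtor(demo_zval *z) {
    assert(z->refcount > 0);
    z->refcount--;
    if (z->refcount == 0) {
        /* the real code also removes the zval from the root buffer and frees it */
        return DEMO_FREED;
    }
    if (z->refcount == 1) {
        z->is_ref = 0;   /* with one holder left, the reference flag is cleared */
    }
    return DEMO_POSSIBLE_ROOT;
}
```

A decrement that does not reach 0 is exactly the situation in which a cycle may have been left behind, which is why the value becomes a possible root.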
    ZEND_API void gc_zval_possible_root(zval *zv TSRMLS_DC)
    {
        if (UNEXPECTED(GC_G(free_list) != NULL &&
                       GC_ZVAL_ADDRESS(zv) != NULL &&
                       GC_ZVAL_GET_COLOR(zv) == GC_BLACK) &&
                       (GC_ZVAL_ADDRESS(zv) < GC_G(buf) ||
                        GC_ZVAL_ADDRESS(zv) >= GC_G(last_unused))) {
            /* The given zval is a garbage that is going to be deleted by
             * currently running GC */
            return;
        }
        if (zv->type == IS_OBJECT) {
            GC_ZOBJ_CHECK_POSSIBLE_ROOT(zv);
            return;
        }
        GC_BENCH_INC(zval_possible_root);
        if (GC_ZVAL_GET_COLOR(zv) != GC_PURPLE) {
            GC_ZVAL_SET_PURPLE(zv);
            if (!GC_ZVAL_ADDRESS(zv)) {
                gc_root_buffer *newRoot = GC_G(unused);
                if (newRoot) {
                    GC_G(unused) = newRoot->prev;
                } else if (GC_G(first_unused) != GC_G(last_unused)) {
                    newRoot = GC_G(first_unused);
                    GC_G(first_unused)++;
                } else {
                    if (!GC_G(gc_enabled)) {
                        GC_ZVAL_SET_BLACK(zv);
                        return;
                    }
                    zv->refcount__gc++;
                    gc_collect_cycles(TSRMLS_C);
                    zv->refcount__gc--;
                    newRoot = GC_G(unused);
                    if (!newRoot) {
                        return;
                    }
                    GC_ZVAL_SET_PURPLE(zv);
                    GC_G(unused) = newRoot->prev;
                }
                newRoot->next = GC_G(roots).next;
                newRoot->prev = &GC_G(roots);
                GC_G(roots).next->prev = newRoot;
                GC_G(roots).next = newRoot;
                GC_ZVAL_SET_ADDRESS(zv, newRoot);
                newRoot->handle = 0;
                newRoot->u.pz = zv;
                GC_BENCH_INC(zval_buffered);
                GC_BENCH_INC(root_buf_length);
                GC_BENCH_PEAK(root_buf_peak, root_buf_length);
            }
        }
    }
As mentioned earlier, gc_zval_check_possible_root only considers arrays and objects. Inside gc_zval_possible_root, object variables are handed to the GC_ZOBJ_CHECK_POSSIBLE_ROOT macro. For the other collectable variables, the procedure is as follows:
Check whether the zval is garbage that the currently running collection is already about to free; if so, return immediately, which avoids redundant work. Object nodes are then handed off and the function returns without performing the later steps. Next, check whether the node is already marked purple; if it is, it is not added to the buffer again, which ensures a node is added to the buffer only once.
Mark the node purple, indicating that it has been added to the buffer and does not need to be added again.
Find a slot for the new node; if the buffer is full, run a garbage collection first.
Add the new node to the doubly linked list of roots in the buffer.
In gc_zval_possible_root, when the buffer is full, the program calls gc_collect_cycles to perform a garbage collection.
Its most critical steps are:
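The slot lookup and list insertion in the steps above can be sketched with a deliberately tiny buffer. All demo_* names are hypothetical, the buffer holds 2 slots instead of 10000, and where PHP would run a collection on a full buffer this sketch simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 2   /* tiny on purpose; the buffer in the text holds 10000 */

typedef struct demo_root {
    struct demo_root *prev;
    struct demo_root *next;
    void *pz;                      /* the value this root refers to */
} demo_root;

typedef struct demo_buffer {
    demo_root buf[DEMO_SLOTS];
    demo_root roots;               /* sentinel head of the circular list */
    demo_root *unused;             /* recycled slots, chained through prev */
    demo_root *first_unused;       /* next never-used slot */
    demo_root *last_unused;        /* one past the last slot */
} demo_buffer;

void demo_buffer_init(demo_buffer *b) {
    b->roots.next = b->roots.prev = &b->roots;
    b->roots.pz = NULL;
    b->unused = NULL;
    b->first_unused = b->buf;
    b->last_unused = b->buf + DEMO_SLOTS;
}

/* Returns the new root, or NULL when the buffer is full. */
demo_root *demo_add_root(demo_buffer *b, void *value) {
    demo_root *r = b->unused;
    if (r != NULL) {
        b->unused = r->prev;                 /* reuse a recycled slot */
    } else if (b->first_unused != b->last_unused) {
        r = b->first_unused++;               /* take a never-used slot */
    } else {
        return NULL;                         /* full: PHP would collect here */
    }
    r->next = b->roots.next;                 /* link in right after the head */
    r->prev = &b->roots;
    b->roots.next->prev = r;
    b->roots.next = r;
    r->pz = value;
    return r;
}

/* Unlink a root and put its slot on the recycled list. */
void demo_remove_root(demo_buffer *b, demo_root *r) {
    r->prev->next = r->next;
    r->next->prev = r->prev;
    r->prev = b->unused;
    b->unused = r;
}
```

Recycled slots are preferred over never-used ones, which matches the order of the checks in gc_zval_possible_root.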
Line 628: this is step B of the algorithm in the official documentation. The algorithm runs a depth-first search from every possible root and decrements the reference count of each variable container it finds by 1. To make sure the same container is not decremented twice, each container that has been decremented is marked grey.
Line 629: this is step C. The algorithm again runs a depth-first search from each root node and checks the reference count of each variable container. If the reference count is 0, the container is marked white. If it is greater than 0, a depth-first search from that point reverts the earlier decrement (adds 1 back to each reference count) and marks the containers black again.
Line 630: this is the final step, D. The algorithm walks the root buffer, removing the zval roots from it, and checks which variable containers were marked white in the previous step. Every container marked white is freed. In [gc_collect_cycles() -> gc_collect_roots() -> zval_collect_white()] we can see that white nodes are added to the global zval_to_free list, which is used in the subsequent operations.
PHP's garbage collector uses four colors as marks during execution:
GC_WHITE: white, the node is garbage
GC_PURPLE: purple, the node has been placed in the buffer
GC_GREY: grey, the node's refcount has already been decremented by 1
GC_BLACK: black, the default color, a normal node
The related marker definitions and operations are as follows:
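Steps B, C, and D can be tried out on a small graph with the sketch below. It follows the general shape of the synchronous cycle collection algorithm as described here, not the actual Zend source: the demo_* names are hypothetical, each node has at most two children, and white nodes are only counted rather than freed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 2

typedef enum { DEMO_BLACK, DEMO_WHITE, DEMO_GREY, DEMO_PURPLE } demo_color;

typedef struct demo_node {
    int refcount;
    demo_color color;
    struct demo_node *children[DEMO_MAX_CHILDREN];  /* NULL marks an empty slot */
} demo_node;

/* Step B: subtract references held inside the subgraph, marking visited nodes grey. */
static void demo_mark_grey(demo_node *n) {
    if (n->color == DEMO_GREY) return;       /* already decremented once */
    n->color = DEMO_GREY;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->children[i] != NULL) {
            n->children[i]->refcount--;
            demo_mark_grey(n->children[i]);
        }
    }
}

/* Revert step B for everything reachable from an externally referenced node. */
static void demo_scan_black(demo_node *n) {
    n->color = DEMO_BLACK;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->children[i] != NULL) {
            n->children[i]->refcount++;
            if (n->children[i]->color != DEMO_BLACK) demo_scan_black(n->children[i]);
        }
    }
}

/* Step C: a count of 0 means garbage (white); otherwise restore and mark black. */
static void demo_scan(demo_node *n) {
    if (n->color != DEMO_GREY) return;
    if (n->refcount > 0) {
        demo_scan_black(n);
        return;
    }
    n->color = DEMO_WHITE;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->children[i] != NULL) demo_scan(n->children[i]);
    }
}

/* Step D: count white nodes; a real collector would queue them for freeing. */
static int demo_collect_white(demo_node *n) {
    if (n->color != DEMO_WHITE) return 0;
    n->color = DEMO_BLACK;                   /* prevents counting a node twice */
    int count = 1;
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->children[i] != NULL) count += demo_collect_white(n->children[i]);
    }
    return count;
}

/* Runs the three phases over the roots and returns how many nodes are garbage. */
int demo_collect_cycles(demo_node **roots, int nroots) {
    int garbage = 0;
    for (int i = 0; i < nroots; i++) demo_mark_grey(roots[i]);
    for (int i = 0; i < nroots; i++) demo_scan(roots[i]);
    for (int i = 0; i < nroots; i++) garbage += demo_collect_white(roots[i]);
    return garbage;
}
```

A two-node cycle with no outside references is reported as garbage, while the same cycle with one extra outside reference is left alone and has its reference counts restored.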
    #define GC_COLOR  0x03
    #define GC_BLACK  0x00
    #define GC_WHITE  0x01
    #define GC_GREY   0x02
    #define GC_PURPLE 0x03
    #define GC_ADDRESS(v) \
        ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_ADDRESS(v, a) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & GC_COLOR) | ((zend_uintptr_t)(a))))
    #define GC_GET_COLOR(v) \
        (((zend_uintptr_t)(v)) & GC_COLOR)
    #define GC_SET_COLOR(v, c) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & ~GC_COLOR) | (c)))
    #define GC_SET_BLACK(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_PURPLE(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) | GC_PURPLE))
Marking state with individual bits, as above, is used frequently in the PHP source, for example in memory management; it is an efficient, space-saving approach. In database design, however, fields should probably not be encoded this way; a more intuitive and readable representation is preferable.
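The idea behind these macros, storing a two-bit color in the low bits of an aligned pointer, can be sketched with plain functions. The demo_* names are hypothetical, and the sketch assumes the usual situation where a pointer to a 4-byte-aligned object has its two lowest bits clear when converted to uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_COLOR_MASK ((uintptr_t)0x03)

enum { DEMO_BLACK = 0x00, DEMO_WHITE = 0x01, DEMO_GREY = 0x02, DEMO_PURPLE = 0x03 };

/* Any type aligned to at least 4 bytes leaves the low two pointer bits free. */
typedef struct demo_root { int32_t handle; } demo_root;

/* Combine an address and a color into one word. */
uintptr_t demo_pack(demo_root *addr, unsigned color) {
    assert(((uintptr_t)addr & DEMO_COLOR_MASK) == 0);  /* alignment assumption */
    return (uintptr_t)addr | (color & DEMO_COLOR_MASK);
}

/* Strip the color bits to get the address back. */
demo_root *demo_address(uintptr_t tagged) {
    return (demo_root *)(tagged & ~DEMO_COLOR_MASK);
}

/* Keep only the color bits. */
unsigned demo_color(uintptr_t tagged) {
    return (unsigned)(tagged & DEMO_COLOR_MASK);
}

/* Replace the color while leaving the address untouched. */
uintptr_t demo_set_color(uintptr_t tagged, unsigned color) {
    return (tagged & ~DEMO_COLOR_MASK) | (color & DEMO_COLOR_MASK);
}
```

Because black is 0, a black node's tagged word is just its address, which is why the default color costs nothing to set.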