Garbage collection mechanism in PHP 5.3

Source: Internet
Author: User

    struct _zval_struct {
        /* Variable information */
        zvalue_value value;      /* value */
        zend_uint refcount__gc;
        zend_uchar type;         /* active type */
        zend_uchar is_ref__gc;
    };

Compared with versions before PHP 5.3, the reference-count field refcount and the is-reference field is_ref both gain a __gc suffix (refcount__gc and is_ref__gc) for the new garbage collection mechanism. The PHP source code relies heavily on macros, which is one of its distinctive traits. These macros act as an interface layer that hides the underlying implementation; ALLOC_ZVAL is one example. Before PHP 5.3, this macro called the PHP memory manager's allocation function emalloc directly, and the size of the allocation was determined by the variable type. With the garbage collection mechanism, ALLOC_ZVAL instead allocates the new garbage collection unit structure: every allocation has the same size, namely the size of the zval_gc_info struct, and once the memory is allocated the macro initializes the garbage collection information of that struct.

    /* The following macroses override macroses from zend_alloc.h */
    #undef  ALLOC_ZVAL
    #define ALLOC_ZVAL(z)                               \
        do {                                            \
            (z) = (zval *) emalloc(sizeof(zval_gc_info)); \
            GC_ZVAL_INIT(z);                            \
        } while (0)

zend.h includes zend_gc.h (#include "zend_gc.h") after it has included zend_alloc.h, so the definitions in zend_gc.h replace ALLOC_ZVAL and the other macros from zend_alloc.h. The key change in the new macros is the size and content of the allocation: what used to be a plain memory allocation now also carries garbage collection information. All of it is contained in the zval_gc_info structure:

    typedef struct _zval_gc_info {
        zval z;
        union {
            gc_root_buffer       *buffered;
            struct _zval_gc_info *next;
        } u;
    } zval_gc_info;
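The layout above places the zval first, which is what lets a pointer to the larger structure double as a zval pointer. The following is a minimal sketch of that idea outside the engine. All demo_ names are hypothetical stand-ins, the field set is simplified, and malloc is used in place of emalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for zval; the real structure holds a value union. */
typedef struct demo_zval {
    long value;
    unsigned int refcount;
    unsigned char type;
    unsigned char is_ref;
} demo_zval;

typedef struct demo_root_buffer demo_root_buffer; /* opaque in this sketch */

typedef struct demo_zval_gc_info {
    demo_zval z;                          /* must remain the first member */
    union {
        demo_root_buffer *buffered;       /* root buffer slot, or NULL */
        struct demo_zval_gc_info *next;   /* link in a list of zvals to free */
    } u;
} demo_zval_gc_info;

/* Allocate the larger structure but hand it out as a plain zval pointer. */
demo_zval *demo_alloc_zval(void)
{
    demo_zval_gc_info *info = malloc(sizeof *info);
    if (info == NULL) {
        return NULL;
    }
    info->z.value = 0;
    info->z.refcount = 1;
    info->z.type = 0;
    info->z.is_ref = 0;
    info->u.buffered = NULL;              /* not in the root buffer yet */
    return &info->z;
}

/* Recover the GC bookkeeping from a zval pointer produced above. */
demo_zval_gc_info *demo_gc_info(demo_zval *z)
{
    /* Valid only because z sits at offset 0 of demo_zval_gc_info. */
    assert(offsetof(demo_zval_gc_info, z) == 0);
    return (demo_zval_gc_info *) z;
}
```

Code that only knows about demo_zval keeps working unchanged, while the collector can reach its extra field through demo_gc_info.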

The first member of zval_gc_info is a zval, so the structure begins at the same address as the zval it contains. A zval_gc_info pointer can therefore be cast to a zval pointer and used as an ordinary zval. After the zval member comes a union, u, with two members: buffered, a pointer to a gc_root_buffer, and next, a pointer to another zval_gc_info. The first refers to the root node that the garbage collector has buffered for this zval; the second links to the next node in a zval_gc_info list. A zval tracked by the collector is thus either a buffered root or a list node, and the union reflects this. ALLOC_ZVAL calls GC_ZVAL_INIT to initialize the zval_gc_info that takes the place of the zval. It sets u.buffered to NULL; this field holds a value only while the zval sits in the garbage collection buffer and is NULL otherwise. Every PHP variable exists as a zval, so replacing zval with zval_gc_info is what integrates garbage collection into the existing engine.

Garbage collection is enabled by default in PHP 5.3 and can be turned off through the configuration file. The setting is zend.enable_gc. If php.ini does not already contain this setting, add zend.enable_gc = 0 or zend.enable_gc = Off to disable it. Besides php.ini, the functions gc_enable() and gc_disable() turn the mechanism on and off at run time, with the same effect as changing the configuration setting. PHP also provides gc_collect_cycles(), which forces a collection cycle even when the root buffer is not full.

Whether garbage collection is enabled is backed by specific code and fields in the PHP source. The file zend.c contains the following code:

    static ZEND_INI_MH(OnUpdateGCEnabled) /* {{{ */
    {
        OnUpdateBool(entry, new_value, new_value_length, mh_arg1, mh_arg2, mh_arg3, stage TSRMLS_CC);
        if (GC_G(gc_enabled)) {
            gc_init(TSRMLS_C);
        }
        return SUCCESS;
    }
    /* }}} */
    ZEND_INI_BEGIN()
        ZEND_INI_ENTRY("error_reporting", NULL, ZEND_INI_ALL, OnUpdateErrorReporting)
        STD_ZEND_INI_BOOLEAN("zend.enable_gc", "1", ZEND_INI_ALL, OnUpdateGCEnabled, gc_enabled, zend_gc_globals, gc_globals)
    #ifdef ZEND_MULTIBYTE
        STD_ZEND_INI_BOOLEAN("detect_unicode", "1", ZEND_INI_ALL, OnUpdateBool, detect_unicode, zend_compiler_globals, compiler_globals)
    #endif
    ZEND_INI_END()

The handler for zend.enable_gc is ZEND_INI_MH(OnUpdateGCEnabled). If garbage collection is enabled, that is, if GC_G(gc_enabled) is true, it calls gc_init to initialize the mechanism. gc_init is defined at line 121 of Zend/zend_gc.c. It checks whether garbage collection is enabled and, if so, initializes the whole mechanism: it calls malloc directly to allocate room for 10000 gc_root_buffer entries, which form the root buffer. The value 10000 is hard-coded as the macro GC_ROOT_BUFFER_MAX_ENTRIES; changing it means editing the source and recompiling PHP. After preallocating the memory, gc_init calls gc_reset to reset the global variables the mechanism uses. For example, it sets the run count (gc_runs) and the collected-garbage count (collected) to 0, and it makes the prev and next pointers of the doubly linked list's head node point to the head itself. The garbage collector uses further global variables besides these; the most frequently used ones are described below:

    typedef struct _zend_gc_globals {
        zend_bool gc_enabled;          /* whether the garbage collection mechanism is enabled */
        zend_bool gc_active;           /* whether a collection is in progress */
        gc_root_buffer *buf;           /* preallocated array of buffers (10000 entries by default) */
        gc_root_buffer roots;          /* list of possible roots of cycles */
        gc_root_buffer *unused;        /* list of unused buffers */
        gc_root_buffer *first_unused;  /* pointer to first unused buffer */
        gc_root_buffer *last_unused;   /* pointer to last unused buffer; marks the end of the usable range */
        zval_gc_info *zval_to_free;    /* temporary list of zvals to free */
        zval_gc_info *free_list;       /* temporary variable: start of the list to be freed */
        zval_gc_info *next_to_free;    /* temporary variable: next zval to be freed */
        zend_uint gc_runs;             /* number of gc runs */
        zend_uint collected;           /* number of garbage zvals collected */
        /* ... remaining fields omitted ... */
    } zend_gc_globals;
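The initialization described above (preallocate the root buffer, then reset the counters and point the list head at itself) can be sketched as follows. This is a simplified illustration, not the engine's code: the demo_ names are hypothetical, the globals are passed explicitly instead of through GC_G, and only the fields needed here are included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The article's value of GC_ROOT_BUFFER_MAX_ENTRIES. */
#define DEMO_ROOT_BUFFER_MAX_ENTRIES 10000

typedef struct demo_root {
    struct demo_root *prev;
    struct demo_root *next;
} demo_root;

typedef struct demo_gc_globals {
    int gc_enabled;
    demo_root *buf;           /* preallocated array of root slots */
    demo_root roots;          /* head of the circular doubly linked list */
    demo_root *unused;        /* recycled slots */
    demo_root *first_unused;  /* next never-used slot */
    demo_root *last_unused;   /* one past the last slot */
    unsigned int gc_runs;
    unsigned int collected;
} demo_gc_globals;

/* Reset counters and make the list head point to itself (an empty list). */
void demo_gc_reset(demo_gc_globals *g)
{
    assert(g != NULL);
    g->gc_runs = 0;
    g->collected = 0;
    g->roots.next = &g->roots;
    g->roots.prev = &g->roots;
    g->unused = NULL;
    g->first_unused = g->buf;
    g->last_unused = (g->buf != NULL) ? g->buf + DEMO_ROOT_BUFFER_MAX_ENTRIES : NULL;
}

/* Allocate the buffer once, and only when collection is enabled.
 * Returns 0 on success and -1 if the allocation fails. */
int demo_gc_init(demo_gc_globals *g)
{
    if (g->buf == NULL && g->gc_enabled) {
        g->buf = malloc(sizeof(demo_root) * DEMO_ROOT_BUFFER_MAX_ENTRIES);
        if (g->buf == NULL) {
            return -1;
        }
        demo_gc_reset(g);
    }
    return 0;
}
```

Making the head node its own neighbor means insertion and removal never need a special case for an empty list.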

When unset is applied to a variable to release the memory it occupies (in practice this may only decrement the reference count), the entry for the variable name is deleted from the current symbol hash table. Once that is done, a destructor is called for the item removed from the symbol table: zval_dtor for temporary variables and zval_ptr_dtor for ordinary variables. unset does not appear in PHP's function list because it is a language construct. Its opcodes begin with ZEND_UNSET (for example ZEND_UNSET_VAR), and the related implementation can be found in Zend/zend_vm_execute.h. zval_ptr_dtor is not a function either, but a macro that looks like one. In Zend/zend_variables.h this macro maps to the function _zval_ptr_dtor, which is defined at line 424 of Zend/zend_execute_API.c:

    ZEND_API void _zval_ptr_dtor(zval **zval_ptr ZEND_FILE_LINE_DC) /* {{{ */
    {
    #if DEBUG_ZEND>=2
        printf("Reducing refcount for %x (%x): %d->%d\n", *zval_ptr, zval_ptr, Z_REFCOUNT_PP(zval_ptr), Z_REFCOUNT_PP(zval_ptr) - 1);
    #endif
        Z_DELREF_PP(zval_ptr);
        if (Z_REFCOUNT_PP(zval_ptr) == 0) {
            TSRMLS_FETCH();
            if (*zval_ptr != &EG(uninitialized_zval)) {
                GC_REMOVE_ZVAL_FROM_BUFFER(*zval_ptr);
                zval_dtor(*zval_ptr);
                efree_rel(*zval_ptr);
            }
        } else {
            TSRMLS_FETCH();
            if (Z_REFCOUNT_PP(zval_ptr) == 1) {
                Z_UNSET_ISREF_PP(zval_ptr);
            }
            GC_ZVAL_CHECK_POSSIBLE_ROOT(*zval_ptr);
        }
    }
    /* }}} */

The code shows clearly what the zval destructor does with the reference count. There are two cases. If the reference count was 1, so that it is 0 after the decrement, the variable is destroyed immediately; if it had been placed in the garbage collection buffer, it is removed from the buffer first. If the reference count was greater than 1, so that it is still greater than 0 after the decrement, the variable is handed to the garbage collector as a possible root. In this second case, if the count has dropped to exactly 1, the variable's is_ref flag is also cleared.
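The two cases can be condensed into a small sketch that only tracks the count and the reference flag. The names demo_zval, demo_outcome and demo_ptr_dtor are hypothetical, and the actual freeing and buffering are reduced to a return value so the decision logic stands alone.

```c
#include <assert.h>

typedef enum demo_outcome {
    DEMO_FREED,          /* count reached 0: the value would be destroyed and freed */
    DEMO_POSSIBLE_ROOT   /* count still above 0: candidate for the cycle collector */
} demo_outcome;

typedef struct demo_zval {
    unsigned int refcount;
    unsigned char is_ref;
} demo_zval;

/* Decrement the count and report which branch of the destructor applies. */
demo_outcome demo_ptr_dtor(demo_zval *z)
{
    assert(z->refcount > 0);
    z->refcount--;
    if (z->refcount == 0) {
        /* The engine would remove the zval from the root buffer here,
         * destroy its value, and release the memory. */
        return DEMO_FREED;
    }
    if (z->refcount == 1) {
        /* Only one holder is left, so the reference flag is cleared. */
        z->is_ref = 0;
    }
    return DEMO_POSSIBLE_ROOT;
}
```

A value whose count stays above zero is only a candidate: it may still be reachable, which is why it is buffered for later inspection rather than freed.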

The GC_ZVAL_CHECK_POSSIBLE_ROOT macro adds a variable to the garbage collection buffer. The function behind it, gc_zval_check_possible_root, only considers arrays and objects; for variables of these two types it calls gc_zval_possible_root.

    ZEND_API void gc_zval_possible_root(zval *zv TSRMLS_DC)
    {
        if (UNEXPECTED(GC_G(free_list) != NULL &&
                       GC_ZVAL_ADDRESS(zv) != NULL &&
                       GC_ZVAL_GET_COLOR(zv) == GC_BLACK) &&
                       (GC_ZVAL_ADDRESS(zv) < GC_G(buf) ||
                        GC_ZVAL_ADDRESS(zv) >= GC_G(last_unused))) {
            /* The given zval is a garbage that is going to be deleted by
             * currently running GC */
            return;
        }
        if (zv->type == IS_OBJECT) {
            GC_ZOBJ_CHECK_POSSIBLE_ROOT(zv);
            return;
        }
        GC_BENCH_INC(zval_possible_root);
        if (GC_ZVAL_GET_COLOR(zv) != GC_PURPLE) {
            GC_ZVAL_SET_PURPLE(zv);
            if (!GC_ZVAL_ADDRESS(zv)) {
                gc_root_buffer *newRoot = GC_G(unused);
                if (newRoot) {
                    GC_G(unused) = newRoot->prev;
                } else if (GC_G(first_unused) != GC_G(last_unused)) {
                    newRoot = GC_G(first_unused);
                    GC_G(first_unused)++;
                } else {
                    if (!GC_G(gc_enabled)) {
                        GC_ZVAL_SET_BLACK(zv);
                        return;
                    }
                    zv->refcount__gc++;
                    gc_collect_cycles(TSRMLS_C);
                    zv->refcount__gc--;
                    newRoot = GC_G(unused);
                    if (!newRoot) {
                        return;
                    }
                    GC_ZVAL_SET_PURPLE(zv);
                    GC_G(unused) = newRoot->prev;
                }
                newRoot->next = GC_G(roots).next;
                newRoot->prev = &GC_G(roots);
                GC_G(roots).next->prev = newRoot;
                GC_G(roots).next = newRoot;
                GC_ZVAL_SET_ADDRESS(zv, newRoot);
                newRoot->handle = 0;
                newRoot->u.pz = zv;
                GC_BENCH_INC(zval_buffered);
                GC_BENCH_INC(root_buf_length);
                GC_BENCH_PEAK(root_buf_peak, root_buf_length);
            }
        }
    }
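The slot selection in the function above follows a fixed order: reuse a recycled slot, otherwise take the next never-used slot, otherwise the buffer is full. The sketch below isolates that logic with a deliberately tiny buffer. The demo_ names are hypothetical, and where the engine would run a collection on a full buffer, this version simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CAPACITY 2   /* tiny on purpose; the article's buffer holds 10000 */

typedef struct demo_root {
    struct demo_root *prev;
    struct demo_root *next;
    void *payload;
} demo_root;

typedef struct demo_buffer {
    demo_root slots[DEMO_CAPACITY];
    demo_root roots;          /* head of the circular doubly linked list */
    demo_root *unused;        /* recycled slots, chained through prev */
    demo_root *first_unused;  /* next never-used slot */
    demo_root *last_unused;   /* one past the last slot */
} demo_buffer;

void demo_buffer_init(demo_buffer *b)
{
    assert(b != NULL);
    b->roots.next = &b->roots;
    b->roots.prev = &b->roots;
    b->roots.payload = NULL;
    b->unused = NULL;
    b->first_unused = b->slots;
    b->last_unused = b->slots + DEMO_CAPACITY;
}

/* Returns the slot used, or NULL when the buffer is full. */
demo_root *demo_add_root(demo_buffer *b, void *payload)
{
    demo_root *slot = b->unused;
    if (slot != NULL) {
        b->unused = slot->prev;              /* pop a recycled slot */
    } else if (b->first_unused != b->last_unused) {
        slot = b->first_unused++;            /* take a fresh slot */
    } else {
        return NULL;                         /* full: the engine collects here */
    }
    /* Link the slot in right after the list head. */
    slot->next = b->roots.next;
    slot->prev = &b->roots;
    b->roots.next->prev = slot;
    b->roots.next = slot;
    slot->payload = payload;
    return slot;
}

/* Unlink a slot and push it onto the recycled list. */
void demo_remove_root(demo_buffer *b, demo_root *slot)
{
    slot->next->prev = slot->prev;
    slot->prev->next = slot->next;
    slot->prev = b->unused;
    b->unused = slot;
}
```

Preferring recycled slots keeps the fresh region of the array untouched for as long as possible, so a full buffer really means that every slot is in use.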

As noted above, gc_zval_check_possible_root only handles arrays and objects. Inside gc_zval_possible_root, variables of object type are passed to the GC_ZOBJ_CHECK_POSSIBLE_ROOT macro. For the other collectable type the function proceeds as follows. It first checks whether the zval is garbage that the currently running collection is already going to delete, and returns immediately if so, which avoids redundant work. It then handles object nodes and returns without performing the later steps. Next it checks whether the node is already marked purple; a purple node is not added to the buffer again, which ensures that each node enters the buffer only once.

The node is then marked purple, which records that it has been added to the buffer and need not be added again. Next a slot for the new node is found; if the buffer is full, a garbage collection run is performed first. Finally the new node is linked into the buffer's doubly linked list.

When the buffer is full, gc_zval_possible_root calls gc_collect_cycles to collect garbage. Its key steps correspond to the algorithm in the official documentation. Step B: the algorithm runs a depth-first search from every possible root and decrements the reference count of each variable container it reaches by 1. To make sure the same container is not decremented twice, containers that have already been decremented are marked grey. Step C: the algorithm again runs a depth-first search from each root node and checks the reference count of each variable container. If the count is 0, the container is marked white. If it is greater than 0, the decrement made by the depth-first search is undone from that point on (the reference count is incremented by 1 again) and the containers are marked black again. Step D, at line 630, is the last step: the algorithm walks the root buffer, removes the variable container roots (zval roots) from it, and checks which containers were marked white in the previous step. Every container marked white is freed. In gc_collect_cycles() -> gc_collect_roots() -> zval_collect_white(), the nodes marked white are added to the global zval_to_free list, which later operations use.

PHP's garbage collector marks nodes with four colors during execution. GC_WHITE (white) marks garbage; GC_PURPLE (purple) marks a node that has been put into the buffer; GC_GREY (grey) marks a node whose refcount has been decremented once; GC_BLACK (black) is the default color of a normal node. The color definitions and the macros that operate on them are as follows:

    #define GC_COLOR  0x03
    #define GC_BLACK  0x00
    #define GC_WHITE  0x01
    #define GC_GREY   0x02
    #define GC_PURPLE 0x03
    #define GC_ADDRESS(v) \
        ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_ADDRESS(v, a) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & GC_COLOR) | ((zend_uintptr_t)(a))))
    #define GC_GET_COLOR(v) \
        (((zend_uintptr_t)(v)) & GC_COLOR)
    #define GC_SET_COLOR(v, c) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & ~GC_COLOR) | (c)))
    #define GC_SET_BLACK(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_PURPLE(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) | GC_PURPLE))
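These macros store the color in the two lowest bits of a pointer, which are always zero for a suitably aligned address. The sketch below shows the same packing with functions instead of macros, using uintptr_t from the C standard library in place of zend_uintptr_t. The demo_ names are hypothetical, and the technique assumes the pointed-to type is aligned to at least 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_COLOR  ((uintptr_t) 0x03)
#define DEMO_BLACK  0x00
#define DEMO_WHITE  0x01
#define DEMO_GREY   0x02
#define DEMO_PURPLE 0x03

/* Contains pointers, so its address is at least 4-byte aligned on common
 * platforms and the two low address bits are free to carry a color. */
typedef struct demo_root {
    void *a;
    void *b;
} demo_root;

/* Strip the color bits to get the real address. */
demo_root *demo_address(demo_root *tagged)
{
    return (demo_root *) ((uintptr_t) tagged & ~DEMO_COLOR);
}

/* Read only the color bits. */
unsigned demo_color(demo_root *tagged)
{
    return (unsigned) ((uintptr_t) tagged & DEMO_COLOR);
}

/* Keep the address and replace the color. */
demo_root *demo_with_color(demo_root *tagged, unsigned color)
{
    return (demo_root *) (((uintptr_t) tagged & ~DEMO_COLOR) | (color & DEMO_COLOR));
}

/* Keep the color and replace the address. */
demo_root *demo_with_address(demo_root *tagged, demo_root *addr)
{
    assert(((uintptr_t) addr & DEMO_COLOR) == 0);  /* low bits must be free */
    return (demo_root *) (((uintptr_t) tagged & DEMO_COLOR) | (uintptr_t) addr);
}
```

Because black is 0x00, a freshly initialized NULL field already reads as a black node with no buffer slot, which matches black being the default color.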

Bit-level flags like the ones above are used frequently in the PHP source, for example in memory management, because they are compact and efficient. In database design, however, this technique is usually not a good fit for fields; a more intuitive and readable representation is preferable.
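To tie the colors back to steps B to D described earlier, here is a minimal sketch of the three passes on a toy object graph. It is an illustration under simplifying assumptions, not the engine's implementation: every node has at most two children, freeing is represented by a flag, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_BLACK, DEMO_WHITE, DEMO_GREY, DEMO_PURPLE };
#define DEMO_MAX_CHILDREN 2

typedef struct demo_node {
    int refcount;
    int color;
    int freed;   /* stands in for actually releasing the memory */
    struct demo_node *child[DEMO_MAX_CHILDREN];
} demo_node;

/* Step B: depth-first, decrement each child's count once; grey = visited. */
static void demo_mark_grey(demo_node *n)
{
    size_t i;
    if (n->color == DEMO_GREY) return;
    n->color = DEMO_GREY;
    for (i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount--;
            demo_mark_grey(n->child[i]);
        }
    }
}

/* Undo the decrements below a node that turned out to be alive. */
static void demo_scan_black(demo_node *n)
{
    size_t i;
    n->color = DEMO_BLACK;
    for (i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount++;
            if (n->child[i]->color != DEMO_BLACK) demo_scan_black(n->child[i]);
        }
    }
}

/* Step C: a count of 0 means white (garbage); above 0 means restore to black. */
static void demo_scan(demo_node *n)
{
    size_t i;
    if (n->color != DEMO_GREY) return;
    if (n->refcount > 0) {
        demo_scan_black(n);
        return;
    }
    n->color = DEMO_WHITE;
    for (i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) demo_scan(n->child[i]);
    }
}

/* Step D: free every white node reachable from a root. */
static int demo_collect_white(demo_node *n)
{
    size_t i;
    int count = 1;
    if (n->color != DEMO_WHITE) return 0;
    n->color = DEMO_BLACK;
    n->freed = 1;
    for (i = 0; i < DEMO_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) count += demo_collect_white(n->child[i]);
    }
    return count;
}

/* Run the three passes over the buffered roots; returns how many nodes were freed. */
int demo_collect(demo_node **roots, size_t count)
{
    size_t i;
    int total = 0;
    assert(roots != NULL || count == 0);
    for (i = 0; i < count; i++) demo_mark_grey(roots[i]);
    for (i = 0; i < count; i++) demo_scan(roots[i]);
    for (i = 0; i < count; i++) total += demo_collect_white(roots[i]);
    return total;
}
```

With two nodes that reference only each other, both counts fall to 0 in step B and both nodes are freed. If something outside the cycle still holds one of them, its count stays above 0, step C restores the counts, and nothing is freed.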
