Garbage collection mechanism in PHP 5.3

Source: Internet
Author: User
Tags php source code
    struct _zval_struct {
        /* Variable information */
        zvalue_value value;      /* value */
        zend_uint refcount__gc;
        zend_uchar type;         /* active type */
        zend_uchar is_ref__gc;
    };

Compared with earlier versions, PHP 5.3 appends __gc to the reference count field (refcount) and the is-reference field (is_ref) for the new garbage collection mechanism. Heavy use of macros is a distinctive feature of the PHP source code. These macros act as an interface layer that hides the underlying implementation. Take the ALLOC_ZVAL macro: before PHP 5.3 it directly called emalloc, the allocation function of PHP's memory manager, and the size of the allocation was determined by the type of the variable. With garbage collection introduced, ALLOC_ZVAL allocates the new garbage collection unit structure instead. Every allocation now has the same size, sizeof(zval_gc_info), and the garbage collection fields of the structure are initialized right after the memory is allocated.

    /* The following macroses override macroses from zend_alloc.h */
    #undef  ALLOC_ZVAL
    #define ALLOC_ZVAL(z)                               \
        do {                                            \
            (z) = (zval*)emalloc(sizeof(zval_gc_info)); \
            GC_ZVAL_INIT(z);                            \
        } while (0)

zend.h includes zend_gc.h at line 749 (#include "zend_gc.h"), after it has included zend_alloc.h at line 237, so the new macro replaces the ALLOC_ZVAL defined in zend_alloc.h. The key change is in the size of the allocation and in how it is made: garbage collection bookkeeping is added to what used to be a plain memory allocation, and all of it lives in the zval_gc_info structure:

    typedef struct _zval_gc_info {
        zval z;
        union {
            gc_root_buffer       *buffered;
            struct _zval_gc_info *next;
        } u;
    } zval_gc_info;
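One consequence of this layout is that a pointer to the larger structure can be treated as a pointer to its leading zval, and back again. The following is a minimal standalone sketch of that idea, not code from the PHP source: the toy_ types and functions are hypothetical stand-ins with simplified fields, and plain malloc is used in place of emalloc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for zval; the real structure holds a zvalue_value. */
typedef struct toy_zval {
    long          value;
    unsigned int  refcount__gc;
    unsigned char type;
    unsigned char is_ref__gc;
} toy_zval;

/* Opaque stand-in for gc_root_buffer; only pointers to it are needed here. */
typedef struct toy_root_buffer toy_root_buffer;

typedef struct toy_zval_gc_info {
    toy_zval z;                            /* must stay the first member */
    union {
        toy_root_buffer         *buffered;
        struct toy_zval_gc_info *next;
    } u;
} toy_zval_gc_info;

/* Mirrors the idea behind ALLOC_ZVAL + GC_ZVAL_INIT: allocate the larger
 * structure, mark it as "not in the root buffer", and hand back a pointer
 * to the embedded zval. */
static toy_zval *toy_alloc_zval(void)
{
    toy_zval_gc_info *info = malloc(sizeof *info);
    if (info == NULL) {
        return NULL;
    }
    info->z.value = 0;
    info->z.refcount__gc = 1;
    info->z.type = 0;
    info->z.is_ref__gc = 0;
    info->u.buffered = NULL;
    return &info->z;
}

/* Recover the GC bookkeeping from a plain zval pointer. This is valid
 * because z sits at offset 0 of toy_zval_gc_info. */
static toy_zval_gc_info *toy_gc_info(toy_zval *zv)
{
    return (toy_zval_gc_info *)zv;
}
```

Code that only knows about toy_zval keeps working unchanged, while the collector can reach its extra field through toy_gc_info(). That is the property that lets the GC data ride along with every existing zval.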

Every variable stored in a zval container is now allocated as a zval_gc_info structure. Because the zval is its first member, it is aligned with the start of the allocated memory, so a zval_gc_info pointer can be cast to a zval pointer and used as a zval. The zval field is followed by a union, u, which holds either the buffered field (a gc_root_buffer pointer) or the next field (a zval_gc_info pointer). The first points to the root buffer entry that the garbage collector holds for this zval; the second points to the next node in a zval_gc_info list. Either way, the node's place in the collector's bookkeeping can be represented here. ALLOC_ZVAL calls GC_ZVAL_INIT to initialize the zval_gc_info that now stands in for the zval; it sets u.buffered to NULL. This field has a value only while the zval is in the garbage collection buffer; otherwise it stays NULL. Since all PHP variables exist as zvals, replacing zval with zval_gc_info integrates garbage collection into the existing system.

Garbage collection is enabled by default in PHP 5.3 and can be disabled in the configuration file through the zend.enable_gc setting. The setting does not appear in php.ini by default; to disable the feature, add zend.enable_gc=0 or zend.enable_gc=off to php.ini. Besides changing zend.enable_gc in php.ini, you can turn garbage collection on or off at run time by calling gc_enable() or gc_disable(), which has the same effect as changing the setting. PHP also provides gc_collect_cycles() to force cycle collection even when the root buffer is not yet full.

The PHP source code contains a few operations and fields related to enabling garbage collection. The following code is from zend.c:

    static ZEND_INI_MH(OnUpdateGCEnabled) /* {{{ */
    {
        OnUpdateBool(entry, new_value, new_value_length, mh_arg1, mh_arg2, mh_arg3, stage TSRMLS_CC);
        if (GC_G(gc_enabled)) {
            gc_init(TSRMLS_C);
        }
        return SUCCESS;
    }
    /* }}} */
    ZEND_INI_BEGIN()
        ZEND_INI_ENTRY("error_reporting", NULL, ZEND_INI_ALL, OnUpdateErrorReporting)
        STD_ZEND_INI_BOOLEAN("zend.enable_gc", "1", ZEND_INI_ALL, OnUpdateGCEnabled, gc_enabled, zend_gc_globals, gc_globals)
    #ifdef ZEND_MULTIBYTE
        STD_ZEND_INI_BOOLEAN("detect_unicode", "1", ZEND_INI_ALL, OnUpdateBool, detect_unicode, zend_compiler_globals, compiler_globals)
    #endif
    ZEND_INI_END()

The handler for zend.enable_gc is ZEND_INI_MH(OnUpdateGCEnabled). If garbage collection is enabled, that is, GC_G(gc_enabled) is true, it calls gc_init to initialize the garbage collection mechanism. gc_init is at line 121 of Zend/zend_gc.c. It checks whether garbage collection is enabled and, if so, initializes the whole mechanism: it calls malloc directly to allocate space for 10,000 gc_root_buffer entries, which back the root buffer list. The value 10000 is hard-coded as the macro GC_ROOT_BUFFER_MAX_ENTRIES; changing it means editing the source and recompiling PHP. After preallocating the memory, gc_init calls gc_reset to reset the global variables the mechanism uses: for example, it sets the number of GC runs (gc_runs) and the number of collected garbage items (collected) to 0, and points the prev and next pointers of the doubly linked list's head node at the head itself. The globals used by the garbage collection mechanism, including the most frequently used ones, are described below:

    typedef struct _zend_gc_globals {
        zend_bool         gc_enabled;    /* whether garbage collection is enabled */
        zend_bool         gc_active;     /* whether a collection is in progress */
        gc_root_buffer   *buf;           /* preallocated array of buffers, 10000 entries by default */
        gc_root_buffer    roots;         /* head of the list of possible roots of cycles */
        gc_root_buffer   *unused;        /* list of unused buffers */
        gc_root_buffer   *first_unused;  /* pointer to the first unused buffer */
        gc_root_buffer   *last_unused;   /* pointer to the last unused buffer, marks the end */
        zval_gc_info     *zval_to_free;  /* temporary list of zvals to free */
        zval_gc_info     *free_list;     /* temporary: head of the list being freed */
        zval_gc_info     *next_to_free;  /* temporary: next zval to be freed */
        zend_uint         gc_runs;       /* number of GC runs */
        zend_uint         collected;     /* number of garbage items collected */
        /* omitted... */
    } zend_gc_globals;
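The initialization and reset steps described before this structure can be sketched in a few lines. This is a simplified model under stated assumptions rather than the code in Zend/zend_gc.c: the toy_ names are hypothetical, only a subset of the globals is modelled, and the globals are passed explicitly instead of going through GC_G().

```c
#include <stdlib.h>

#define TOY_ROOT_BUFFER_MAX_ENTRIES 10000  /* the value the article cites */

typedef struct toy_root_buffer {
    struct toy_root_buffer *prev;
    struct toy_root_buffer *next;
} toy_root_buffer;

typedef struct toy_gc_globals {
    int              gc_enabled;
    toy_root_buffer *buf;           /* preallocated array of entries */
    toy_root_buffer  roots;         /* head of the doubly linked root list */
    toy_root_buffer *unused;        /* recycled entries */
    toy_root_buffer *first_unused;  /* next never-used entry in buf */
    toy_root_buffer *last_unused;   /* one past the last entry in buf */
    unsigned int     gc_runs;
    unsigned int     collected;
} toy_gc_globals;

/* Reset the counters and make the root list empty: an empty circular
 * list is a head node whose prev and next both point at itself. */
static void toy_gc_reset(toy_gc_globals *g)
{
    g->gc_runs = 0;
    g->collected = 0;
    g->roots.next = &g->roots;
    g->roots.prev = &g->roots;
    g->unused = NULL;
    if (g->buf != NULL) {
        g->first_unused = g->buf;
        g->last_unused = g->buf + TOY_ROOT_BUFFER_MAX_ENTRIES;
    } else {
        g->first_unused = NULL;
        g->last_unused = NULL;
    }
}

/* Allocate the buffer once, and only if collection is enabled.
 * Returns 0 on success and -1 if the allocation fails. */
static int toy_gc_init(toy_gc_globals *g)
{
    if (g->buf == NULL && g->gc_enabled) {
        g->buf = malloc(sizeof(toy_root_buffer) * TOY_ROOT_BUFFER_MAX_ENTRIES);
        if (g->buf == NULL) {
            return -1;
        }
        toy_gc_reset(g);
    }
    return 0;
}
```

Keeping first_unused and last_unused as a pair of pointers into one preallocated array means that handing out a fresh entry is a pointer increment, with no further allocation while the program runs.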

When unset is used to release a variable (possibly only decrementing its reference count), the entry for the variable name is removed from the current symbol table's hash table. After that, a destructor is called on the entry removed from the symbol table: temporary variables go through zval_dtor, ordinary variables through zval_ptr_dtor. unset does not appear among PHP's functions because it is a language construct. It compiles to the ZEND_UNSET opcodes (such as ZEND_UNSET_VAR), whose handlers can be found in Zend/zend_vm_execute.h. zval_ptr_dtor is not a function either, only a macro that looks like one. In Zend/zend_variables.h this macro maps to the function _zval_ptr_dtor, which is defined at line 424 of Zend/zend_execute_API.c:

    ZEND_API void _zval_ptr_dtor(zval **zval_ptr ZEND_FILE_LINE_DC) /* {{{ */
    {
    #if DEBUG_ZEND>=2
        printf("Reducing refcount for %x (%x): %d->%d\n", *zval_ptr, zval_ptr, Z_REFCOUNT_PP(zval_ptr), Z_REFCOUNT_PP(zval_ptr) - 1);
    #endif
        Z_DELREF_PP(zval_ptr);
        if (Z_REFCOUNT_PP(zval_ptr) == 0) {
            TSRMLS_FETCH();
            if (*zval_ptr != &EG(uninitialized_zval)) {
                GC_REMOVE_ZVAL_FROM_BUFFER(*zval_ptr);
                zval_dtor(*zval_ptr);
                efree_rel(*zval_ptr);
            }
        } else {
            TSRMLS_FETCH();
            if (Z_REFCOUNT_PP(zval_ptr) == 1) {
                Z_UNSET_ISREF_PP(zval_ptr);
            }
            GC_ZVAL_CHECK_POSSIBLE_ROOT(*zval_ptr);
        }
    }
    /* }}} */

The code shows the zval destruction process clearly. Depending on the reference count, one of two things happens. If the reference count was 1, so that it is 0 after the decrement, the variable is destroyed immediately; if it is currently held in the root buffer, it is removed from the buffer first. If the reference count was greater than 1, so that it is still greater than 0 after the decrement, the variable is put into the garbage buffer as a possible root; in addition, if the count has dropped to 1, its is_ref flag is cleared.
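The two branches can be condensed into a small decision function. This is only an illustrative model: toy_zval, toy_dtor_action and toy_zval_ptr_dtor are hypothetical names, and instead of freeing memory or touching the root buffer the function reports which branch would be taken.

```c
typedef enum toy_dtor_action {
    TOY_FREE_NOW,       /* refcount reached 0: leave the root buffer and free */
    TOY_POSSIBLE_ROOT   /* refcount still above 0: candidate root of a cycle */
} toy_dtor_action;

typedef struct toy_zval {
    unsigned int  refcount;
    unsigned char is_ref;
} toy_zval;

/* Decrement the reference count and report what the destructor would do. */
static toy_dtor_action toy_zval_ptr_dtor(toy_zval *zv)
{
    zv->refcount--;
    if (zv->refcount == 0) {
        return TOY_FREE_NOW;
    }
    if (zv->refcount == 1) {
        /* With a single holder left, the value no longer acts as a reference. */
        zv->is_ref = 0;
    }
    return TOY_POSSIBLE_ROOT;
}
```

The second branch is where cycles come from: a value whose count drops but stays above zero may be kept alive only by references that lead back to itself, so it is remembered for a later check.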

The operation that puts a variable into the garbage buffer is GC_ZVAL_CHECK_POSSIBLE_ROOT. This is also a macro; the corresponding function is gc_zval_check_possible_root, which considers only arrays and objects for garbage collection. For array and object variables it calls the gc_zval_possible_root function.

    ZEND_API void gc_zval_possible_root(zval *zv TSRMLS_DC)
    {
        if (UNEXPECTED(GC_G(free_list) != NULL &&
                       GC_ZVAL_ADDRESS(zv) != NULL &&
                       GC_ZVAL_GET_COLOR(zv) == GC_BLACK) &&
                       (GC_ZVAL_ADDRESS(zv) < GC_G(buf) ||
                        GC_ZVAL_ADDRESS(zv) >= GC_G(last_unused))) {
            /* The given zval is a garbage that is going to be deleted by
             * currently running GC */
            return;
        }
        if (zv->type == IS_OBJECT) {
            GC_ZOBJ_CHECK_POSSIBLE_ROOT(zv);
            return;
        }
        GC_BENCH_INC(zval_possible_root);
        if (GC_ZVAL_GET_COLOR(zv) != GC_PURPLE) {
            GC_ZVAL_SET_PURPLE(zv);
            if (!GC_ZVAL_ADDRESS(zv)) {
                gc_root_buffer *newRoot = GC_G(unused);
                if (newRoot) {
                    GC_G(unused) = newRoot->prev;
                } else if (GC_G(first_unused) != GC_G(last_unused)) {
                    newRoot = GC_G(first_unused);
                    GC_G(first_unused)++;
                } else {
                    if (!GC_G(gc_enabled)) {
                        GC_ZVAL_SET_BLACK(zv);
                        return;
                    }
                    zv->refcount__gc++;
                    gc_collect_cycles(TSRMLS_C);
                    zv->refcount__gc--;
                    newRoot = GC_G(unused);
                    if (!newRoot) {
                        return;
                    }
                    GC_ZVAL_SET_PURPLE(zv);
                    GC_G(unused) = newRoot->prev;
                }
                newRoot->next = GC_G(roots).next;
                newRoot->prev = &GC_G(roots);
                GC_G(roots).next->prev = newRoot;
                GC_G(roots).next = newRoot;
                GC_ZVAL_SET_ADDRESS(zv, newRoot);
                newRoot->handle = 0;
                newRoot->u.pz = zv;
                GC_BENCH_INC(zval_buffered);
                GC_BENCH_INC(root_buf_length);
                GC_BENCH_PEAK(root_buf_peak, root_buf_length);
            }
        }
    }

As noted earlier, gc_zval_check_possible_root only considers arrays and objects. Inside gc_zval_possible_root, object variables are handed to the GC_ZOBJ_CHECK_POSSIBLE_ROOT macro. For the other variables eligible for garbage collection, the function proceeds as follows. It first checks whether the zval is garbage that the currently running collection is about to free, and returns immediately if so. It then handles object nodes through the macro and returns without doing anything further. Finally it checks whether the node is already marked purple: a purple node is not added to the buffer again, which guarantees that a node is added to the buffer only once.
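The slot handling inside gc_zval_possible_root (reuse a recycled entry, otherwise take a fresh one, otherwise report that the buffer is full, then link the entry in right after the list head) can be tried out on a tiny standalone model. The names below are hypothetical, the buffer holds 4 entries instead of 10000, and the "buffer full" case simply returns NULL where the real function would run gc_collect_cycles.

```c
#include <stddef.h>

#define TOY_MAX_ROOTS 4  /* tiny on purpose */

typedef struct toy_root {
    struct toy_root *prev;
    struct toy_root *next;
    void            *pz;   /* the buffered value */
} toy_root;

typedef struct toy_gc {
    toy_root  buf[TOY_MAX_ROOTS];
    toy_root  roots;         /* head of the circular doubly linked list */
    toy_root *unused;        /* recycled entries, chained through prev */
    toy_root *first_unused;  /* next never-used entry in buf */
    toy_root *last_unused;   /* one past the end of buf */
} toy_gc;

static void toy_gc_setup(toy_gc *g)
{
    g->roots.prev = &g->roots;
    g->roots.next = &g->roots;
    g->roots.pz = NULL;
    g->unused = NULL;
    g->first_unused = g->buf;
    g->last_unused = g->buf + TOY_MAX_ROOTS;
}

/* Returns the entry used, or NULL when the buffer is full. */
static toy_root *toy_add_root(toy_gc *g, void *pz)
{
    toy_root *slot = g->unused;
    if (slot != NULL) {
        g->unused = slot->prev;              /* pop a recycled entry */
    } else if (g->first_unused != g->last_unused) {
        slot = g->first_unused++;            /* take a fresh entry */
    } else {
        return NULL;                         /* full: a collection is due */
    }
    /* Insert directly after the list head. */
    slot->next = g->roots.next;
    slot->prev = &g->roots;
    g->roots.next->prev = slot;
    g->roots.next = slot;
    slot->pz = pz;
    return slot;
}

/* Unlink an entry and push it on the recycled list. */
static void toy_remove_root(toy_gc *g, toy_root *slot)
{
    slot->next->prev = slot->prev;
    slot->prev->next = slot->next;
    slot->prev = g->unused;
    g->unused = slot;
}
```

Because the list is circular with a dedicated head node, insertion and removal need no special cases for an empty list or for the first and last entries.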

Marking the node purple records that it has been added to the buffer and will not be added again. The function then finds a slot for the new node, running a collection if the buffer is full, and links the node into the doubly linked list of roots. When the buffer is full, gc_zval_possible_root calls gc_collect_cycles to perform the collection. Its key steps are:

Line 628 (gc_mark_roots) corresponds to step B of the algorithm in the official documentation. Using a depth-first search from every possible root, it decrements the reference count of each variable container it finds by 1. Containers that have already been decremented are marked grey, so that no container is decremented twice.

Line 629 (gc_scan_roots) is step C. The algorithm again runs a depth-first search from each root and checks the reference count of every variable container. If the count is 0, the container is marked white. If it is greater than 0, the decrement is reverted from that point on, again depth-first (the reference counts are incremented by 1), and the containers are marked black again.

Line 630 (gc_collect_roots) is the last step, D. The algorithm walks the root buffer, removes the zval roots from it, and checks which containers were marked white in the previous step. Every container marked white is freed.

In gc_collect_cycles(), gc_collect_roots() and zval_collect_white() we can see that nodes marked white are added to the global zval_to_free list, which is used in the subsequent steps.

During execution, PHP's garbage collector marks state with four colors:

  • GC_WHITE (white): garbage
  • GC_PURPLE (purple): the node has been put into the buffer
  • GC_GREY (grey): the refcount has been decremented by 1
  • GC_BLACK (black): the default color, a normal node

The color definitions and the macros that operate on them are as follows:

    #define GC_COLOR  0x03
    #define GC_BLACK  0x00
    #define GC_WHITE  0x01
    #define GC_GREY   0x02
    #define GC_PURPLE 0x03
    #define GC_ADDRESS(v) \
        ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_ADDRESS(v, a) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & GC_COLOR) | ((zend_uintptr_t)(a))))
    #define GC_GET_COLOR(v) \
        (((zend_uintptr_t)(v)) & GC_COLOR)
    #define GC_SET_COLOR(v, c) \
        (v) = ((gc_root_buffer*)((((zend_uintptr_t)(v)) & ~GC_COLOR) | (c)))
    #define GC_SET_BLACK(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) & ~GC_COLOR))
    #define GC_SET_PURPLE(v) \
        (v) = ((gc_root_buffer*)(((zend_uintptr_t)(v)) | GC_PURPLE))
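These macros store the color in the two lowest bits of the buffer pointer itself. The following standalone sketch shows the same trick with hypothetical toy_ helpers written as functions instead of macros. It assumes, as the macros do, that the addresses being tagged are aligned to at least 4 bytes, so the two low bits of a real address are always zero, and that a null pointer converts to the integer 0.

```c
#include <stdint.h>

#define TOY_COLOR  0x03u  /* mask covering the two color bits */
#define TOY_BLACK  0x00u
#define TOY_WHITE  0x01u
#define TOY_GREY   0x02u
#define TOY_PURPLE 0x03u

/* Strip the color bits to get the real address back. */
static void *toy_address(void *tagged)
{
    return (void *)((uintptr_t)tagged & ~(uintptr_t)TOY_COLOR);
}

/* Read only the color bits. */
static unsigned toy_color(void *tagged)
{
    return (unsigned)((uintptr_t)tagged & TOY_COLOR);
}

/* Replace the color, keeping the address. */
static void *toy_set_color(void *tagged, unsigned color)
{
    return (void *)(((uintptr_t)tagged & ~(uintptr_t)TOY_COLOR) | color);
}

/* Replace the address, keeping the color. */
static void *toy_set_address(void *tagged, void *addr)
{
    return (void *)(((uintptr_t)tagged & TOY_COLOR) | (uintptr_t)addr);
}
```

One detail worth noticing is that purple has the same value as the mask (0x03), which is why GC_SET_PURPLE can simply OR the bits in, and black is 0, which is why GC_SET_BLACK only needs to clear them.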

Marking state in spare bits like this is used frequently in the PHP source code, for example in memory management; it is efficient and saves space. In database design, however, we would usually not encode fields this way and should prefer a more intuitive, more readable representation.
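To tie the four colors back to the collection steps B, C and D described earlier, here is a toy version of the three passes on a small object graph. It is a sketch of the general mark-grey, scan, collect-white scheme under simplifying assumptions, not the code in Zend/zend_gc.c: the toy_ names are hypothetical, each node has at most two outgoing references, there is a single root, and white nodes are counted instead of being freed.

```c
#include <stddef.h>

enum { TOY_BLACK, TOY_WHITE, TOY_GREY, TOY_PURPLE };

#define TOY_MAX_CHILDREN 2

typedef struct toy_node {
    int              refcount;
    int              color;
    struct toy_node *child[TOY_MAX_CHILDREN];  /* NULL marks an empty slot */
} toy_node;

/* Step B: depth-first, subtract one reference for every edge followed.
 * Grey means "already visited", so no node is processed twice. */
static void toy_mark_grey(toy_node *n)
{
    size_t i;
    if (n->color == TOY_GREY) {
        return;
    }
    n->color = TOY_GREY;
    for (i = 0; i < TOY_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount--;
            toy_mark_grey(n->child[i]);
        }
    }
}

/* Undo step B below a node that turned out to be referenced from outside. */
static void toy_scan_black(toy_node *n)
{
    size_t i;
    n->color = TOY_BLACK;
    for (i = 0; i < TOY_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            n->child[i]->refcount++;
            if (n->child[i]->color != TOY_BLACK) {
                toy_scan_black(n->child[i]);
            }
        }
    }
}

/* Step C: a grey node with no references left is white (garbage);
 * one that still has references is restored to black. */
static void toy_scan(toy_node *n)
{
    size_t i;
    if (n->color != TOY_GREY) {
        return;
    }
    if (n->refcount > 0) {
        toy_scan_black(n);
        return;
    }
    n->color = TOY_WHITE;
    for (i = 0; i < TOY_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            toy_scan(n->child[i]);
        }
    }
}

/* Step D: count the white nodes reachable from n (real code frees them). */
static int toy_collect_white(toy_node *n)
{
    size_t i;
    int found = 1;
    if (n->color != TOY_WHITE) {
        return 0;
    }
    n->color = TOY_BLACK;  /* prevents counting a node twice */
    for (i = 0; i < TOY_MAX_CHILDREN; i++) {
        if (n->child[i] != NULL) {
            found += toy_collect_white(n->child[i]);
        }
    }
    return found;
}

/* Run the three passes from one possible root. */
static int toy_collect_cycle(toy_node *root)
{
    toy_mark_grey(root);
    toy_scan(root);
    return toy_collect_white(root);
}
```

For two nodes that reference only each other, every count reaches zero in step B and both nodes end up white. If one of them is also referenced from outside the pair, its count stays above zero, step C restores the original counts, and nothing is collected.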
