Reference and counting rules for PHP variables

Source: Internet
Author: User

Internal references and counts of variables

Inside the engine, a PHP variable is stored in a zval structure, which holds the variable's type and value; both were introduced in the previous article on variable values and types. The structure has two additional fields. The first is is_ref (named is_ref__gc in PHP 5.3.2), a Boolean flag that marks whether the variable is a reference, which lets the engine tell ordinary variables from reference variables. In PHP code you create a reference variable with the & operator, and the zval behind a reference variable has is_ref set to 1. The second is refcount (named refcount__gc in PHP 5.3.2), a counter of how many variable names point to the zval container. When refcount reaches 0, no variable points to the zval any more and it can be freed; this is one of the engine's memory optimizations. Consider the following code:

    $a = "Hello World";
    $b = $a;

The code has two variables, $a and $b. $a is assigned to $b by ordinary assignment, so $b has the same value as $a, and later changes to $b have no effect on $a. If $a and $b were backed by two separate zvals at this point, memory would plainly be wasted, and the PHP developers do not let that happen. In fact $a and $b point to the same zval. Its type is string, its value is "Hello World", and because two variables, $a and $b, point to it, its refcount is 2. Because this was an ordinary assignment, is_ref is 0. Sharing the zval saves memory.

After $a = "Hello World" executes, the zval for $a is: a: (refcount=1, is_ref=0) = "Hello World"

After $b = $a executes, the zval for $a is: a: (refcount=2, is_ref=0) = "Hello World"

The following changes the previous code:

    $a = "Hello World";
    $b = &$a;

This assigns $a to $b by reference.

After $a = "Hello World" executes, the zval for $a is: a: (refcount=1, is_ref=0) = "Hello World"

After $b = &$a executes, the zval for $a is: a: (refcount=2, is_ref=1) = "Hello World"

The is_ref field is now 1, which marks the zval shared by $a and $b as a reference. That covers the basics of references and reference counts inside the engine; the separation of variables is described next.
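The refcount and is_ref fields are not directly visible from a script, but the effect of a reference assignment is. The following is a minimal sketch of that effect; the variable names $afterWriteToB and $afterWriteToA are introduced here only for illustration.

```php
<?php
// Reference assignment: $a and $b become two names for one value container.
$a = "Hello World";
$b = &$a;

// A write through $b is visible through $a ...
$b = "changed via b";
$afterWriteToB = $a;

// ... and a write through $a is visible through $b.
$a = "changed via a";
$afterWriteToA = $b;

echo $afterWriteToB, "\n"; // changed via b
echo $afterWriteToA, "\n"; // changed via a
```

Neither name is the "original": once the reference is made, both are equal aliases of the same value.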

Variable separation: copy on write

Take the first snippet above, where $a is assigned to $b in the ordinary way and both variables internally point to the same zval. If we then change $b to "new string", $a still holds "Hello World":

    $a = "Hello World";
    $b = $a;
    $b = "new string";

$a and $b clearly pointed to the same zval, so why does $a stay unchanged when $b is modified? This is the copy-on-write technique: put simply, when $b is assigned a new value, it is separated from the zval it shared. After the separation, $a and $b point to different zvals.

A well-known application of copy on write is in the kernels of UNIX-like operating systems. When a process calls fork to create a child process, parent and child have the same address-space contents. Older systems had the child copy the parent's entire address space at fork time, which is expensive for large programs. Worse, the child very often calls exec right after fork to run a different program, so the address space that took so long to copy is replaced before it is ever used, which is a plain waste of resources. Later systems therefore use copy on write: after fork, the child's address space simply refers to the parent's, and only when the child needs to write to it is a private copy made for the child, usually one memory page at a time. If the child calls exec immediately, nothing has been copied from the parent's address space, which saves memory and time.

When $b separates from the zval that $a points to, that zval's refcount drops by 1, from 2 to 1, meaning a single variable, $a, still points to it. $b now points to a new zval whose refcount is 1 and whose value is the string "new string". The process is roughly as follows:

    $a = "Hello World";   // a: (refcount=1, is_ref=0) = "Hello World"
    $b = $a;              // a,b: (refcount=2, is_ref=0) = "Hello World"
    $b = "new string";    // a: (refcount=1, is_ref=0) = "Hello World"
                          // b: (refcount=1, is_ref=0) = "new string" (separation occurs)

The separation rule can be summarized as: when an ordinary variable a (is_ref=0) is assigned a new value by ordinary assignment and the zval it points to has a refcount greater than 1, a new zval is allocated for a and the old zval's refcount is reduced by 1.
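One way to observe the delayed copy from a script is to watch memory usage around the assignment and around the first write. The sketch below is only an illustration: it uses a large array instead of a string so the copy is easy to see, the variable names are made up for the example, and the exact byte counts depend on the PHP version and build, so none are quoted here.

```php
<?php
// Build a large array at run time so that a real copy is easy to spot.
$a = range(1, 100000);

$before = memory_get_usage();
$b = $a;                           // ordinary assignment: expected to share storage
$afterAssign = memory_get_usage();

$b[] = 0;                          // first write to $b: expected to trigger the separation
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;      // memory taken by the assignment
$writeCost  = $afterWrite - $afterAssign;  // memory taken by the first write

printf("assignment cost:  %d bytes\n", $assignCost);
printf("first write cost: %d bytes\n", $writeCost);
```

If the assignment copied the data eagerly, the first number would be the large one. With copy on write, the cost shows up only at the write, and $a keeps its original 100000 elements.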

That was ordinary assignment. For reference assignment, the sequence looks like this:

    $a = "Hello World";   // a: (refcount=1, is_ref=0) = "Hello World"
    $b = &$a;             // a,b: (refcount=2, is_ref=1) = "Hello World"
    $b = "new string";    // a,b: (refcount=2, is_ref=1) = "new string"

As you can see, assigning a new value to a reference zval does not trigger a separation. A separation can still happen when reference variables are involved, but at a different time:

    1. With ordinary assignment, the separation happens at $b = "new string", that is, when the variable is assigned a new value.

    2. With reference assignment, the separation may happen at $b = &$a, that is, when the reference is created.

Case 1 needs no further explanation. Case 2 says only that a separation may happen: whether it does depends on the refcount of the zval $a currently points to. In the code above, $a's zval has refcount=1 when $b = &$a runs, so no separation is needed. If its refcount were 2, a zval would have to be separated out. For example:

    $a = "Hello World";
    $c = $a;
    $b = &$a;

When the reference assignment runs, the zval $a points to has refcount=2, because $a and $c both point to it. So $b = &$a triggers a separation, which creates a new zval with is_ref=1 and a refcount of 2, since both $a and $b point to it. The original zval's refcount drops by 1. In the end only $c points to Zval1 (value "Hello World", is_ref=0), while $a and $b point to Zval2 (value "Hello World", is_ref=1). A change to $c operates on Zval1 and a change to $a or $b operates on Zval2, which is exactly how references are supposed to behave.

The process is roughly as follows:

    $a = "Hello World";   // a: (refcount=1, is_ref=0) = "Hello World"
    $c = $a;              // a,c: (refcount=2, is_ref=0) = "Hello World"
    $b = &$a;             // c: (refcount=1, is_ref=0) = "Hello World"
                          // a,b: (refcount=2, is_ref=1) = "Hello World" (separation occurs)
    $b = "new string";    // c: (refcount=1, is_ref=0) = "Hello World"
                          // a,b: (refcount=2, is_ref=1) = "new string"

Imagine what would happen without this separation: $a, $b, and $c would all point to the same zval, so a change to $b would also change $c, which clearly contradicts how PHP is supposed to behave.
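The visible result of that separation can be checked from a script. This is a small sketch using the same three variable names as the text:

```php
<?php
$a = "Hello World";
$c = $a;             // ordinary assignment: $c takes the current value
$b = &$a;            // reference assignment: $a and $b now form a reference pair

$b = "new string";   // write through the reference pair

echo $a, "\n";       // new string
echo $b, "\n";       // new string
echo $c, "\n";       // Hello World
```

$c was never part of the reference pair, so the write through $b leaves it alone.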

The separation rule can be stated as: when an ordinary variable a (is_ref=0) is assigned by reference to another variable b and the refcount of a's zval is greater than 1, a must be separated first; the zval produced by the separation has is_ref equal to 1 and refcount equal to 2.

With the knowledge and separation rules above, readers should be able to work through other cases without difficulty. For example, when a reference variable a (is_ref=1) is assigned by reference to an ordinary variable b, the refcount of the zval b currently points to is reduced by 1, b is then pointed at a's zval, and the refcount of a's zval is increased by 1; no separation takes place.

Working through these rules alongside code makes the process easier to follow.
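The rules in this section can be collected into a toy model. The sketch below is not the engine's code: the Zval and SymbolTable classes and their methods are hypothetical names invented for illustration, written in ordinary PHP (assuming PHP 7.1 or later for the syntax), and they model only the PHP 5.3-era bookkeeping described here. How ordinary assignment treats a zval that is already a reference is not covered in the text, so the value copy in assign() is an assumption of the model.

```php
<?php
// Toy model only: hypothetical classes, not the engine's real data structures.
final class Zval
{
    public $value;
    public $refcount = 1;
    public $isRef = false;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

final class SymbolTable
{
    /** @var Zval[] variable name => zval it points to */
    public $vars = [];

    // Models: $name = <new value>
    public function write(string $name, $value): void
    {
        $zval = $this->vars[$name] ?? null;
        if ($zval !== null && ($zval->isRef || $zval->refcount === 1)) {
            $zval->value = $value;          // reference, or sole owner: write in place
            return;
        }
        if ($zval !== null) {
            $zval->refcount--;              // shared ordinary zval: separate from it
        }
        $this->vars[$name] = new Zval($value);
    }

    // Models: $target = $source (ordinary assignment)
    public function assign(string $target, string $source): void
    {
        $src = $this->vars[$source];
        $dst = $this->vars[$target] ?? null;
        if ($dst === $src) {
            return;                         // already the same zval
        }
        if ($src->isRef || ($dst !== null && $dst->isRef)) {
            $this->write($target, $src->value); // model assumption: copy the value
            return;
        }
        $this->release($target);
        $src->refcount++;                   // share the zval, no copy yet
        $this->vars[$target] = $src;
    }

    // Models: $target = &$source (reference assignment)
    public function assignRef(string $target, string $source): void
    {
        $src = $this->vars[$source];
        if (!$src->isRef && $src->refcount > 1) {
            $src->refcount--;               // separation at reference creation
            $src = new Zval($src->value);
            $this->vars[$source] = $src;
        }
        $src->isRef = true;
        $this->release($target);
        $src->refcount++;
        $this->vars[$target] = $src;
    }

    // Drop one name's hold on its zval.
    private function release(string $name): void
    {
        if (isset($this->vars[$name])) {
            $this->vars[$name]->refcount--; // at 0 the real engine would free the zval
            unset($this->vars[$name]);
        }
    }
}

// Replay the three-variable example from the text.
$t = new SymbolTable();
$t->write('a', 'Hello World');   // a: (refcount=1, is_ref=0)
$t->assign('c', 'a');            // a,c: (refcount=2, is_ref=0)
$t->assignRef('b', 'a');         // c: (1, 0)   a,b: (2, 1), separation happens here
$t->write('b', 'new string');    // c keeps "Hello World"; a,b hold "new string"

// Reference variable a assigned by reference to an ordinary variable d: no separation.
$t->write('d', 'other');
$t->assignRef('d', 'a');         // a,b,d: (refcount=3, is_ref=1)
```

Keeping the counters as plain public properties is a deliberate simplification so the state after each step can be inspected directly.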

The role of unset

unset() is not a function but a language construct. You can see this in the opcodes the compiler generates: unset compiles to a dedicated opcode, not a function call. So what does unset actually do? The opcode's handler shows it: its main job is to remove the named symbol from the current symbol table. Executing unset($a) in global code, for example, removes the symbol a from the global symbol table. The global symbol table is a hash table created with a destructor for its entries, so when a is deleted from the symbol table, that destructor is called on the entry the symbol pointed to (here, a pointer to a zval). The destructor's main job is to reduce the refcount of a's zval by 1 and, if the refcount reaches 0, free the zval. So calling unset does not always release the variable's memory: the zval is freed only if no other variable still points to it; otherwise unset merely decrements the refcount.
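The script-level consequence is that unset() removes a name, not necessarily the value behind it. A minimal sketch, with variable names chosen only for this example:

```php
<?php
// Case 1: two names after an ordinary assignment, then unset one of them.
$a = "Hello World";
$b = $a;
unset($a);
$aExists = isset($a);    // false: the name a is gone
$bValue  = $b;           // the value is still reachable through $b

// Case 2: a reference pair, then unset one of the two names.
$x = "shared";
$y = &$x;
unset($x);
$xExists = isset($x);    // false: only the name x was removed
$y .= " value";          // $y still works on the value

var_dump($aExists, $bValue, $xExists, $y);
```

In both cases the remaining name keeps the value alive, which matches the rule that the zval is freed only when nothing points to it any more.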
