47-Reference counting vs. copy-on-write

Source: Internet
Author: User

PHP has to serve many requests at the same time, so allocating and freeing memory must be done carefully; a small slip can turn into a serious bug. Beyond allocating and freeing memory safely, PHP should also use as little memory as possible. A server may handle thousands of requests per second, so to keep overall performance up, each operation should use only the memory it needs and avoid unnecessary copies of the same data. Consider the following PHP code:

<?php
$a = 'Hello NowaMagic!';
$b = $a;
unset($a);
?>

After the first statement executes, PHP creates the variable $a and allocates 17 bytes to hold the string "Hello NowaMagic!" (16 characters plus the terminating NUL character). The second statement assigns $a to $b, and the third releases $a.

If PHP copied memory on every variable assignment, the second statement would have to allocate the same amount of memory again to hold the duplicate, and the CPU would have to spend cycles copying the bytes, adding to its load. Then the third statement releases $a, and the copy we just made looks pointless. Had we known that $a was about to go away, we could simply have let $b take over $a's memory instead of copying anything. If a few bytes seem like nothing, imagine $a holding the contents of a 10 MB or 20 MB file; that would be a real waste of resources.
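To make the cost concrete, here is a minimal C sketch comparing the two strategies. It is not PHP's allocator; counted_alloc, assign_by_copy, assign_by_sharing and the bytes_allocated counter are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Running total of bytes requested, so the cost of each strategy is visible. */
size_t bytes_allocated = 0;

/* Hypothetical helper: allocate memory and record how much was requested. */
void *counted_alloc(size_t n)
{
    void *p = malloc(n);
    assert(p != NULL);
    bytes_allocated += n;
    return p;
}

/* Eager strategy: every assignment duplicates the string, NUL included. */
char *assign_by_copy(const char *src)
{
    size_t n = strlen(src) + 1;
    char *dst = counted_alloc(n);
    memcpy(dst, src, n);
    return dst;
}

/* Sharing strategy: assignment only hands out another pointer to the
 * same bytes, so nothing is allocated or copied. */
const char *assign_by_sharing(const char *src)
{
    return src;
}
```

With the 16-character example string, each call to assign_by_copy adds 17 bytes to bytes_allocated, while assign_by_sharing leaves the counter unchanged. Sharing alone is not safe, though, because nothing records how many names still depend on the bytes.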

Don't worry, PHP is smart!

As mentioned earlier, the kernel stores a variable's name and its value in two different places. The value lives in a zval structure that knows nothing about the name, while the name (here, a) lives in the symbol table, and the symbol table entry points to the zval. In the example above, $a is a string that is added to the symbol table with zend_hash_add and then assigned to $b, so both variables have the same content. If both point to exactly the same content, is there something we can optimize?

Suppose $a and $b both simply point at the location of the string "Hello NowaMagic!" in memory. The third line, unset($a);, releases $a. If unset did not know that $b is using the same value and freed the memory directly, $b would be left pointing at freed memory: a logic error that could even crash the process.

Of course, PHP does not let this happen. Recall the four members of a zval: value, type, is_ref__gc, and refcount__gc. We are already familiar with value and type; now the other two come into play, and here we focus on refcount__gc. When a variable is first created, the refcount__gc member of its zval is initialized to 1, because only that one variable uses it. When the variable is assigned to another variable, refcount__gc is incremented to 2, because two variables now share the zval.

Inside the kernel, the code above roughly corresponds to the following:

zval *helloval;
MAKE_STD_ZVAL(helloval);
ZVAL_STRING(helloval, "Hello NowaMagic!", 1);
zend_hash_add(EG(active_symbol_table), "a", sizeof("a"),
              &helloval, sizeof(zval*), NULL);
/* This line is special: we explicitly increment the refcount of helloval.
 * (ZVAL_ADDREF is the older macro name; PHP 5.3 spells it Z_ADDREF_P.) */
ZVAL_ADDREF(helloval);
zend_hash_add(EG(active_symbol_table), "b", sizeof("b"),
              &helloval, sizeof(zval*), NULL);

Now when unset deletes $a, it removes $a's entry from the symbol table and goes on to clean up the value. It finds that the refcount of the zval is 2, meaning another variable is still using this zval, so unset merely decrements the refcount by 1 and leaves the value in place.
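The create, assign, and unset steps can be modeled in a few lines of plain C. This is a toy sketch under simplifying assumptions, not the Zend implementation: toy_zval, toy_new, toy_assign, toy_unset and the live_values counter are hypothetical names, and the "symbol table" is reduced to ordinary pointer variables.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a zval: one string value plus a reference count. */
typedef struct toy_zval {
    char *str;
    unsigned refcount;
} toy_zval;

/* Number of values currently allocated, to show when memory is released. */
int live_values = 0;

/* $a = 'text'; creates the value with a refcount of 1. */
toy_zval *toy_new(const char *text)
{
    toy_zval *v = malloc(sizeof *v);
    assert(v != NULL);
    v->str = malloc(strlen(text) + 1);
    assert(v->str != NULL);
    strcpy(v->str, text);
    v->refcount = 1;
    live_values++;
    return v;
}

/* $b = $a; shares the value and bumps the count; no bytes are copied. */
toy_zval *toy_assign(toy_zval *v)
{
    v->refcount++;
    return v;
}

/* unset($x); drops one name and frees the value only when no name is left. */
void toy_unset(toy_zval **slot)
{
    toy_zval *v = *slot;
    *slot = NULL;
    if (--v->refcount == 0) {
        free(v->str);
        free(v);
        live_values--;
    }
}
```

In this model, unsetting the first name after an assignment only lowers the count from 2 to 1, and the string stays readable through the second name. The memory is released when the last name is unset.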

Copy-on-write mechanism

Reference counting is a great way to save memory. But what happens if we modify $b and still need to keep using $a?

$a = 1;
$b = $a;
$b += 5;

Logically, after these statements run we expect $a to still be 1 and $b to be 6. We know that after the second statement the kernel saves memory by letting $a and $b share one zval. So how should the kernel carry out the change to $b in the third statement?

The answer is simple: the kernel first checks refcount__gc. If it is greater than 1, the kernel makes a copy of the original zval that belongs to $b alone and changes the value of that copy.

zval *get_var_and_separate(char *varname, int varname_len TSRMLS_DC)
{
    zval **varval, *varcopy;

    if (zend_hash_find(EG(active_symbol_table), varname, varname_len + 1,
                       (void**)&varval) == FAILURE) {
        /* The variable is not in the symbol table: return directly. */
        return NULL;
    }
    if ((*varval)->refcount < 2) {
        /* No other variable uses this zval, so there is nothing to separate. */
        return *varval;
    }
    /* Otherwise, copy the zval (its contents, not just the pointer). */
    MAKE_STD_ZVAL(varcopy);
    *varcopy = **varval;
    /* Duplicate any allocated structures within the zval. */
    zval_copy_ctor(varcopy);
    /* Remove the original variable from the symbol table.
     * This decreases the refcount of the old zval in the process. */
    zend_hash_del(EG(active_symbol_table), varname, varname_len + 1);
    /* Initialize the new zval's refcount, then add the variable back to the
     * symbol table so that its value is our new zval. */
    varcopy->refcount = 1;
    varcopy->is_ref = 0;
    zend_hash_add(EG(active_symbol_table), varname, varname_len + 1,
                  &varcopy, sizeof(zval*), NULL);
    /* Return the address of the new zval. */
    return varcopy;
}

The $b variable now has its own zval and is free to modify its value.
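The same separation step can be tried outside the engine with a toy model. The sketch below is an assumption-laden simplification, not Zend code: toy_zval, toy_new, toy_share, toy_separate and toy_add are hypothetical names, values are plain integers, and each variable is represented by a pointer slot instead of a symbol table entry.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy zval holding an integer value and a reference count. */
typedef struct toy_zval {
    long lval;
    unsigned refcount;
} toy_zval;

/* $a = n; */
toy_zval *toy_new(long n)
{
    toy_zval *v = malloc(sizeof *v);
    assert(v != NULL);
    v->lval = n;
    v->refcount = 1;
    return v;
}

/* $b = $a; share the zval instead of copying it. */
toy_zval *toy_share(toy_zval *v)
{
    v->refcount++;
    return v;
}

/* Return a zval that the variable in *slot may safely modify.
 * If the zval is shared, detach from it and give the slot a private copy. */
toy_zval *toy_separate(toy_zval **slot)
{
    toy_zval *v = *slot;
    if (v->refcount < 2) {
        return v;                 /* sole user: nothing to separate */
    }
    v->refcount--;                /* this variable leaves the shared zval */
    *slot = toy_new(v->lval);     /* private copy with refcount 1 */
    return *slot;
}

/* $x += n; always separate before writing. */
void toy_add(toy_zval **slot, long n)
{
    toy_separate(slot)->lval += n;
}
```

Running the equivalent of $a = 1; $b = $a; $b += 5; through these helpers leaves the first zval at 1 with a refcount of 1, while the second variable ends up with its own zval holding 6.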

Change-on-write

If a user explicitly lets a variable refer to another variable in a PHP script, how does our kernel handle it?

$a = 1;
$b = &$a;
$b += 5;

As seasoned PHP programmers, we all know that $a also becomes 6. When we change $b, the kernel sees that $b is a user-level reference to $a, so it can modify the zval behind $b directly without creating a separate zval for it, because it knows that both $a and $b are meant to see the change.

But how does the kernel know all this? Simply put, through the is_ref__gc member of the zval. This member has only two states, like an on/off switch, and indicates whether the zval is a user-defined reference at the PHP language level. After the first statement ($a = 1;) executes, the zval for $a has refcount__gc equal to 1 and is_ref__gc equal to 0. When the second statement ($b = &$a;) executes, refcount__gc grows to 2 as usual, and is_ref__gc is set to 1 at the same time.

Finally, when the third statement executes, the kernel again checks the zval behind $b to decide whether a new zval must be copied. This time no copy is needed, because the get_var_and_separate function shown above was a simplified version that left out one condition:

/* If this zval exists as a reference at the PHP language level, or its
 * refcount is less than 2, no copy is needed. */
if ((*varval)->is_ref || (*varval)->refcount < 2) {
    return *varval;
}

This time, although its refcount equals 2, it is not duplicated because its is_ref equals 1. The kernel will directly modify the value of this zval.
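A toy model shows how the extra is_ref condition changes the outcome. As before, this is a sketch under stated assumptions, not the engine's code: toy_zval, toy_new, toy_assign_ref, toy_separate and toy_add are hypothetical names, and toy_assign_ref assumes the zval is not already shared by plain copies.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy zval with the two bookkeeping members discussed in the text. */
typedef struct toy_zval {
    long lval;
    unsigned refcount;
    int is_ref;
} toy_zval;

/* $a = n; */
toy_zval *toy_new(long n)
{
    toy_zval *v = malloc(sizeof *v);
    assert(v != NULL);
    v->lval = n;
    v->refcount = 1;
    v->is_ref = 0;
    return v;
}

/* $b = &$a; both names now form a reference set on one zval. */
toy_zval *toy_assign_ref(toy_zval *v)
{
    assert(v->is_ref || v->refcount == 1);  /* simplifying assumption */
    v->is_ref = 1;
    v->refcount++;
    return v;
}

/* Separation check that includes the is_ref condition. */
toy_zval *toy_separate(toy_zval **slot)
{
    toy_zval *v = *slot;
    if (v->is_ref || v->refcount < 2) {
        return v;                 /* reference set or sole user: write in place */
    }
    v->refcount--;
    *slot = toy_new(v->lval);
    return *slot;
}

/* $x += n; */
void toy_add(toy_zval **slot, long n)
{
    toy_separate(slot)->lval += n;
}
```

For the equivalent of $a = 1; $b = &$a; $b += 5;, no copy is made: both variables still point at the same zval, its refcount stays at 2, and the shared value becomes 6.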

Separation anxiety

We have now seen how PHP handles copying and referencing variables separately, but what happens when the two are combined? Look at the following code:

$a = 1;
$b = $a;
$c = &$a;

Here it looks as though $a, $b, and $c should share one zval structure, with one pair forming a change-on-write set ($a, $c) and another pair forming a copy-on-write set ($a, $b). How could is_ref__gc and refcount__gc handle this tangled relationship correctly?

The answer: they cannot. In this case the value must be separated into two completely independent zvals. $a and $c share one zval and $b owns another; the values are equal, but at least two zvals are required. See the figure "Forcing separation on reference".



Similarly, the reverse order, taking the reference first and then making a plain copy, also creates ambiguity in the kernel, so a copy must be forced here as well:

$a = 1;
$b = &$a;
$c = $a;

Note that in both cases $b stays associated with the original zval, because at the moment the separation occurs the kernel does not yet know the name of the third variable.
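Forced separation can also be sketched with a toy model that covers both orders: a plain copy followed by a reference, and a reference followed by a plain copy. This is an illustrative simplification, not the engine's logic: toy_zval, toy_new, toy_assign_copy and toy_assign_ref are hypothetical names, and each variable is a plain pointer slot.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy zval with a value, a reference count, and the is_ref flag. */
typedef struct toy_zval {
    long lval;
    unsigned refcount;
    int is_ref;
} toy_zval;

/* $a = n; */
toy_zval *toy_new(long n)
{
    toy_zval *v = malloc(sizeof *v);
    assert(v != NULL);
    v->lval = n;
    v->refcount = 1;
    v->is_ref = 0;
    return v;
}

/* $dst = $src; returns the zval that $dst should point to. */
toy_zval *toy_assign_copy(toy_zval **src)
{
    toy_zval *v = *src;
    if (v->is_ref) {
        /* A reference set cannot double as a copy-on-write set,
         * so the new variable gets its own duplicate right away. */
        return toy_new(v->lval);
    }
    v->refcount++;
    return v;
}

/* $dst = &$src; returns the zval that both names should point to. */
toy_zval *toy_assign_ref(toy_zval **src)
{
    toy_zval *v = *src;
    if (!v->is_ref && v->refcount > 1) {
        /* Copy-on-write sharers keep the original zval;
         * $src moves to a fresh zval that becomes the reference set. */
        v->refcount--;
        v = toy_new(v->lval);
        *src = v;
    }
    v->is_ref = 1;
    v->refcount++;
    return v;
}
```

In both orders the model ends with two zvals: the reference pair shares one with is_ref set and a refcount of 2, and the remaining variable owns the other with a refcount of 1. The second variable stays on the original zval in either case.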
