Source: CSDN http://www.csdn.net/article/2014-09-15/2821685-exploring-of-the-php
Wang Shuai
Abstract: PHP is a simple yet powerful language with many features suited to the web. Starting with this installment of "Ask the Bottom", Wang Shuai works from practice to explain commonly used parts of the PHP kernel, beginning with the principle behind weakly typed variables.
PHP is a simple yet powerful language that provides many features suited to web development, among them weakly typed variables: under the weak type mechanism, a value of any type can be assigned to a variable.
PHP is executed by the Zend Engine (hereinafter ZE). ZE is written in C and implements the weak type mechanism at the lowest level. Its memory management uses optimizations such as copy-on-write and reference counting to reduce memory copying when variables are assigned.
This article explores the principle behind PHP's weak typing and, from the perspective of writing PHP extensions, describes how to manipulate PHP variables.
1. Types of variables in PHP
There are 8 types of variables in PHP:
- Standard types: boolean, integer, float, string
- Complex types: array, object
- Special types: resource, NULL
PHP does not strictly validate variable types. A variable needs no explicit type declaration; it takes its type when a value is assigned at run time, and it can be freely converted to another type. In the following example, $i is assigned values of several types without any prior declaration.
    <?php
    $i = 1;                           // int
    $i = 'Show me the money';         // string
    $i = 0.02;                        // float
    $i = array(1, 2, 3);              // array
    $i = new Exception('Test', 123);  // object
    $i = fopen('/tmp/aaa.txt', 'a');  // resource
    ?>
If you do not understand the weak type mechanism well, variable comparisons can produce surprising results.
    <?php
    $str1 = null;
    $str2 = false;
    echo $str1 == $str2 ? 'equal' : 'unequal';

    $str3 = '';
    $str4 = 0;
    echo $str3 == $str4 ? 'equal' : 'unequal'; // equal in PHP 5 and 7; PHP 8 treats '' and 0 as unequal

    $str5 = 0;
    $str6 = '0';
    echo $str5 == $str6 ? 'equal' : 'unequal';
    ?>
All three comparisons print 'equal', because PHP converts the operands internally before comparing them. To compare value and type together, use the identity operator with three equals signs (for example, $a === 0). Whether this seems ordinary or magical to you, let us dive into the PHP kernel and explore how PHP variables work.
2. Storage of variables and introduction of standard types
All PHP variables are implemented with the struct zval. Its definition can be found in Zend/zend.h:
    typedef struct _zval_struct zval;

    struct _zval_struct {
        zvalue_value value;      /* value */
        zend_uint refcount__gc;  /* reference count */
        zend_uchar type;         /* active type */
        zend_uchar is_ref__gc;   /* whether it is a reference */
    };
Property name | Meaning | Default value
refcount__gc | Reference count | 1
is_ref__gc | Whether the variable is a reference | 0
value | Stored value of the variable |
type | Concrete type of the variable |
refcount__gc and is_ref__gc hold the reference count and the reference flag of the variable. The type field identifies the variable's type and can be IS_NULL, IS_BOOL, IS_LONG, IS_DOUBLE, IS_STRING, IS_ARRAY, IS_OBJECT, or IS_RESOURCE. Depending on this type, PHP chooses which member of zvalue_value holds the data.
zvalue_value is the core of the weak type implementation. It is defined as follows:
    typedef union _zvalue_value {
        long lval;                /* long value */
        double dval;              /* double value */
        struct {
            char *val;
            int len;              /* this will always be set for strings */
        } str;                    /* string (always has length) */
        HashTable *ht;            /* an array */
        zend_object_value obj;    /* stores an object store handle, and handlers */
    } zvalue_value;
For a boolean, zval.type = IS_BOOL and the value is read from zval.value.lval as 1 or 0. For a string, zval.type = IS_STRING and the value is read from zval.value.str, a struct that holds the string pointer and its length.
C uses "\0" as the string terminator. A C string "Hello\0world" printed with printf therefore outputs only Hello, because printf treats "\0" as the end of the string. In PHP, the length of a string is tracked by the len member of zval.value.str, and string functions do not stop at "\0". PHP strings are therefore binary safe.
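The difference between a NUL-terminated C string and a length-counted string can be sketched in plain C. This is a minimal illustration rather than Zend code: counted_str, counted_str_make, and c_visible_length are hypothetical names that only mimic the val/len pair of zval.value.str.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the str member of zvalue_value. */
typedef struct {
    const char *val; /* pointer to the bytes */
    int len;         /* explicit length, like zval.value.str.len */
} counted_str;

/* Build a counted string from a buffer and an explicit byte count.
 * The buffer is not copied; the caller keeps ownership. */
static counted_str counted_str_make(const char *bytes, int len)
{
    counted_str s;
    assert(bytes != NULL && len >= 0);
    s.val = bytes;
    s.len = len;
    return s;
}

/* Length as C string functions see it: counting stops at the first '\0'.
 * The buffer must contain at least one '\0' for this call to be safe. */
static size_t c_visible_length(const counted_str *s)
{
    return strlen(s->val);
}
```

For counted_str_make("Hello\0world", 11), c_visible_length reports 5 while len stays 11, so code that trusts len can still reach the bytes after the embedded "\0".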
For NULL, only zval.type = IS_NULL is needed and no value is read.
By wrapping every value in a zval, PHP implements weak typing: ZE can access a value of any type through a zval.
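The idea of a type tag plus a union can be tried out in a few lines of standalone C. The sketch below is a simplified model under our own assumptions: my_zval, the MY_IS_* constants, and the helper functions are hypothetical and much smaller than the real zval, but they show how one variable can hold values of different types over time.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the IS_* constants; the values are arbitrary. */
enum { MY_IS_NULL, MY_IS_LONG, MY_IS_DOUBLE, MY_IS_BOOL };

/* Only one member of the union is meaningful at a time. */
typedef union {
    long lval;
    double dval;
} my_value;

/* The tag in type tells readers which union member to use. */
typedef struct {
    my_value value;
    unsigned char type;
} my_zval;

static void my_set_long(my_zval *zv, long l)
{
    assert(zv != NULL);
    zv->type = MY_IS_LONG;
    zv->value.lval = l;
}

static void my_set_double(my_zval *zv, double d)
{
    assert(zv != NULL);
    zv->type = MY_IS_DOUBLE;
    zv->value.dval = d;
}

static void my_set_bool(my_zval *zv, int b)
{
    assert(zv != NULL);
    zv->type = MY_IS_BOOL;
    zv->value.lval = b ? 1 : 0; /* booleans reuse lval, as described above */
}

static void my_set_null(my_zval *zv)
{
    assert(zv != NULL);
    zv->type = MY_IS_NULL; /* no value is stored for NULL */
}

/* Read the variable as a truth value by dispatching on the type tag. */
static int my_is_true(const my_zval *zv)
{
    switch (zv->type) {
    case MY_IS_BOOL:
    case MY_IS_LONG:
        return zv->value.lval != 0;
    case MY_IS_DOUBLE:
        return zv->value.dval != 0.0;
    default:
        return 0; /* NULL: the value is never read */
    }
}
```

The same my_zval can be set to a long, then a double, then NULL; every reader checks type first and only then touches the matching union member.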
3. Advanced types: Array and Object
The array is a very powerful data structure in PHP. Arrays come in two forms, indexed and associative, and have zval.type = IS_ARRAY. Each key of an associative array can hold data of any type. PHP arrays are implemented with a hash table, and the array's contents are stored in zval.value.ht.
The implementation of the PHP hash table is discussed later.
For the object type, zval.type = IS_OBJECT and the value is stored in zval.value.obj.
4. Special type: introduction to the resource type (Resource)
The resource type is quite special, with zval.type = IS_RESOURCE. Some data structures are hard to describe with the regular types, for example a file handle, which in C is a pointer; PHP has no concept of pointers, so such data cannot be expressed with the regular types. PHP therefore introduces the resource type, which wraps things like C file pointers in a zval. The value of a resource is an integer, and ZE uses that integer to look up the underlying data in the resource hash table.
Definition of the resource type:
    typedef struct _zend_rsrc_list_entry {
        void *ptr;
        int type;
        int refcount;
    } zend_rsrc_list_entry;
Here ptr points to the actual implementation of the resource, such as a file handle or a database connection structure; type is a tag that distinguishes different kinds of resources; and refcount is the reference count of the resource.
In the kernel, a resource is fetched with the ZEND_FETCH_RESOURCE macro:
    ZEND_FETCH_RESOURCE(con, type, zval *, default, resource_name, resource_type);
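The lookup from an integer handle to the underlying pointer can be sketched as follows. This is a toy model under stated assumptions: rsrc_table, rsrc_register, and rsrc_fetch are hypothetical names, and a small fixed array stands in for the HashTable that ZE actually uses.

```c
#include <assert.h>
#include <stddef.h>

#define RSRC_TABLE_SIZE 8

/* Same three fields as zend_rsrc_list_entry above. */
typedef struct {
    void *ptr;
    int type;
    int refcount;
} rsrc_entry;

/* Hypothetical fixed-size table; slot 0 is unused so IDs start at 1. */
typedef struct {
    rsrc_entry entries[RSRC_TABLE_SIZE];
    int next_id;
} rsrc_table;

static void rsrc_table_init(rsrc_table *t)
{
    assert(t != NULL);
    t->next_id = 1;
}

/* Store a pointer and return its integer handle, or 0 if the table is full. */
static int rsrc_register(rsrc_table *t, void *ptr, int type)
{
    int id;
    assert(t != NULL);
    if (t->next_id >= RSRC_TABLE_SIZE) {
        return 0;
    }
    id = t->next_id++;
    t->entries[id].ptr = ptr;
    t->entries[id].type = type;
    t->entries[id].refcount = 1;
    return id;
}

/* Resolve a handle; NULL if the ID is unknown or the type tag differs. */
static void *rsrc_fetch(const rsrc_table *t, int id, int expected_type)
{
    assert(t != NULL);
    if (id < 1 || id >= t->next_id) {
        return NULL;
    }
    if (t->entries[id].type != expected_type) {
        return NULL;
    }
    return t->entries[id].ptr;
}
```

The integer returned by rsrc_register plays the role of the resource value stored in the zval; checking the type tag in rsrc_fetch is what stops, say, a database handle from being used where a file handle is expected.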
5. Conversion of variable types
From what we know so far, the type of a variable is indicated by the zval.type field, and its contents are stored in zval.value according to zval.type. Converting a variable in PHP therefore takes only two steps: change the value or pointer in zval.value, then change zval.type. For the advanced types array, object, and resource, however, a conversion involves more work.
Variable conversion falls into the following cases:
5.1 Conversion between standard types
This case is relatively simple: follow the two steps above.
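The two steps can be sketched on a simplified model. mini_zval, the T_* constants, and mini_convert_to_long are hypothetical names for illustration; unlike the real conversion functions, this sketch does not free the old string and assumes the string is NUL-terminated.

```c
#include <assert.h>
#include <stdlib.h>

/* Arbitrary type tags for this sketch. */
enum { T_NULL, T_LONG, T_DOUBLE, T_STRING };

typedef struct {
    union {
        long lval;
        double dval;
        struct {
            char *val;
            int len;
        } str;
    } value;
    unsigned char type;
} mini_zval;

/* Convert in place: step 1 rewrites the value, step 2 rewrites the type. */
static void mini_convert_to_long(mini_zval *zv)
{
    long result;
    assert(zv != NULL);
    switch (zv->type) {
    case T_LONG:
        return; /* already the target type */
    case T_DOUBLE:
        result = (long) zv->value.dval; /* truncates toward zero */
        break;
    case T_STRING:
        /* Leading digits are parsed; the string memory stays with the caller. */
        result = strtol(zv->value.str.val, NULL, 10);
        break;
    default:
        result = 0; /* NULL becomes 0 */
        break;
    }
    zv->value.lval = result; /* step 1: change the value */
    zv->type = T_LONG;       /* step 2: change the type */
}
```

The old value has to be read before the union member is overwritten, which is why the result is computed into a local variable first.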
5.2 Conversion between standard types and the resource type
A resource can be treated as an int, which makes conversion to the standard types straightforward. After the conversion, the resource is closed or reclaimed.
    <?php
    $var = fopen('/tmp/aaa.txt', 'a'); // resource #1
    $var = (int) $var;
    var_dump($var);                    // prints the resource ID as an int, e.g. int(1)
    ?>
5.3 Conversion between standard types and complex types
Converting an array to int or float yields 1 for a non-empty array and 0 for an empty one; converting it to bool indicates whether the array has any elements; converting it to string yields 'Array' and raises a notice (a warning since PHP 8).
For more details, see the PHP manual: http://php.net/manual/en/language.types.type-juggling.php
5.4 Conversion between complex types
Arrays and objects can be converted to each other. When a value of any other type is converted to an object, an instance of the built-in class stdClass is created.
When writing PHP extensions, the PHP kernel provides a set of functions for type conversion:
    void convert_to_long(zval *pzval)
    void convert_to_double(zval *pzval)
    void convert_to_long_base(zval *pzval, int base)
    void convert_to_null(zval *pzval)
    void convert_to_boolean(zval *pzval)
    void convert_to_array(zval *pzval)
    void convert_to_object(zval *pzval)
    void convert_object_to_type(zval *pzval, convert_func_t converter)
The PHP kernel also provides a set of macros for finer-grained access to the values inside a zval:
Kernel API for accessing zval containers
Macro | Accesses
Z_LVAL(zval) | (zval).value.lval
Z_DVAL(zval) | (zval).value.dval
Z_STRVAL(zval) | (zval).value.str.val
Z_STRLEN(zval) | (zval).value.str.len
Z_ARRVAL(zval) | (zval).value.ht
Z_TYPE(zval) | (zval).type
Z_LVAL_P(zval_p) | (*zval_p).value.lval
Z_DVAL_P(zval_p) | (*zval_p).value.dval
Z_STRVAL_P(zval_p) | (*zval_p).value.str.val
Z_STRLEN_P(zval_p) | (*zval_p).value.str.len
Z_ARRVAL_P(zval_p) | (*zval_p).value.ht
Z_OBJ_HT_P(zval_p) | (*zval_p).value.obj.handlers
Z_LVAL_PP(zval_pp) | (**zval_pp).value.lval
Z_DVAL_PP(zval_pp) | (**zval_pp).value.dval
Z_STRVAL_PP(zval_pp) | (**zval_pp).value.str.val
Z_STRLEN_PP(zval_pp) | (**zval_pp).value.str.len
Z_ARRVAL_PP(zval_pp) | (**zval_pp).value.ht
6. Symbol table and scope of variables
The mapping from PHP variable names (the symbol table) to zval values is kept in a HashTable (hereinafter HT). HashTables are used widely in ZE: constants, variables, functions, and other language features are all organized with them, and PHP's array type is implemented with a HashTable as well.
As an example:
    <?php
    $var = 'Hello world';
    ?>
The variable name $var is stored in the variable symbol table, and the table entry maps to the zval structure that represents the type and value of $var. The kernel implements access to PHP variables through this hash mapping from names in the symbol table to zval addresses.
Why mention scope? Because variables inside a function are protected. By scope, PHP variables are divided into global and local variables, and PHP maintains one symbol-table HashTable per scope. When a function or class is created in PHP, ZE creates a new symbol table, marking the variables in that function or class as local. This is what protects local variables: code outside the function cannot access the variables inside it. When a PHP variable is created, ZE allocates a zval, sets its type and initial value, and adds the variable to the symbol table of the current scope so that the user can use it.
The kernel uses ZEND_SET_SYMBOL to set a variable:
    ZEND_SET_SYMBOL(EG(active_symbol_table), "foo", foo);
The _zend_executor_globals structure looks like this:
    /* Zend/zend_globals.h */
    struct _zend_executor_globals {
        /* ... omitted ... */
        HashTable symbol_table;          /* symbol table for global variables */
        HashTable *active_symbol_table;  /* symbol table for local variables */
        /* ... omitted ... */
    };
When writing PHP extensions, you can use the EG macro to access the PHP variable symbol tables. EG(symbol_table) accesses the symbol table of the global scope, and EG(active_symbol_table) accesses the symbol table of the current scope. The local table is stored as a pointer, so it can be passed as is to the functions that operate on a HashTable, whereas the global table is stored by value and is passed by address as &EG(symbol_table).
To better understand hash tables and variable scope, consider a simple example:
    <?php
    $temp = 'global';
    function test() {
        $temp = 'active';
    }
    test();
    var_dump($temp);
    ?>
The variable $temp created outside the function is added to the global symbol table, and a zval of string type with the value 'global' is allocated for it in that table's HashTable. The variable $temp created inside the function test is added to the symbol table that belongs to test, with its own string zval whose value is 'active'. The two variables are independent, so var_dump($temp) prints string(6) "global".
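The behaviour of the two symbol tables can be modelled in standalone C. This is a sketch under our own assumptions: symtab, executor, and call_test are hypothetical names, a small linear array replaces the real HashTable, and values are plain C strings rather than zvals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 8

/* Toy symbol table: a linear array instead of a hash table. */
typedef struct {
    const char *names[MAX_SYMBOLS];
    const char *values[MAX_SYMBOLS];
    int count;
} symtab;

/* Mirrors the two fields of _zend_executor_globals shown earlier. */
typedef struct {
    symtab symbol_table;         /* global scope */
    symtab *active_symbol_table; /* current scope */
} executor;

/* Insert or overwrite a name in one table; silently ignores overflow. */
static void symtab_set(symtab *t, const char *name, const char *value)
{
    int i;
    assert(t != NULL && name != NULL);
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            t->values[i] = value;
            return;
        }
    }
    if (t->count < MAX_SYMBOLS) {
        t->names[t->count] = name;
        t->values[t->count] = value;
        t->count++;
    }
}

/* Look up a name in one table; NULL when it is not defined there. */
static const char *symtab_get(const symtab *t, const char *name)
{
    int i;
    assert(t != NULL && name != NULL);
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            return t->values[i];
        }
    }
    return NULL;
}

/* Models calling test(): a fresh table becomes active, then is discarded. */
static void call_test(executor *eg)
{
    symtab locals;
    symtab *saved = eg->active_symbol_table;
    locals.count = 0;
    eg->active_symbol_table = &locals;
    symtab_set(eg->active_symbol_table, "temp", "active");
    eg->active_symbol_table = saved; /* leaving the function restores the scope */
}
```

Setting "temp" to "global" in the global table and then running call_test leaves the global entry untouched, because the assignment inside the call only ever sees the table that was active at that moment.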
7. Manipulating variables in PHP extensions
Creating a PHP variable
In an extension we can call the macro MAKE_STD_ZVAL(pzv) to create a variable that PHP code can use. MAKE_STD_ZVAL relies on the following macros:
    #define MAKE_STD_ZVAL(zv) \
        ALLOC_ZVAL(zv); \
        INIT_PZVAL(zv)
    #define ALLOC_ZVAL(z) \
        ZEND_FAST_ALLOC(z, zval, ZVAL_CACHE_LIST)
    #define ZEND_FAST_ALLOC(p, type, fc_type) \
        (p) = (type *) emalloc(sizeof(type))
    #define INIT_PZVAL(z) \
        (z)->refcount__gc = 1; \
        (z)->is_ref__gc = 0;
MAKE_STD_ZVAL(foo) expands to:
    (foo) = (zval *) emalloc(sizeof(zval));
    (foo)->refcount__gc = 1;
    (foo)->is_ref__gc = 0;
As you can see, MAKE_STD_ZVAL does three things: it allocates memory and initializes the refcount and is_ref fields of the zval structure.
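The same three steps can be written as an ordinary function for experimentation outside the engine. This is a sketch with assumptions: sketch_zval is a reduced stand-in for zval, malloc and free replace ZE's emalloc and efree, and sketch_make_std_zval and sketch_zval_release are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced stand-in for zval; a long replaces the zvalue_value union. */
typedef struct {
    long value;
    unsigned int refcount__gc;
    unsigned char type;
    unsigned char is_ref__gc;
} sketch_zval;

/* Allocate a container and initialize its bookkeeping fields.
 * Like the macro, it leaves type and value unset. Returns NULL on failure. */
static sketch_zval *sketch_make_std_zval(void)
{
    sketch_zval *zv = (sketch_zval *) malloc(sizeof(sketch_zval));
    if (zv == NULL) {
        return NULL;
    }
    zv->refcount__gc = 1; /* one owner: the creator */
    zv->is_ref__gc = 0;   /* not a reference */
    return zv;
}

/* Drop one owner and free the container once nobody holds it. */
static void sketch_zval_release(sketch_zval *zv)
{
    assert(zv != NULL && zv->refcount__gc > 0);
    if (--zv->refcount__gc == 0) {
        free(zv);
    }
}
```

Because the type and value are still undefined after creation, a caller is expected to set them right away, which is what the macros in the next table are for.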
The kernel provides macros that simplify these operations, setting the type and value of a zval in a single step.
API macros for accessing a zval
Macro | Implementation
ZVAL_NULL(pzv) | Z_TYPE_P(pzv) = IS_NULL;
ZVAL_BOOL(pzv, b) | Z_TYPE_P(pzv) = IS_BOOL; Z_BVAL_P(pzv) = b ? 1 : 0;
ZVAL_TRUE(pzv) | ZVAL_BOOL(pzv, 1);
ZVAL_FALSE(pzv) | ZVAL_BOOL(pzv, 0);
ZVAL_LONG(pzv, l) (l is the value) | Z_TYPE_P(pzv) = IS_LONG; Z_LVAL_P(pzv) = l;
ZVAL_DOUBLE(pzv, d) | Z_TYPE_P(pzv) = IS_DOUBLE; Z_DVAL_P(pzv) = d;
ZVAL_STRINGL(pzv, str, len, dup) | Z_TYPE_P(pzv) = IS_STRING; Z_STRLEN_P(pzv) = len; if (dup) { Z_STRVAL_P(pzv) = estrndup(str, len); } else { Z_STRVAL_P(pzv) = str; }
ZVAL_STRING(pzv, str, dup) | ZVAL_STRINGL(pzv, str, strlen(str), dup);
ZVAL_RESOURCE(pzv, res) | Z_TYPE_P(pzv) = IS_RESOURCE; Z_RESVAL_P(pzv) = res;
The dup parameter in ZVAL_STRINGL(pzv, str, len, dup)
Consider ZVAL_STRINGL(pzv, str, len, dup) first. The str and len parameters are easy to understand, since we know the kernel stores a string as its address plus its length. The meaning of dup is just as simple: it indicates whether the string needs to be copied. With a value of 1, new memory is allocated first, the string is copied into it, and the address of the new memory is assigned to pzv; with 0, the address of str is assigned to the zval directly.
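The effect of dup is easiest to see by changing the source buffer afterwards. The sketch below assumes a reduced string slot instead of a full zval; str_slot and str_slot_set are hypothetical names, and malloc stands in for the engine's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for zval.value.str. */
typedef struct {
    char *val;
    int len;
} str_slot;

/* With dup != 0, copy len bytes into new memory and add a trailing '\0';
 * with dup == 0, store the caller's pointer as is.
 * Returns 0 on success and -1 if the allocation fails. */
static int str_slot_set(str_slot *slot, char *str, int len, int dup)
{
    assert(slot != NULL && str != NULL && len >= 0);
    if (dup) {
        char *copy = (char *) malloc((size_t) len + 1);
        if (copy == NULL) {
            return -1;
        }
        memcpy(copy, str, (size_t) len);
        copy[len] = '\0';
        slot->val = copy;
    } else {
        slot->val = str;
    }
    slot->len = len;
    return 0;
}
```

A slot set with dup = 1 keeps its contents when the original buffer is modified or released, while a slot set with dup = 0 shares the buffer and sees every change to it. That is why the non-copying form is only safe when the memory outlives the variable and may be owned by it.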
The difference between ZVAL_STRINGL and ZVAL_STRING
If you want to cut the string off at a certain position, or you already know its length, you can use the macro ZVAL_STRINGL(zval, string, length, duplicate), which takes the string length explicitly instead of calling strlen(). Because the length is passed as a parameter, this macro is binary safe, and it is faster than ZVAL_STRING because it saves one strlen call.
ZVAL_RESOURCE is approximately equal to ZVAL_LONG
As described in section 4, the value of a resource in PHP is an integer, so ZVAL_RESOURCE works the same way as ZVAL_LONG, except that it sets the zval type to IS_RESOURCE.
8. Summary
PHP's weak typing is achieved through conversions of ZE's zval container, and variable names and zval data are stored through hash tables, at some cost in run-time efficiency. In addition, because variable types are converted implicitly, insufficient type checking during development can cause problems.
Nevertheless, language features such as weak typing, arrays, managed memory, and extensions make PHP well suited to web development: development is efficient and product iteration can be fast. In high-volume services, the bottleneck usually lies in the data access layer rather than in the language itself. In practice we use PHP not only for the logic and presentation layers; we have even used UDP and TCP servers written in PHP as an intermediate layer between data and cache.
"Ask the Bottom" Wang Shuai: in-depth PHP kernel (i)--research on the principle of weakly typed variables