[Translation] How PHP Variables Are Implemented (PHP's Source Code for PHP Developers, Part 3)

Article from: http://www.aintnot.com/2016/02/12/phps-source-code-for-php-developers-part3-variables-ch

Original: http://blog.ircmaxell.com/2012/03/phps-source-code-for-php-developers_21.html

This is the third article in the "PHP's Source Code for PHP Developers" series, and it builds on the previous articles to help you understand how PHP works internally. The first article showed how to browse PHP's source code and how it is structured, and introduced the basics of C pointers for PHP developers. The second article covered functions. This time, we're going to dive into one of the most useful constructs in PHP: variables.

Enter the Zval

In PHP's core, variables are called zvals. There's a reason this structure is so important, and it isn't only that PHP is weakly typed while C is strictly typed. So how does the zval solve that problem? To answer this, we need to look carefully at the definition of the zval type. Let's try searching for zval in the definition search box on the LXR page. At first glance, we don't seem to find anything useful. But there is a typedef line in zend.h (a typedef in C is a way to define a new data type). That may be what we're looking for, so let's take a look. As it turns out, it doesn't look like much; there's nothing useful there. But just to be sure, let's click on the _zval_struct line.

    struct _zval_struct {
        /* Variable information */
        zvalue_value value;       /* value */
        zend_uint refcount__gc;
        zend_uchar type;          /* active type */
        zend_uchar is_ref__gc;
    };

And there we have the foundation of PHP, the zval. It looks simple, doesn't it? It is, but there is some magic in it that is worth understanding. Note that this is a struct (short for structure). Basically, you can think of it as a PHP class that has only public properties. Here we have four properties: value, refcount__gc, type and is_ref__gc. Let's look at them one by one (though not in that order).

Value

The first element we'll talk about is value, which has the type zvalue_value. I don't know about you, but I had never heard of zvalue_value before. So let's try to figure out what it is. As elsewhere on the site, you can click on a type to see its definition. Once you do, you'll see a definition like the following:

    typedef union _zvalue_value {
        long lval;                /* long value */
        double dval;              /* double value */
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;            /* hash table value */
        zend_object_value obj;
    } zvalue_value;

Now, there's some black magic going on here. See the union keyword in the definition? It means that this isn't really a struct but a single type. Yet there are variables of several types inside it. If it contains multiple types, how can it be a single type? I'm glad you asked. To understand this, we need to recall what we said about types in C in the first article.

In C, a variable is simply a label for a range of memory addresses. You could also say that a type is just a way of identifying how a piece of memory will be used. Nothing in C separates a 4-byte string from an integer value; both are just a chunk of memory. The compiler tries to sort this out by treating memory segments as variables and converting those variables to specific types, but this does not always succeed (by the way, when a variable "overwrites" memory beyond the segment it was given, the result can be a segmentation fault).

So, as far as we know, a union is a single type that is interpreted in different ways depending on how it is accessed. This lets us define one value that supports multiple types. One thing to note is that all of the types' data must be stored in the same piece of memory. In this example, with a 64-bit compiler, a long and a double each take 64 bits. The string structure takes 96 bits (64 bits for the character pointer and 32 bits for the integer length). The HashTable pointer takes 64 bits, and zend_object_value takes 96 bits (32 bits for the handle and the remaining 64 bits for a pointer). The whole union takes as much memory as its largest member, so here that is 96 bits (plus any alignment padding the compiler adds).
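
To see the shared-storage idea in isolation, here is a minimal sketch that assumes a simplified stand-in for the union. The names demo_value and demo_object_value are hypothetical and only mimic the layout described above; they are not PHP's own definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for zend_object_value: a 32-bit handle plus a pointer. */
typedef struct {
    unsigned int handle;
    const void *handlers;
} demo_object_value;

/* Hypothetical, simplified copy of the zvalue_value layout. */
typedef union {
    long lval;
    double dval;
    struct {
        char *val;
        int len;
    } str;
    void *ht;                 /* stands in for HashTable * */
    demo_object_value obj;
} demo_value;

/* The union is as large as its largest member, plus any padding. */
size_t demo_value_size(void)
{
    return sizeof(demo_value);
}

/* Every member starts at the same address, so writing one overwrites the others. */
int demo_members_share_storage(void)
{
    demo_value v;

    return (void *)&v.lval == (void *)&v.dval
        && (void *)&v.dval == (void *)&v.str
        && (void *)&v.str == (void *)&v.obj;
}
```

On a typical 64-bit compiler, demo_value_size() is likely to report 16 bytes (128 bits) rather than 12, because the members containing a pointer are padded to a multiple of 8 bytes. The exact number depends on the platform.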

Now, if we look at the union again, we can see that only five PHP data types are represented here (long == int, double == float, str == string, HashTable == array, zend_object_value == object). So where did the remaining data types go? As it turns out, this structure is enough to store them too. A bool is stored in the long (int) member, null needs no data at all, and a resource is stored in the long member as well.

Type

Because the value union does not control how it is accessed, we need some other way to record the type of the variable. That way, the data type tells us how to access the value. This is the job of the type byte (a zend_uchar is an unsigned char, or one byte of memory). It holds one of the Zend type constants. There's really no magic to it: setting zval.type = IS_LONG marks the data as an integer. So this field and the value field together are enough to tell us the type and the value of a PHP variable.
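
The pairing of a type byte with a union is often called a tagged union. The following is a minimal sketch of the idea, assuming a cut-down structure: demo_zval, the DEMO_* constants and demo_describe are hypothetical names, and the output format is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical type tags; PHP's real constants are IS_LONG, IS_DOUBLE and so on. */
enum demo_type { DEMO_NULL, DEMO_LONG, DEMO_DOUBLE, DEMO_BOOL };

typedef struct {
    union {
        long lval;
        double dval;
    } value;
    unsigned char type;       /* says which union member is valid */
} demo_zval;

/* Format the variable into buf by consulting the type tag first. */
int demo_describe(const demo_zval *z, char *buf, size_t len)
{
    switch (z->type) {
        case DEMO_LONG:
            return snprintf(buf, len, "int(%ld)", z->value.lval);
        case DEMO_DOUBLE:
            return snprintf(buf, len, "float(%g)", z->value.dval);
        case DEMO_BOOL:
            /* a bool reuses the long member */
            return snprintf(buf, len, "bool(%s)", z->value.lval ? "true" : "false");
        default:
            /* null needs no data at all */
            return snprintf(buf, len, "NULL");
    }
}
```

Reading value.dval while the tag says DEMO_LONG would still compile; it would just interpret the same bytes the wrong way, which is why the tag has to be checked first.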

Is_ref

This field identifies whether the variable is a reference, that is, whether you performed something like $foo = &$bar on it. If it is 0, the variable is not a reference; if it is 1, it is. It doesn't do much more than that. So, before we finish with _zval_struct, let's look at its fourth member.

Refcount

This field counts how many PHP variables point to this variable container. That is, if refcount is 1, one PHP variable uses the container. If refcount is 2, two PHP variables point to the same container. On its own, refcount doesn't tell us much, but together with is_ref it forms the basis of the garbage collector and of copy-on-write. It lets us use the same zval container for one or more PHP variables. The full semantics of refcount are beyond the scope of this article; if you want to go further, I recommend that you review this document.
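
As a rough illustration of how a counter and a reference flag can support copy-on-write, here is a deliberately simplified model. All names (demo_container, demo_assign, demo_write, demo_release) are hypothetical, and the real engine handles many more cases than this sketch does.

```c
#include <stdlib.h>

/* Hypothetical container: a value plus the two bookkeeping fields. */
typedef struct {
    long value;
    unsigned int refcount;
    unsigned char is_ref;
} demo_container;

/* Create a container used by exactly one name. Returns NULL if out of memory. */
demo_container *demo_new(long value)
{
    demo_container *c = malloc(sizeof *c);

    if (c == NULL) {
        return NULL;
    }
    c->value = value;
    c->refcount = 1;
    c->is_ref = 0;
    return c;
}

/* Like $b = $a: both names share one container, only the counter changes. */
demo_container *demo_assign(demo_container *src)
{
    src->refcount++;
    return src;
}

/* Write through one name. A shared, non-reference container is separated
 * first, so the other names keep the old value (copy-on-write). */
demo_container *demo_write(demo_container *c, long value)
{
    if (c->refcount > 1 && !c->is_ref) {
        c->refcount--;
        return demo_new(value);
    }
    c->value = value;
    return c;
}

/* Drop one name; free the container when nobody uses it any more. */
void demo_release(demo_container *c)
{
    if (--c->refcount == 0) {
        free(c);
    }
}
```

In this model the copy is postponed until the first write, so assigning a large value to a second name costs only an increment.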

That's all there is to the zval.

How does it work?

Inside PHP, zvals are passed to functions like any other C variable: as a memory segment, or as a pointer to a memory segment (or a pointer to a pointer, and so on). Once we have the variable, we want to access the data inside it. How do we do that? We use the macros defined in zend_operators.h, which make working with zvals much easier. It is important to note that each macro comes in several variants, which differ in their suffix. For example, to determine the type of a zval, there is the Z_TYPE(zval) macro, which returns an integer representing the type of its zval argument. But there is also a Z_TYPE_P(zval_p) macro, which does the same thing as Z_TYPE(zval) but takes a pointer to a zval. Apart from the kind of argument they take, the two are identical. In fact, we could simply write Z_TYPE(*zval_p), but the _P and _PP variants make things easier.

We can use the VAL family of macros to get the value of a zval. Call Z_LVAL(zval) to get the integer value (for integers and resources), or Z_DVAL(zval) to get the floating-point value. There are many others as well. The key point is that to get the value of a zval in C, you need to (or at least should) use a macro. So when we see a function using them, we know that it is extracting a value from a zval.
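
The layering of the plain, _P and _PP variants can be sketched with a miniature structure. The DEMO_* macros and demo_zval below are hypothetical names that only imitate the pattern; the real macros live in zend_operators.h and cover many more members.

```c
/* Hypothetical miniature zval; only the long and double members are modelled. */
typedef struct {
    union {
        long lval;
        double dval;
    } value;
    unsigned char type;
} demo_zval;

#define DEMO_IS_LONG   1
#define DEMO_IS_DOUBLE 2

/* Same layering idea as Z_TYPE, Z_TYPE_P and Z_TYPE_PP. */
#define DEMO_TYPE(z)       ((z).type)
#define DEMO_TYPE_P(z_p)   DEMO_TYPE(*(z_p))
#define DEMO_TYPE_PP(z_pp) DEMO_TYPE(**(z_pp))

/* Same idea for the value accessors. */
#define DEMO_LVAL(z)       ((z).value.lval)
#define DEMO_LVAL_P(z_p)   DEMO_LVAL(*(z_p))
#define DEMO_DVAL(z)       ((z).value.dval)

/* Read a long through a pointer to a pointer; returns 0 for any other type. */
long demo_read_long(demo_zval **arg)
{
    if (DEMO_TYPE_PP(arg) == DEMO_IS_LONG) {
        return DEMO_LVAL_P(*arg);
    }
    return 0;
}
```

Because each macro expands to a plain member access, it can be used on either side of an assignment, and the calling code never has to spell out the structure layout.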

So, what about the type?

So far, we've only talked about the type and the value of a zval. As we all know, PHP juggles types for us. So, if we like, we can treat a string as an integer. This is done by the convert_to_type family of functions. To convert a zval to a string, we call the convert_to_string function, which changes the type of the zval we pass to it. So, if you see a function calling one of these, you know that it is converting the data type of its argument.
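
To make "changes the type of the zval we pass to it" concrete, here is a minimal sketch of an in-place conversion from integer to string. The names demo_zval and demo_convert_to_string are hypothetical, and the sketch assumes a structure with only two members; it is not how PHP's own convert_to_string is written.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { DEMO_IS_LONG = 1, DEMO_IS_STRING = 2 };

typedef struct {
    union {
        long lval;
        struct {
            char *val;
            int len;
        } str;
    } value;
    unsigned char type;
} demo_zval;

/* Convert in place: afterwards the same zval holds a heap-allocated string.
 * Returns 0 on success and -1 if memory could not be allocated. */
int demo_convert_to_string(demo_zval *z)
{
    char buf[32];
    int len;
    char *copy;

    if (z->type == DEMO_IS_STRING) {
        return 0;                       /* already a string */
    }
    len = snprintf(buf, sizeof buf, "%ld", z->value.lval);
    copy = malloc((size_t)len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, buf, (size_t)len + 1); /* includes the terminating NUL */
    z->value.str.val = copy;            /* overwrites the bytes that held lval */
    z->value.str.len = len;
    z->type = DEMO_IS_STRING;
    return 0;
}
```

Note that the integer is gone after the call: the union bytes now hold the string pointer and length, and the type tag is updated to match.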

zend_parse_parameters

The zend_parse_parameters function was introduced in the previous article. Now that we know how PHP variables are represented in C, let's take a closer look at it.

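    ZEND_API int zend_parse_parameters(int num_args TSRMLS_DC, const char *type_spec, ...)
    {
        va_list va;
        int retval;

        RETURN_IF_ZERO_ARGS(num_args, type_spec, 0);

        va_start(va, type_spec);
        retval = zend_parse_va_args(num_args, type_spec, &va, 0 TSRMLS_CC);
        va_end(va);

        return retval;
    }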

Now, on the surface this looks confusing. The point to understand is that va_list is just the type used for a variable argument list declared with '...'. It is therefore similar to func_get_args() in PHP. With that in mind, we can see that zend_parse_parameters immediately calls zend_parse_va_args. Let's move on and look at that function...
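
The same wrapper-plus-worker shape can be tried out with plain standard C. In this minimal sketch, demo_fill and demo_fill_va are hypothetical functions: the wrapper starts the argument list and hands a pointer to it to the worker, which pulls one long pointer per expected argument and writes a result through it.

```c
#include <stdarg.h>

/* Worker: consumes `count` long pointers from an already started va_list. */
static int demo_fill_va(int count, long start, va_list *va)
{
    int i;

    for (i = 0; i < count; i++) {
        long *target = va_arg(*va, long *);
        *target = start + i;
    }
    return count;
}

/* Public wrapper: starts the list, delegates to the worker, and cleans up. */
int demo_fill(int count, long start, ...)
{
    va_list va;
    int retval;

    va_start(va, start);
    retval = demo_fill_va(count, start, &va);
    va_end(va);

    return retval;
}
```

A call such as demo_fill(2, 10, &a, &b) stores 10 in a and 11 in b. The caller has to pass as many pointers as count promises, because C has no way to check this at run time.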

This function looks very interesting. At first glance, it seems to do a lot. But take a closer look. First, we can see a for loop. It iterates over the type_spec string passed in from zend_parse_parameters. Inside the loop, we can see that it just counts the number of parameters it expects to receive. Studying exactly how it does this is left to the reader.
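
For readers who would like a starting point, here is a rough sketch of such a counting loop. It is a simplification and not the real code: demo_count_args is a hypothetical function that assumes every letter stands for one parameter, that '|' marks the start of the optional parameters, and that '/' and '!' are modifiers that add no parameter. Var-args markers are not handled at all.

```c
#include <ctype.h>

/* Count required and total parameters in a spec such as "ls|b".
 * Returns 0 on success and -1 on a character this sketch does not support. */
int demo_count_args(const char *type_spec, int *min_args, int *max_args)
{
    const char *walk;
    int min = -1;
    int max = 0;

    for (walk = type_spec; *walk; walk++) {
        if (isalpha((unsigned char)*walk)) {
            max++;                      /* one type letter, one parameter */
        } else if (*walk == '|') {
            min = max;                  /* everything after this is optional */
        } else if (*walk == '/' || *walk == '!') {
            continue;                   /* modifier, no extra parameter */
        } else {
            return -1;
        }
    }
    *min_args = (min < 0) ? max : min;  /* no '|' means all are required */
    *max_args = max;
    return 0;
}
```

With the spec "ls|b" this reports two required parameters and three in total, which is the information the later argument-count check needs.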

Looking further down, we can see some sanity checks (checking that the parameters were passed correctly) and an error check that enough parameters were passed. Next comes the loop we're really interested in, the one that actually parses the parameters. Inside the loop there are three if statements. The first handles the optional-parameter marker. The second handles var-args (a variable number of parameters). The third is exactly what we're interested in: as you can see, it calls zend_parse_arg(). Let's take a closer look at that function...

Looking further down, we can see something very interesting. This function calls another function (zend_parse_arg_impl) and then deals with some error information. This is a very common pattern in PHP: the error handling is pulled out into the parent function. It keeps the implementation and the error handling separate and makes both as reusable as possible. You can dig further into that function yourself; it is very easy to understand. For now, let's take a closer look at zend_parse_arg_impl()...
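
The split between a worker that only says what it expected and a parent that builds the message can be sketched as follows. The functions demo_parse_long_impl and demo_parse_long are hypothetical, and the wording of the error message is made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Worker: returns NULL on success, or the name of the expected type on failure. */
static const char *demo_parse_long_impl(const char *input, long *out)
{
    char *end;
    long value = strtol(input, &end, 10);

    if (end == input || *end != '\0') {
        return "long";
    }
    *out = value;
    return NULL;
}

/* Parent: owns the error message, so every worker can stay small. */
int demo_parse_long(int arg_num, const char *input, long *out,
                    char *err, size_t err_len)
{
    const char *expected = demo_parse_long_impl(input, out);

    if (expected != NULL) {
        snprintf(err, err_len, "expects parameter %d to be %s", arg_num, expected);
        return -1;
    }
    return 0;
}
```

Adding a worker for another type would only require returning a different type name; the message formatting stays in one place.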

Now we've really arrived at the place where PHP's internal functions parse their parameters. Let's look at the first branch of the switch statement, which parses integer parameters. What follows should be easy to understand. So, let's start with the first line of the branch:

    long *p = va_arg(*va, long *);

If you remember what we said earlier, va_arg is how C handles variable arguments. So this line defines a pointer to an integer (long is an integer type in C) and gets its value from the va_arg function. In other words, it fetches a pointer that was passed as an argument to zend_parse_parameters. This is the pointer through which the result will be assigned once the branch finishes. Next, we can see a switch on the type of the variable (zval) that was passed in. Let's look at the IS_STRING branch first (this is what runs when a string is passed to an integer parameter).

    case IS_STRING:
        {
            double d;
            int type;

            if ((type = is_numeric_string(Z_STRVAL_PP(arg), Z_STRLEN_PP(arg), p, &d, -1)) == 0) {
                return "long";
            } else if (type == IS_DOUBLE) {
                if (c == 'L') {
                    if (d > LONG_MAX) {
                        *p = LONG_MAX;
                        break;
                    } else if (d < LONG_MIN) {
                        *p = LONG_MIN;
                        break;
                    }
                }

                *p = zend_dval_to_lval(d);
            }
        }
        break;

Now, this isn't as much work as it looks. Everything comes down to the is_numeric_string function. In general, it checks whether the string contains only numeric characters and returns 0 if it does not. If it does, it parses the string into a variable (an integer or a float, p or d) and returns the data type. So we can see that if the string is not purely numeric, the branch returns the string "long". That string is used by the wrapping error-handling function. Otherwise, if the string represents a double (float), it checks whether the floating-point number is too large to be stored as an integer (when the spec character is an uppercase 'L') and then uses the zend_dval_to_lval function to convert the float to an integer. That's all there is to it: we have parsed our string argument. Now let's look at the other branches:
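
The "try an integer, fall back to a float, otherwise report not numeric" behaviour described above can be approximated with the standard library. This sketch follows the description in the text, not the exact rules of the real is_numeric_string: demo_numeric_string is a hypothetical function, and strtol and strtod accept some inputs (leading whitespace, "inf", hexadecimal floats) that PHP may treat differently.

```c
#include <errno.h>
#include <stdlib.h>

enum { DEMO_NOT_NUMERIC = 0, DEMO_IS_LONG = 1, DEMO_IS_DOUBLE = 2 };

/* Returns 0 if str is not numeric. Otherwise stores the result in *lval or
 * *dval and returns the matching type tag. */
int demo_numeric_string(const char *str, long *lval, double *dval)
{
    char *end;
    long l;
    double d;

    /* First try: the whole string is an integer that fits in a long. */
    errno = 0;
    l = strtol(str, &end, 10);
    if (end != str && *end == '\0' && errno != ERANGE) {
        *lval = l;
        return DEMO_IS_LONG;
    }

    /* Second try: the whole string is a floating-point number. */
    d = strtod(str, &end);
    if (end != str && *end == '\0') {
        *dval = d;
        return DEMO_IS_DOUBLE;
    }
    return DEMO_NOT_NUMERIC;
}
```

An integer that is too large for a long fails the first attempt and comes back as a double, which is the situation the range check in the IS_DOUBLE path is there for.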

    case IS_DOUBLE:
        if (c == 'L') {
            if (Z_DVAL_PP(arg) > LONG_MAX) {
                *p = LONG_MAX;
                break;
            } else if (Z_DVAL_PP(arg) < LONG_MIN) {
                *p = LONG_MIN;
                break;
            }
        }
    case IS_NULL:
    case IS_LONG:
    case IS_BOOL:
        convert_to_long_ex(arg);
        *p = Z_LVAL_PP(arg);
        break;

Here we can see the handling of floating-point numbers, which looks a lot like the handling of floats parsed from a string (coincidence?). One important thing to note is that if the parameter's spec character is not an uppercase 'L', the double is treated the same way as the other types (the case has no break, so it falls through). Now we also have an interesting function, convert_to_long_ex(). It belongs to the convert_to_type() family we talked about earlier, which converts its argument to a specific type. The only difference is that it separates (copies) the passed-in variable if it is not a reference (since it is changing the type). This is copy-on-write at work. So when we pass a float to a non-reference integer parameter, the function treats it as an integer, but we still have our float.
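
The clamp-then-fall-through structure can be reproduced on a miniature zval. This is a simplified sketch under several assumptions: all names are hypothetical, the argument is only read and never separated or converted in place, the upper bound is written as -(double)LONG_MIN so that the cast to long is always in range, and a double that does not fit in a long is turned into 0 by the helper.

```c
#include <limits.h>
#include <stddef.h>

enum { DEMO_IS_NULL, DEMO_IS_LONG, DEMO_IS_DOUBLE, DEMO_IS_BOOL, DEMO_IS_ARRAY };

typedef struct {
    union {
        long lval;
        double dval;
    } value;
    unsigned char type;
} demo_zval;

/* Assumption of this sketch: doubles that do not fit in a long become 0. */
static long demo_dval_to_lval(double d)
{
    if (!(d >= (double)LONG_MIN && d < -(double)LONG_MIN)) {
        return 0;
    }
    return (long)d;
}

/* Read any scalar as a long without modifying it. */
static long demo_to_long(const demo_zval *z)
{
    switch (z->type) {
        case DEMO_IS_DOUBLE:
            return demo_dval_to_lval(z->value.dval);
        case DEMO_IS_LONG:
        case DEMO_IS_BOOL:
            return z->value.lval;
        default:
            return 0;                   /* null converts to 0 */
    }
}

/* Returns NULL on success, or the expected type name on failure. */
const char *demo_parse_long_arg(const demo_zval *arg, char c, long *p)
{
    switch (arg->type) {
        case DEMO_IS_DOUBLE:
            if (c == 'L') {
                if (arg->value.dval >= -(double)LONG_MIN) {
                    *p = LONG_MAX;      /* clamp instead of wrapping */
                    break;
                } else if (arg->value.dval < (double)LONG_MIN) {
                    *p = LONG_MIN;
                    break;
                }
            }
            /* in-range doubles are handled like the other scalar types */
            /* fall through */
        case DEMO_IS_NULL:
        case DEMO_IS_LONG:
        case DEMO_IS_BOOL:
            *p = demo_to_long(arg);
            break;
        default:
            return "long";              /* arrays and other types are rejected */
    }
    return NULL;
}
```

With 'L', a double of 1e30 is clamped to LONG_MAX; with 'l', a double of 3.9 falls through and is truncated to 3.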

    case IS_ARRAY:
    case IS_OBJECT:
    case IS_RESOURCE:
    default:
        return "long";

Finally, we have three more case branches. We can see that if you pass an array, an object, a resource or any other unknown type to an integer parameter, you get an error.

The rest of the function we leave to the reader. Reading through zend_parse_arg_impl is really useful for getting a better understanding of PHP's type-juggling system. Read it part by part, and try to keep track of the state and type of the various arguments in C.

Next section

The next part will be on NikiC's blog (we'll be jumping back and forth throughout this series). In the next article, he covers everything about arrays.
