In-depth understanding of the PHP kernel (6): function definition, parameter passing, and return values

Source: Internet
Author: User

I. Function Definition

A user-defined function definition starts with the function keyword, as follows:

function foo($var) {
    echo $var;
}

  1. Lexical Analysis

In Zend/zend_language_scanner.l, we find the following code:

<ST_IN_SCRIPTING>"function" {
    return T_FUNCTION;
}

This rule means that the function keyword produces the T_FUNCTION token. Once the token has been obtained, syntax analysis begins.

  2. Syntax Analysis

In the Zend/zend_language_parser.y file, the grammar rules for a function declaration are as follows:

function:
    T_FUNCTION { $$.u.opline_num = CG(zend_lineno); }
;

is_reference:
        /* empty */ { $$.op_type = ZEND_RETURN_VAL; }
    |   '&'         { $$.op_type = ZEND_RETURN_REF; }
;

unticked_function_declaration_statement:
        function is_reference T_STRING {
            zend_do_begin_function_declaration(&$1, &$3, 0, $2.op_type, NULL TSRMLS_CC); }
            '(' parameter_list ')' '{' inner_statement_list '}' {
                zend_do_end_function_declaration(&$1 TSRMLS_CC); }
;

Focus on "function is_reference T_STRING": these three symbols stand for the function keyword, whether the function returns by reference, and the function name.

The T_FUNCTION token only locates the declaration and signals that a function is being defined. Most of the work concerns the function itself, including its parameters and return value.

  3. Generate intermediate code

From the grammar rules we can see that the compile function is zend_do_begin_function_declaration. Its implementation in the Zend/zend_compile.c file is as follows:

void zend_do_begin_function_declaration(znode *function_token, znode *function_name,
        int is_method, int return_reference, znode *fn_flags_znode TSRMLS_DC) /* {{{ */
{
    ... // omitted
    function_token->u.op_array = CG(active_op_array);
    lcname = zend_str_tolower_dup(name, name_len);

    orig_interactive = CG(interactive);
    CG(interactive) = 0;
    init_op_array(&op_array, ZEND_USER_FUNCTION, INITIAL_OP_ARRAY_SIZE TSRMLS_CC);
    CG(interactive) = orig_interactive;

    ... // omitted

    if (is_method) {
        ... // omitted; class methods are described in later sections
    } else {
        zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

        opline->opcode = ZEND_DECLARE_FUNCTION;
        opline->op1.op_type = IS_CONST;
        build_runtime_defined_function_key(&opline->op1.u.constant, lcname, name_len TSRMLS_CC);
        opline->op2.op_type = IS_CONST;
        opline->op2.u.constant.type = IS_STRING;
        opline->op2.u.constant.value.str.val = lcname;
        opline->op2.u.constant.value.str.len = name_len;
        Z_SET_REFCOUNT(opline->op2.u.constant, 1);
        opline->extended_value = ZEND_DECLARE_FUNCTION;
        zend_hash_update(CG(function_table), opline->op1.u.constant.value.str.val,
            opline->op1.u.constant.value.str.len, &op_array, sizeof(zend_op_array),
            (void **) &CG(active_op_array));
    }
}
/* }}} */

The generated opcode is ZEND_DECLARE_FUNCTION. From this opcode and the op_type of its operands, we can determine that the handler executing it is ZEND_DECLARE_FUNCTION_SPEC_HANDLER.

Note that the function name is converted to lowercase when the intermediate code is generated, which means that function names are not case-sensitive.

To verify this, consider the following code:

function T() {
    echo 1;
}

function t() {
    echo 2;
}

Running it produces:

Fatal error: Cannot redeclare t() (previously declared in ...)

This shows that PHP treats T and t as the same function name. But where does the check for duplicate function names take place?

  4. Execute intermediate code

The handler for the ZEND_DECLARE_FUNCTION opcode, ZEND_DECLARE_FUNCTION_SPEC_HANDLER, is in the Zend/zend_vm_execute.h file. It does nothing more than call do_bind_function:

do_bind_function(EX(opline), EG(function_table), 0);

do_bind_function adds the function referred to by EX(opline) to EG(function_table) and checks whether a function with the same name already exists. If one does, an error is reported. EG(function_table) stores the information of all functions available during execution and acts as the function registry. It is a HashTable, so do_bind_function adds the new function with the HashTable operation zend_hash_add.
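The two behaviours described so far, lowercasing the name at compile time and refusing to add a name that is already registered, can be illustrated with a small stand-alone sketch. Everything here (func_table, lower_copy, bind_function, the fixed sizes) is hypothetical and only mimics the add-if-absent behaviour of zend_hash_add; it is not the engine's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 16
#define MAX_NAME  32

/* Hypothetical stand-in for EG(function_table): a tiny fixed-size table. */
typedef struct {
    char names[MAX_FUNCS][MAX_NAME];
    size_t count;
} func_table;

/* Copy `name` into `out` in lowercase, truncating to MAX_NAME - 1 chars. */
static void lower_copy(char *out, const char *name)
{
    size_t i;

    for (i = 0; name[i] != '\0' && i < MAX_NAME - 1; i++) {
        out[i] = (char)tolower((unsigned char)name[i]);
    }
    out[i] = '\0';
}

/* Add-only insert: returns 0 on success, -1 if the lowercased name is
 * already registered or the table is full. */
static int bind_function(func_table *table, const char *name)
{
    char key[MAX_NAME];
    size_t i;

    assert(table != NULL && name != NULL);
    lower_copy(key, name);

    for (i = 0; i < table->count; i++) {
        if (strcmp(table->names[i], key) == 0) {
            return -1; /* duplicate: the engine would report "Cannot redeclare" */
        }
    }
    if (table->count == MAX_FUNCS) {
        return -1;
    }
    strcpy(table->names[table->count++], key);
    return 0;
}
```

With a zero-initialised table, registering "T" succeeds and a later attempt to register "t" fails, which mirrors the redeclaration error shown earlier.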


II. Function Parameters

Defining a function, as described above, is just the process of registering the function name in the function table. The following sections look at how parameters are handled.

  1. User-Defined Function Parameters

As we know, the parameters of a user-defined function are processed by the zend_do_receive_arg function. The key parameter-related code in this function is as follows:

CG(active_op_array)->arg_info = erealloc(CG(active_op_array)->arg_info,
        sizeof(zend_arg_info)*(CG(active_op_array)->num_args));
cur_arg_info = &CG(active_op_array)->arg_info[CG(active_op_array)->num_args-1];
cur_arg_info->name = estrndup(varname->u.constant.value.str.val,
        varname->u.constant.value.str.len);
cur_arg_info->name_len = varname->u.constant.value.str.len;
cur_arg_info->array_type_hint = 0;
cur_arg_info->allow_null = 1;
cur_arg_info->pass_by_reference = pass_by_reference;
cur_arg_info->class_name = NULL;
cur_arg_info->class_name_len = 0;

Parameter information is recorded by filling in the arg_info field of the intermediate code (the op_array). The arg_info field is the key; its structure is as follows:

typedef struct _zend_arg_info {
    const char *name;               /* parameter name */
    zend_uint name_len;             /* parameter name length */
    const char *class_name;         /* class name */
    zend_uint class_name_len;       /* class name length */
    zend_bool array_type_hint;      /* array type hint */
    zend_bool allow_null;           /* whether NULL is allowed */
    zend_bool pass_by_reference;    /* whether passed by reference */
    zend_bool return_reference;
    int required_num_args;
} zend_arg_info;

The difference between passing a parameter by value and passing it by reference is recorded in the pass_by_reference field when the intermediate code is generated.

As for the number of parameters, the num_args field of the intermediate code is incremented by 1 each time zend_do_receive_arg runs, as in the following code:

CG(active_op_array)->num_args++;

The index of the current parameter is therefore CG(active_op_array)->num_args-1, as in the following code:

cur_arg_info = &CG(active_op_array)->arg_info[CG(active_op_array)->num_args-1];
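The grow-then-fill pattern above can be sketched outside the engine as follows. The types and names (arg_info, op_array, receive_arg, free_args) are simplified, hypothetical stand-ins, and plain realloc/malloc replace erealloc/estrndup. Unlike the engine code, this sketch increments num_args only after both allocations succeed, so a failure leaves the array consistent.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical counterpart of zend_arg_info. */
typedef struct {
    char *name;
    int pass_by_reference;
} arg_info;

/* Simplified, hypothetical counterpart of the op_array fields involved. */
typedef struct {
    arg_info *args;
    unsigned num_args;
} op_array;

/* Append one parameter description; returns 0 on success, -1 on
 * allocation failure (in which case `oa` is left unchanged). */
static int receive_arg(op_array *oa, const char *name, int by_ref)
{
    size_t len;
    char *copy;
    arg_info *grown;
    arg_info *cur;

    assert(oa != NULL && name != NULL);

    len = strlen(name);
    copy = malloc(len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, name, len + 1);

    /* Grow the array by exactly one slot, as the engine does per parameter. */
    grown = realloc(oa->args, sizeof(arg_info) * (oa->num_args + 1));
    if (grown == NULL) {
        free(copy);
        return -1;
    }
    oa->args = grown;
    oa->num_args++;

    /* The new parameter always lives at index num_args - 1. */
    cur = &oa->args[oa->num_args - 1];
    cur->name = copy;
    cur->pass_by_reference = by_ref;
    return 0;
}

/* Release every name and the array itself. */
static void free_args(op_array *oa)
{
    unsigned i;

    for (i = 0; i < oa->num_args; i++) {
        free(oa->args[i].name);
    }
    free(oa->args);
    oa->args = NULL;
    oa->num_args = 0;
}
```

Calling receive_arg once per declared parameter, in order, leaves num_args equal to the parameter count and the most recent parameter at the last index.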

The analysis above covers the parameters declared when a function is defined, which are fixed. In practice, a variable number of arguments may also be needed, and in that case the func_num_args and func_get_args functions are used. Both are internal functions, so their implementations are in the Zend/zend_builtin_functions.c file. Let's look at func_num_args first:

/* {{{ proto int func_num_args(void)
   Get the number of arguments that were passed to the function */
ZEND_FUNCTION(func_num_args)
{
    zend_execute_data *ex = EG(current_execute_data)->prev_execute_data;

    if (ex && ex->function_state.arguments) {
        RETURN_LONG((long)(zend_uintptr_t)*(ex->function_state.arguments));
    } else {
        zend_error(E_WARNING, "func_num_args():  Called from the global scope - no function context");
        RETURN_LONG(-1);
    }
}
/* }}} */

If ex->function_state.arguments exists, that is, the call was made from inside a function, the value it points to is converted and returned. Otherwise a warning is raised and -1 is returned. The key here is EG(current_execute_data), which holds the data of the code currently being executed. What we need, however, is the data of the previous execution context. Why? Because func_num_args is itself called after the user function has been entered, so the data of the user function belongs to the previous execution context. Hence the code reads:

zend_execute_data *ex = EG(current_execute_data)->prev_execute_data;
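A minimal sketch of this "look one frame back" idea, using a hypothetical frame struct rather than the real zend_execute_data, might look like this. The has_args flag plays the role of the function_state.arguments check and is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified execution frame. */
typedef struct frame {
    struct frame *prev; /* frame that was executing when this one started */
    int has_args;       /* 0 for the global scope, which has no argument list */
    int arg_count;      /* number of arguments passed to this frame */
} frame;

/* `current` is the frame of the built-in itself; the arguments of interest
 * belong to the function that called it, one step back in the chain.
 * Returns -1 when there is no enclosing function. */
static int num_args_of_caller(const frame *current)
{
    const frame *ex;

    assert(current != NULL);
    ex = current->prev;
    if (ex != NULL && ex->has_args) {
        return ex->arg_count;
    }
    return -1;
}
```

When the built-in's frame sits directly on top of the global frame, the function returns -1, matching the warning path in func_num_args.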


  2. Internal Function Parameters

Take the commonly used count function as an example. Its parameter-handling code is as follows:

/* {{{ proto int count(mixed var [, int mode])
   Count the number of elements in a variable (usually an array) */
PHP_FUNCTION(count)
{
    zval *array;
    long mode = COUNT_NORMAL;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z|l", &array, &mode) == FAILURE) {
        return;
    }
    ... // omitted
}

Two operations are involved here: obtaining the number of parameters, and parsing the parameter list.

(1) Number of parameters

The number of parameters is obtained through the ZEND_NUM_ARGS() macro, which is defined as follows:

#define ZEND_NUM_ARGS()     (ht)

Here ht is the ht parameter in the INTERNAL_FUNCTION_PARAMETERS macro, defined in the Zend/zend.h file as follows:

#define INTERNAL_FUNCTION_PARAMETERS int ht, zval *return_value, zval **return_value_ptr, zval *this_ptr, int return_value_used TSRMLS_DC

(2) Parsing the parameter list

PHP internal functions use zend_parse_parameters to parse their parameters. It greatly simplifies receiving and processing parameters, although it is somewhat weak at handling a variable number of arguments.

Its declaration is as follows:

ZEND_API int zend_parse_parameters(int num_args TSRMLS_DC, char *type_spec, ...)
  • The first parameter, num_args, is the number of parameters to receive. ZEND_NUM_ARGS() is usually passed here.
  • The second parameter should be the macro TSRMLS_CC.
  • The third parameter, type_spec, is a string that specifies the types of the parameters we expect to receive. It is similar to the format string of printf.
  • The remaining parameters are pointers to the variables that receive the PHP parameter values.

While parsing, zend_parse_parameters() does its best to convert each parameter to the requested type, so the caller always gets variables of the expected type.
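To make the printf-like design concrete, here is a small self-contained sketch of a variadic parser driven by a type specification string. It is not the Zend implementation: the value type, the parse_params function, and the reduced specifier set ('l' for long, 'd' for double, '|' to mark the following parameters as optional) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical, simplified dynamic value (a stand-in for zval). */
typedef enum { V_LONG, V_DOUBLE } vtype;

typedef struct {
    vtype type;
    union { long l; double d; } u;
} value;

/* Parse `num_args` values according to `spec`, writing each result through
 * the matching pointer in the variadic list. Optional parameters that were
 * not supplied leave their destination untouched, so callers can preset
 * defaults. Returns 0 on success, -1 on failure. */
static int parse_params(int num_args, const value *args, const char *spec, ...)
{
    va_list ap;
    const char *p;
    int i = 0;         /* index of the next argument to consume */
    int optional = 0;  /* set once '|' has been seen */
    int status = 0;

    assert(spec != NULL);
    assert(args != NULL || num_args == 0);

    va_start(ap, spec);
    for (p = spec; *p != '\0'; p++) {
        if (*p == '|') {
            optional = 1;
            continue;
        }
        if (*p != 'l' && *p != 'd') {
            status = -1;            /* unknown specifier */
            break;
        }
        if (i >= num_args) {
            if (!optional) {
                status = -1;        /* a required parameter is missing */
            }
            break;
        }
        if (*p == 'l') {
            long *out = va_arg(ap, long *);
            /* Convert to the requested type where possible. */
            *out = (args[i].type == V_LONG) ? args[i].u.l : (long)args[i].u.d;
        } else {
            double *out = va_arg(ap, double *);
            *out = (args[i].type == V_DOUBLE) ? args[i].u.d : (double)args[i].u.l;
        }
        i++;
    }
    if (status == 0 && i < num_args) {
        status = -1;                /* more arguments than the spec allows */
    }
    va_end(ap);
    return status;
}
```

As with "z|l" in count, a caller using "l|l" presets the optional variable to its default before the call and relies on the parser to leave it alone when the second argument is absent.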


  3. Function Return Value

Every function in PHP has a return value. If a function has no explicit return statement, it returns null.

(1) return Statement

From the Zend/zend_language_parser.y file we can see that the zend_do_return function is called to generate the intermediate code for a return statement:

void zend_do_return(znode *expr, int do_end_vparse TSRMLS_DC) /* {{{ */
{
    zend_op *opline;
    int start_op_number, end_op_number;

    if (do_end_vparse) {
        if (CG(active_op_array)->return_reference && !zend_is_function_or_method_call(expr)) {
            zend_do_end_variable_parse(expr, BP_VAR_W, 0 TSRMLS_CC); /* return by reference */
        } else {
            zend_do_end_variable_parse(expr, BP_VAR_R, 0 TSRMLS_CC); /* regular return of a variable */
        }
    }

    ... // omitted: obtaining the opline and other intermediate code operations

    opline->opcode = ZEND_RETURN;

    if (expr) {
        opline->op1 = *expr;

        if (do_end_vparse && zend_is_function_or_method_call(expr)) {
            opline->extended_value = ZEND_RETURNS_FUNCTION;
        }
    } else {
        opline->op1.op_type = IS_CONST;
        INIT_ZVAL(opline->op1.u.constant);
    }

    SET_UNUSED(opline->op2);
}
/* }}} */

The generated opcode is ZEND_RETURN. If the return statement has an expression, the type of the first operand is the operand type of that expression; otherwise it is IS_CONST. This type is used later to select the handler that executes the opcode. Depending on the operand type, ZEND_RETURN is executed by ZEND_RETURN_SPEC_CONST_HANDLER, ZEND_RETURN_SPEC_TMP_HANDLER, or ZEND_RETURN_SPEC_VAR_HANDLER. The three handlers work in much the same way, including some error handling. Here we take ZEND_RETURN_SPEC_CONST_HANDLER as an example to illustrate how a function return value is processed:

static int ZEND_FASTCALL  ZEND_RETURN_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    zval *retval_ptr;
    zval **retval_ptr_ptr;

    if (EG(active_op_array)->return_reference == ZEND_RETURN_REF) {

        // constants and temporary variables may not be returned by reference
        if (IS_CONST == IS_CONST || IS_CONST == IS_TMP_VAR) {
            /* Not supposed to happen, but we'll allow it */
            zend_error(E_NOTICE, "Only variable references should be returned by reference");
            goto return_by_value;
        }

        retval_ptr_ptr = NULL;  // the return value

        if (IS_CONST == IS_VAR && !retval_ptr_ptr) {
            zend_error_noreturn(E_ERROR, "Cannot return string offsets by reference");
        }

        if (IS_CONST == IS_VAR && !Z_ISREF_PP(retval_ptr_ptr)) {
            if (opline->extended_value == ZEND_RETURNS_FUNCTION &&
                EX_T(opline->op1.u.var).var.fcall_returned_reference) {
            } else if (EX_T(opline->op1.u.var).var.ptr_ptr ==
                    &EX_T(opline->op1.u.var).var.ptr) {
                if (IS_CONST == IS_VAR && !0) {
                    /* undo the effect of get_zval_ptr_ptr() */
                    PZVAL_LOCK(*retval_ptr_ptr);
                }
                zend_error(E_NOTICE, "Only variable references should be returned by reference");
                goto return_by_value;
            }
        }

        if (EG(return_value_ptr_ptr)) { // return by reference
            SEPARATE_ZVAL_TO_MAKE_IS_REF(retval_ptr_ptr);   // set is_ref__gc to 1
            Z_ADDREF_PP(retval_ptr_ptr);    // increment refcount__gc by 1

            (*EG(return_value_ptr_ptr)) = (*retval_ptr_ptr);
        }
    } else {
return_by_value:

        retval_ptr = &opline->op1.u.constant;

        if (!EG(return_value_ptr_ptr)) {
            if (IS_CONST == IS_TMP_VAR) {

            }
        } else if (!0) { /* Not a temp var */
            if (IS_CONST == IS_CONST ||
                EG(active_op_array)->return_reference == ZEND_RETURN_REF ||
                (PZVAL_IS_REF(retval_ptr) && Z_REFCOUNT_P(retval_ptr) > 0)) {
                zval *ret;

                ALLOC_ZVAL(ret);
                INIT_PZVAL_COPY(ret, retval_ptr);   // copy the value for the return value
                zval_copy_ctor(ret);
                *EG(return_value_ptr_ptr) = ret;
            } else {
                *EG(return_value_ptr_ptr) = retval_ptr; // assign directly
                Z_ADDREF_P(retval_ptr);
            }
        } else {
            zval *ret;

            ALLOC_ZVAL(ret);
            INIT_PZVAL_COPY(ret, retval_ptr);    // copy the value for the return value
            *EG(return_value_ptr_ptr) = ret;
        }
    }

    return zend_leave_helper_SPEC(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU);   // clean up before returning
}

During execution, the return value of a function is stored in *EG(return_value_ptr_ptr). The Zend kernel distinguishes returning by value from returning by reference, and on that basis handles constants, temporary variables, and other kinds of variables differently when they are returned. After the return statement has been executed, the Zend kernel calls zend_leave_helper_SPEC to clean up the variables used in the function. This cleanup step is one of the reasons why the Zend kernel automatically appends a NULL return to every function.
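The by-value branch of the handler chooses between copying the value and sharing it with an extra reference. A much simplified sketch of that decision, using a hypothetical box type instead of zval and ignoring the by-reference path entirely, could look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified reference-counted value (a stand-in for zval). */
typedef struct {
    long value;
    unsigned refcount;
    int is_ref;
} box;

/* Decide how a value is handed back to the caller when returning by value.
 * `is_const` is nonzero when the operand is a compile-time constant.
 * Returns the pointer the caller receives, or NULL on allocation failure. */
static box *return_by_value(box *retval, int is_const)
{
    assert(retval != NULL);

    if (is_const || (retval->is_ref && retval->refcount > 0)) {
        /* Constants and references are copied so that the caller gets an
         * independent value with a fresh reference count. */
        box *copy = malloc(sizeof *copy);
        if (copy == NULL) {
            return NULL;
        }
        copy->value = retval->value;
        copy->refcount = 1;
        copy->is_ref = 0;
        return copy;
    }

    /* Ordinary variables are shared: hand out the same pointer and record
     * the additional holder. */
    retval->refcount++;
    return retval;
}
```

Sharing avoids a copy in the common case, while copying keeps a constant operand or a reference from being modified through the returned value.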
