In-depth understanding of ini configuration in PHP (1)
This article does not describe the purpose of individual ini configuration items; those are all covered in the manual. I just want to explore, from one specific angle, how PHP implements ini configuration, which involves some knowledge of the PHP kernel :-)
All PHP users know that the php.ini configuration stays in effect for the entire SAPI lifecycle. If you edit the ini file by hand while a PHP script is running, the change does not take effect. If you cannot restart apache or nginx, the only option is to call ini_set explicitly in PHP code. ini_set is a function PHP provides for modifying configuration dynamically. Note that a setting made with ini_set has a different lifetime from one made in the ini file: as soon as the script finishes executing, the ini_set setting expires.
Therefore, this article is divided into two parts: the first describes how the php.ini configuration works, and the second describes how PHP configuration is modified dynamically.
The php.ini configuration involves three pieces of data: configuration_hash, EG(ini_directives), and the module globals accessed through PG, BG, PCRE_G, JSON_G, XXX_G, and so on. If you do not know what these three kinds of data mean, don't worry; they are explained in detail below.
1. Parsing the INI configuration file
Because php.ini must stay in effect for the whole life of the SAPI process, parsing the ini file and building the PHP configuration from it has to happen at the very beginning of SAPI, that is, during PHP startup. PHP has already generated this configuration before any real request arrives.
In the PHP kernel, this corresponds to the php_module_startup function.
php_module_startup is mainly responsible for starting PHP and is usually called when SAPI starts. By the way, another common function is php_request_startup, which performs initialization for each request as it arrives. php_module_startup and php_request_startup are two landmark operations, but analyzing them is outside the scope of this article.
For example, when PHP is loaded as an apache module, apache activates all of its modules at startup, including the PHP module. When the PHP module is activated, php_module_startup is called. php_module_startup does a great deal of work; once the call returns, PHP has started and can accept and respond to requests.
In the php_module_startup function, the code related to parsing the INI file is:
/* this will read in php.ini, set up the configuration parameters,
   load zend extensions and register php function extensions
   to be loaded later */
if (php_init_config(TSRMLS_C) == FAILURE) {
    return FAILURE;
}
As you can see, php_init_config is called to parse the INI file. Parsing consists mainly of lexical and grammar analysis (lex & grammar), which extracts and saves the key/value pairs in the INI file. The php.ini format is very simple: the key is on the left of the equals sign and the value on the right. Where does PHP store each key/value pair once it has been extracted? The answer is the configuration_hash mentioned earlier.
static HashTable configuration_hash;
configuration_hash is declared in php_ini.c and is a HashTable; as the name implies, it is a hash table. Incidentally, before php5.3 there was no way to get at configuration_hash from outside, because it is a static variable in php_ini.c. php5.3 added the php_ini_get_configuration_hash interface, which returns &configuration_hash directly, so that PHP extensions can easily see the whole of configuration_hash.
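To make the key/value extraction concrete, here is a minimal sketch in C of splitting a single "key = value" line. It is not PHP's real parser, which is generated from lex and grammar files; parse_ini_line, copy_trimmed, and the buffer-size parameters are hypothetical names used only for illustration, and quoting, sections, and arrays are deliberately ignored.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the range [begin, end) into dst with surrounding whitespace removed.
 * Returns 0 on success, -1 if the trimmed text does not fit in dst. */
static int copy_trimmed(const char *begin, const char *end, char *dst, size_t dst_size)
{
    size_t len;
    while (begin < end && isspace((unsigned char)*begin)) begin++;
    while (end > begin && isspace((unsigned char)end[-1])) end--;
    len = (size_t)(end - begin);
    if (len + 1 > dst_size) return -1;
    memcpy(dst, begin, len);
    dst[len] = '\0';
    return 0;
}

/* Hypothetical helper: split "key = value" into two strings.
 * Returns 0 on success, -1 for blank lines, comments, section
 * headers, or lines without an equals sign. */
int parse_ini_line(const char *line, char *key, size_t key_size,
                   char *val, size_t val_size)
{
    const char *eq;
    const char *p = line;
    while (isspace((unsigned char)*p)) p++;
    if (*p == '\0' || *p == ';' || *p == '[') return -1;
    eq = strchr(p, '=');
    if (eq == NULL) return -1;
    if (copy_trimmed(p, eq, key, key_size) != 0 || key[0] == '\0') return -1;
    /* Everything after '=' up to the end of the line is the value. */
    return copy_trimmed(eq + 1, eq + 1 + strlen(eq + 1), val, val_size);
}
```

Both halves come out as plain strings, which matches the way configuration_hash stores what it reads.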
Note:
First, php_init_config performs no validation beyond lexical and syntax checks. That is, if we add the line hello = world to the INI file, then as long as the line is well formed, the final configuration_hash will contain an element with key "hello" and value "world". configuration_hash mirrors the INI file as faithfully as possible.
Second, the INI file allows array-style configuration. For example, write the following three lines in the INI file:
drift.arr[] = 1
drift.arr[] = 2
drift.arr[] = 3
The resulting configuration_hash will then contain an element whose key is drift.arr and whose value is an array containing 1, 2, and 3. This style of configuration is very rarely used.
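As a rough illustration of how repeated name[] lines can collapse into one array-valued element, here is a small C sketch. The toy_array_setting struct and toy_append function are hypothetical and much simpler than PHP's HashTable-based storage; the sketch only shows the idea of stripping the "[]" suffix and appending to a list.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ITEMS 8

/* A toy array-valued setting: one name with up to TOY_MAX_ITEMS string values. */
struct toy_array_setting {
    char        name[64];
    const char *items[TOY_MAX_ITEMS];
    size_t      count;
};

/* Hypothetical helper: if key has the form "name[]", append value to the
 * setting. The first call fixes the setting's name; later calls must use
 * the same name. Returns 0 on success, -1 otherwise. */
int toy_append(struct toy_array_setting *setting, const char *key, const char *value)
{
    size_t len = strlen(key);
    if (len < 3 || strcmp(key + len - 2, "[]") != 0) return -1; /* not an array key */
    len -= 2;                                                   /* drop the "[]" suffix */
    if (len >= sizeof setting->name) return -1;
    if (setting->count == 0) {
        memcpy(setting->name, key, len);
        setting->name[len] = '\0';
    } else if (strlen(setting->name) != len || strncmp(setting->name, key, len) != 0) {
        return -1;                                              /* a different setting */
    }
    if (setting->count == TOY_MAX_ITEMS) return -1;
    setting->items[setting->count++] = value;
    return 0;
}
```

Feeding the three drift.arr[] lines above through toy_append leaves a single setting named drift.arr holding three values.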
Third, PHP also lets us create additional ini files besides the default php.ini (strictly speaking, php-%s.ini). These files are placed in an extra directory specified by the environment variable PHP_INI_SCAN_DIR. After php_init_config has parsed php.ini, it scans this directory and parses every .ini file it finds there. The key/value pairs from these extra INI files are also added to configuration_hash.
This feature is occasionally useful. Suppose we develop a PHP extension ourselves and do not want to mix its configuration into php.ini: we can write a separate ini file and use PHP_INI_SCAN_DIR to tell PHP where to find it. The drawback is equally obvious: it requires an extra environment variable. A better solution is for the extension developer to call php_parse_user_ini_file or zend_parse_ini_file in the extension to parse the corresponding INI file.
Fourth, in configuration_hash the key is a string, so what type is the value? Also a string (apart from the special arrays above). For example, given the following configuration:
display_errors = On
log_errors = Off
log_errors_max_len = 1024
the key/value pairs actually stored in configuration_hash are:
key: "display_errors"
val: "1"
key: "log_errors"
val: ""
key: "log_errors_max_len"
val: "1024"
Note that log_errors is not stored as "0"; its value really is an empty string. Likewise, log_errors_max_len is not a number but the string "1024".
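The mapping just described can be sketched as a small C function. This is an illustration of the observed behavior, not PHP's scanner code; toy_normalize_ini_value is a hypothetical name, and treating "yes"/"true" like On and "no"/"false" like Off is an assumption made for the sketch.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive equality for ASCII strings. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper: return the string that would be stored for a raw
 * ini value. "On"-like words become "1", "Off"-like words become the
 * empty string, and everything else is kept unchanged as a string. */
const char *toy_normalize_ini_value(const char *raw)
{
    if (equals_ignore_case(raw, "on") || equals_ignore_case(raw, "yes") ||
        equals_ignore_case(raw, "true")) {
        return "1";
    }
    if (equals_ignore_case(raw, "off") || equals_ignore_case(raw, "no") ||
        equals_ignore_case(raw, "false")) {
        return "";
    }
    return raw; /* e.g. "1024" stays the string "1024" */
}
```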
That covers everything about parsing the INI file. To summarize:
1. The ini file is parsed during the php_module_startup phase.
2. The parsing result is stored in configuration_hash.
2. Applying the configuration to modules
PHP's overall structure can be viewed as the zend engine at the bottom, which is responsible for interacting with the OS, compiling PHP code, managing memory, and so on, with many modules arranged on top of it. Among them, Core is the core module, and the others, such as Standard, PCRE, Date, and Session, are also called PHP extensions. Put simply, each module provides a set of functions for developers to call. For example, commonly used built-in functions such as explode, trim, and the array functions are provided by the Standard module.
Why bring this up? Because php.ini contains, besides the configuration for PHP itself, that is, for the Core module (such as safe_mode, display_errors, and max_execution_time), quite a lot of configuration for other modules.
For example, the date module provides common functions such as date, time, and strtotime. Its related configuration in php.ini looks like this:
[Date]
;date.timezone = 'Asia/Shanghai'
;date.default_latitude = 31.7667
;date.default_longitude = 35.2333
;date.sunrise_zenith = 90.583333
;date.sunset_zenith = 90.583333
Besides these modules having their own configuration, the zend engine is configurable too, although it has very few configuration items, including error_reporting, zend.enable_gc, and detect_unicode.
As mentioned in the previous section, php_module_startup calls php_init_config to parse the INI file and build configuration_hash. What else happens in php_module_startup? Obviously, the configuration in configuration_hash has to be applied to the individual modules: Zend, Core, Standard, SPL, and so on. This is not done in one step, because PHP usually contains many modules and starts them one after another during startup. The configuration for module A is therefore applied while module A itself is being started.
Readers with extension development experience will point out right away: isn't module A started in PHP_MINIT_FUNCTION()?
Yes. If module A needs configuration, it can call REGISTER_INI_ENTRIES() in PHP_MINIT_FUNCTION. REGISTER_INI_ENTRIES looks up the user-supplied values in configuration_hash by the names of the configuration items the current module declares, and updates them into the module's own global space.
2.1. The module's global space
To understand how ini configuration is applied from configuration_hash to each module, we first need to understand the global space of a PHP module. Each PHP module can set aside a storage area of its own that is globally visible within that module. It is generally used to store the ini configuration the module needs; that is, the module's configuration items from configuration_hash end up stored in this global space. While the module is executing, it only has to access this global space directly to obtain the user's settings for it. The space is also often used to hold intermediate data the module produces during execution.
We use the bcmath module as an example. bcmath is a PHP module that provides math functions. First, let's look at the ini configuration it has:
PHP_INI_BEGIN()
    STD_PHP_INI_ENTRY("bcmath.scale", "0", PHP_INI_ALL, OnUpdateLongGEZero, bc_precision, zend_bcmath_globals, bcmath_globals)
PHP_INI_END()
bcmath has only one configuration item: we can set bcmath.scale in php.ini to configure the bcmath module.
Next, let's look at the definition of the bcmath module's global space. php_bcmath.h contains the following declaration:
ZEND_BEGIN_MODULE_GLOBALS(bcmath)
    bc_num _zero_;
    bc_num _one_;
    bc_num _two_;
    long bc_precision;
ZEND_END_MODULE_GLOBALS(bcmath)
After macro expansion, this becomes:
typedef struct _zend_bcmath_globals {
    bc_num _zero_;
    bc_num _one_;
    bc_num _two_;
    long bc_precision;
} zend_bcmath_globals;
zend_bcmath_globals is the type of the bcmath module's global space. So far only the zend_bcmath_globals struct has been declared; bcmath.c contains the actual definition of an instance:
// after expansion: zend_bcmath_globals bcmath_globals;
ZEND_DECLARE_MODULE_GLOBALS(bcmath)
As you can see, ZEND_DECLARE_MODULE_GLOBALS defines the variable bcmath_globals.
bcmath_globals is the actual global space. It contains four fields, and the last one, bc_precision, corresponds to bcmath.scale in the ini configuration. When we set bcmath.scale in php.ini, its value is copied into bcmath_globals.bc_precision when the bcmath module starts.
Copying the values in configuration_hash into the xxx_globals variable defined by each module is what "applying the ini configuration to the module" means. Once a module has started, its configuration is in place. In the subsequent execution phase, the module therefore never needs to consult configuration_hash again; it only needs to access its own XXX_globals to obtain the user's settings.
Besides the one field that holds an ini configuration item, what are the other three fields in bcmath_globals for? This is the second role of a module's global space: in addition to ini configuration, it can also hold data the module needs during execution.
Another example is the json module, which is also commonly used in PHP:
ZEND_BEGIN_MODULE_GLOBALS(json)
    int error_code;
ZEND_END_MODULE_GLOBALS(json)
As you can see, the json module needs no ini configuration; its global space has a single field, error_code. error_code records the error from the most recent call to json_decode or json_encode. The json_last_error function returns this error_code to help you locate the cause of the error.
To make the variables in a module's global space easy to access, PHP conventionally defines some macros. For example, to access error_code in json_globals you could simply write json_globals.error_code (though not in a multi-threaded environment). The more general way, however, is to define the JSON_G macro:
#define JSON_G(v) (json_globals.v)
We then use JSON_G(error_code) to access json_globals.error_code. At the beginning of this article we mentioned PG, BG, JSON_G, PCRE_G, XXX_G, and so on. These macros are very common in the PHP source code, and now they are easy to understand: PG accesses the global variables of the Core module, BG those of the Standard module, and PCRE_G those of the PCRE module.
#define PG(v) (core_globals.v)
#define BG(v) (basic_globals.v)
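The pattern behind these macros can be reproduced in a few lines of standalone C. The sketch below assumes a single-threaded build; toy_demo_globals, demo_globals, DEMO_G, and toy_scale_or_fail are hypothetical names, and the real macros expand differently in thread-safe builds, which this sketch ignores.

```c
/* Toy module globals, modeled on the structs shown above. */
typedef struct _toy_demo_globals {
    long scale;      /* would be filled from an ini setting */
    int  error_code; /* runtime state, not an ini setting */
} toy_demo_globals;

/* One instance for the whole (single-threaded) process. */
toy_demo_globals demo_globals;

/* Accessor macro in the style of JSON_G / PG / BG. */
#define DEMO_G(v) (demo_globals.v)

/* Example use: read the configured scale and record an error code
 * when the configured value is negative. */
long toy_scale_or_fail(void)
{
    if (DEMO_G(scale) < 0) {
        DEMO_G(error_code) = 1;
        return 0;
    }
    DEMO_G(error_code) = 0;
    return DEMO_G(scale);
}
```

Code that goes through the macro never names the underlying variable, so the storage strategy can change without touching call sites.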
2.2. How are the configurations a module needs determined?
The INI configurations a module needs are declared inside the module itself. For example, the Core module has the following configuration items:
PHP_INI_BEGIN()
    ......
    STD_PHP_INI_ENTRY_EX("display_errors", "1", PHP_INI_ALL, OnUpdateDisplayErrors, display_errors, php_core_globals, core_globals, display_errors_mode)
    STD_PHP_INI_BOOLEAN("enable_dl", "1", PHP_INI_SYSTEM, OnUpdateBool, enable_dl, php_core_globals, core_globals)
    STD_PHP_INI_BOOLEAN("expose_php", "1", PHP_INI_SYSTEM, OnUpdateBool, expose_php, php_core_globals, core_globals)
    STD_PHP_INI_BOOLEAN("safe_mode", "0", PHP_INI_SYSTEM, OnUpdateBool, safe_mode, php_core_globals, core_globals)
    ......
PHP_INI_END()
The code above can be found a little past line 450 of php-src/main/main.c. It involves quite a few macros, including ZEND_INI_BEGIN, ZEND_INI_END, PHP_INI_ENTRY_EX, and STD_PHP_INI_BOOLEAN. This article does not go through them one by one; interested readers can analyze them on their own.
After macro expansion, the code above becomes:
static const zend_ini_entry ini_entries[] = {
    ..
    { 0, PHP_INI_ALL, "display_errors", sizeof("display_errors"), OnUpdateDisplayErrors, (void *) XtOffsetOf(php_core_globals, display_errors), (void *) &core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, display_errors_mode },
    { 0, PHP_INI_SYSTEM, "enable_dl", sizeof("enable_dl"), OnUpdateBool, (void *) XtOffsetOf(php_core_globals, enable_dl), (void *) &core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "expose_php", sizeof("expose_php"), OnUpdateBool, (void *) XtOffsetOf(php_core_globals, expose_php), (void *) &core_globals, NULL, "1", sizeof("1")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "safe_mode", sizeof("safe_mode"), OnUpdateBool, (void *) XtOffsetOf(php_core_globals, safe_mode), (void *) &core_globals, NULL, "0", sizeof("0")-1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    ...
    { 0, 0, NULL, 0, NULL, 0, NULL }
};
As you can see, a module's configuration items are essentially defined as an array of zend_ini_entry. The fields of the zend_ini_entry struct have the following meanings:
struct _zend_ini_entry {
    int module_number;          // module id
    int modifiable;             // where the item may be modified, e.g. php.ini, ini_set
    char *name;                 // configuration item name
    uint name_length;
    ZEND_INI_MH((*on_modify));  // callback, called when the item is registered or modified
    void *mh_arg1;              // usually the offset of the item's field within XXX_G
    void *mh_arg2;              // usually XXX_G
    void *mh_arg3;              // usually reserved, rarely used
    char *value;                // value of the configuration item
    uint value_length;
    char *orig_value;           // original value of the configuration item
    uint orig_value_length;
    int orig_modifiable;        // original modifiable of the configuration item
    int modified;               // whether the item has been modified; if so, orig_value holds the value before modification
    void (*displayer)(zend_ini_entry *ini_entry, int type);
};
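The two ideas that matter most in this struct, a field offset paired with a base pointer and a table ended by an entry whose name is NULL, can be shown with a stripped-down sketch. All names here (toy_core_globals, toy_ini_entry, toy_ini_table, toy_count_entries) are hypothetical, and standard offsetof stands in for XtOffsetOf.

```c
#include <stddef.h>

/* Toy stand-in for a module's globals struct. */
typedef struct {
    int display_errors;
    int enable_dl;
} toy_core_globals;

toy_core_globals toy_core;

/* A reduced entry: only the fields needed to locate the target field. */
typedef struct {
    const char *name;          /* NULL marks the end of the table */
    size_t      field_offset;  /* plays the role of mh_arg1 */
    void       *globals;       /* plays the role of mh_arg2 */
    const char *default_value; /* always a string */
} toy_ini_entry;

const toy_ini_entry toy_ini_table[] = {
    { "display_errors", offsetof(toy_core_globals, display_errors), &toy_core, "1" },
    { "enable_dl",      offsetof(toy_core_globals, enable_dl),      &toy_core, "1" },
    { NULL, 0, NULL, NULL }
};

/* Count entries by walking until the NULL-name terminator. */
size_t toy_count_entries(const toy_ini_entry *p)
{
    size_t n = 0;
    while (p->name) {
        n++;
        p++;
    }
    return n;
}
```

Because the terminator is part of the data, code that walks the table needs no separate length argument.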
2.3. Applying the configuration to the module: REGISTER_INI_ENTRIES
REGISTER_INI_ENTRIES is commonly seen in the PHP_MINIT_FUNCTION of various extensions. It is responsible for two main tasks. First, it fills in the module's global space XXX_G by synchronizing the values in configuration_hash into it. Second, it builds EG(ini_directives).
REGISTER_INI_ENTRIES is also a macro; after expansion it is really a call to zend_register_ini_entries. Let's look at the implementation of zend_register_ini_entries:
ZEND_API int zend_register_ini_entries(const zend_ini_entry *ini_entry, int module_number TSRMLS_DC) /* {{{ */
{
    // ini_entry is an array of zend_ini_entry; p points to each item in turn
    const zend_ini_entry *p = ini_entry;
    zend_ini_entry *hashed_ini_entry;
    zval default_value;
    // EG(ini_directives) is registered_zend_ini_directives
    HashTable *directives = registered_zend_ini_directives;
    zend_bool config_directive_success = 0;
    // remember that the last ini_entry is always {0, 0, NULL, ...}?
    while (p->name) {
        config_directive_success = 0;
        // add the zend_ini_entry pointed to by p to EG(ini_directives)
        if (zend_hash_add(directives, p->name, p->name_length, (void *) p, sizeof(zend_ini_entry), (void **) &hashed_ini_entry) == FAILURE) {
            zend_unregister_ini_entries(module_number TSRMLS_CC);
            return FAILURE;
        }
        hashed_ini_entry->module_number = module_number;
        // look the name up in configuration_hash and put the result in default_value
        // note that default_value is fairly raw: generally a number, string, array, etc., depending on how it was written in php.ini
        if (zend_get_configuration_directive(p->name, p->name_length, &default_value) == SUCCESS) {
            // call on_modify to update the module's global space XXX_G
            if (!hashed_ini_entry->on_modify || hashed_ini_entry->on_modify(hashed_ini_entry, Z_STRVAL(default_value), Z_STRLEN(default_value), hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC) == SUCCESS) {
                hashed_ini_entry->value = Z_STRVAL(default_value);
                hashed_ini_entry->value_length = Z_STRLEN(default_value);
                config_directive_success = 1;
            }
        }
        // if nothing was found in configuration_hash, use the default value
        if (!config_directive_success && hashed_ini_entry->on_modify) {
            hashed_ini_entry->on_modify(hashed_ini_entry, hashed_ini_entry->value, hashed_ini_entry->value_length, hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC);
        }
        p++;
    }
    return SUCCESS;
}
Simply put, the logic of the code above is:
1. Add the ini configuration items declared by the module to EG(ini_directives). Note that the value of an ini configuration item may be modified afterwards.
2. Try to find each ini item the module needs in configuration_hash.
If it is found, the user has set a value in the INI file, and the user's value is used.
If it is not found, that is fine too, because the module supplies a default value when it declares the ini item.
3. Synchronize the ini value into XXX_G. After all, it is XXX_globals that does the work while PHP is executing. Concretely, the on_modify callback of each ini item is called; on_modify is specified when the module declares the ini item.
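The steps above can be sketched in standalone C. This is a simplified model, not the Zend implementation: the entry table itself stands in for EG(ini_directives), a NULL-terminated array of pairs stands in for configuration_hash, every entry is assumed to have a callback, and all toy_* names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy module globals (stand-in for XXX_globals). */
typedef struct { long scale; int log_errors; } toy_globals;
toy_globals toy_g;

/* Toy stand-in for configuration_hash: string keys, string values. */
typedef struct { const char *key; const char *value; } toy_config_pair;

typedef struct toy_entry toy_entry;
typedef int (*toy_on_modify)(toy_entry *e, const char *new_value);

struct toy_entry {
    const char   *name;      /* NULL terminates the table */
    toy_on_modify on_modify; /* converts the string and stores it in the globals */
    size_t        offset;    /* like mh_arg1 */
    void         *base;      /* like mh_arg2 */
    const char   *value;     /* default at first, user value after registration */
};

static int toy_update_long(toy_entry *e, const char *new_value)
{
    *(long *)((char *)e->base + e->offset) = strtol(new_value, NULL, 10);
    return 0;
}

static int toy_update_bool(toy_entry *e, const char *new_value)
{
    *(int *)((char *)e->base + e->offset) = atoi(new_value) != 0;
    return 0;
}

static const char *toy_config_find(const toy_config_pair *cfg, const char *name)
{
    for (; cfg->key; cfg++) {
        if (strcmp(cfg->key, name) == 0) return cfg->value;
    }
    return NULL;
}

/* Steps 2 and 3: prefer the user's value, otherwise the declared default,
 * and push whichever applies into the globals through on_modify. */
void toy_register_entries(toy_entry *entries, const toy_config_pair *cfg)
{
    toy_entry *p;
    for (p = entries; p->name; p++) {
        const char *user_value = toy_config_find(cfg, p->name);
        if (user_value && p->on_modify(p, user_value) == 0) {
            p->value = user_value;     /* the entry now records the user's value */
        } else {
            p->on_modify(p, p->value); /* fall back to the declared default */
        }
    }
}

/* Sample declarations for a hypothetical module. */
toy_entry toy_module_entries[] = {
    { "toy.scale",      toy_update_long, offsetof(toy_globals, scale),      &toy_g, "0" },
    { "toy.log_errors", toy_update_bool, offsetof(toy_globals, log_errors), &toy_g, "1" },
    { NULL, NULL, 0, NULL, NULL }
};
```

With a configuration that sets only toy.scale, the scale field takes the user's value while log_errors falls back to its declared default, and each entry keeps whichever string was actually applied.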
Now let's look more closely at on_modify, which is a function pointer. Consider the declarations of two specific Core module configuration items:
STD_PHP_INI_BOOLEAN("log_errors", "0", PHP_INI_ALL, OnUpdateBool, log_errors, php_core_globals, core_globals)
STD_PHP_INI_ENTRY("log_errors_max_len", "1024", PHP_INI_ALL, OnUpdateLong, log_errors_max_len, php_core_globals, core_globals)
For log_errors, on_modify is set to OnUpdateBool; for log_errors_max_len, on_modify is set to OnUpdateLong.
Let us further assume that the configuration in php.ini is:
log_errors = On
log_errors_max_len = 1024
Let's look at the OnUpdateBool function in detail:
ZEND_API ZEND_INI_MH(OnUpdateBool)
{
    zend_bool *p;
    // base is the address of core_globals
    char *base = (char *) mh_arg2;
    // p is the address of core_globals plus the offset of the log_errors field,
    // i.e. the address of the log_errors field
    p = (zend_bool *) (base + (size_t) mh_arg1);
    if (new_value_length == 2 && strcasecmp("on", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 3 && strcasecmp("yes", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 4 && strcasecmp("true", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else {
        // the value stored in configuration_hash is the string "1", not "On",
        // so atoi is used here to convert it to the number 1
        *p = (zend_bool) atoi(new_value);
    }
    return SUCCESS;
}
The most puzzling parts are probably mh_arg1 and mh_arg2. In fact, given the definition of zend_ini_entry described above, they are easy to understand: mh_arg1 is a byte offset, and mh_arg2 is the address of XXX_globals. The result of (char *) mh_arg2 + mh_arg1 is therefore the address of one field in XXX_globals. In this case, what is computed is the address of log_errors in core_globals. So the statement that OnUpdateBool finally executes,
*p = (zend_bool) atoi(new_value);
is equivalent to
core_globals.log_errors = (zend_bool) atoi("1");
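The base-plus-offset arithmetic can be tried out in isolation. The following sketch mirrors the shape of OnUpdateBool under simplifying assumptions: toy_core, toy_core_g, toy_ieq, and toy_on_update_bool are hypothetical names, the field is a plain int rather than zend_bool, and the offset is passed as a size_t instead of being packed into a void pointer.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for core_globals. */
typedef struct {
    int display_errors;
    int log_errors;
} toy_core;

toy_core toy_core_g;

/* Case-insensitive equality for ASCII strings. */
static int toy_ieq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Toy counterpart of OnUpdateBool: base + offset yields the field address,
 * and the string value is converted before being stored there. */
int toy_on_update_bool(void *base, size_t offset, const char *new_value)
{
    int *field = (int *)((char *)base + offset);
    if (toy_ieq(new_value, "on") || toy_ieq(new_value, "yes") || toy_ieq(new_value, "true")) {
        *field = 1;
    } else {
        *field = atoi(new_value); /* "1" becomes 1, "" becomes 0 */
    }
    return 0;
}
```

Calling toy_on_update_bool(&toy_core_g, offsetof(toy_core, log_errors), "1") has the same effect as assigning toy_core_g.log_errors = 1 directly, without the callback knowing the field's name.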
Having analyzed OnUpdateBool, OnUpdateLong is clear at a glance:
ZEND_API ZEND_INI_MH(OnUpdateLong)
{
    long *p;
    char *base = (char *) mh_arg2;
    // obtain the address of log_errors_max_len
    p = (long *) (base + (size_t) mh_arg1);
    // convert "1024" to long and assign it to core_globals.log_errors_max_len
    *p = zend_atol(new_value, new_value_length);
    return SUCCESS;
}
One last point: in zend_register_ini_entries, if the item exists in configuration_hash, the value and value_length of hashed_ini_entry are updated after on_modify has been called. In other words, if the user has configured the item in php.ini, EG(ini_directives) stores the value actually configured. If the user has not configured it, EG(ini_directives) stores the default value given when the zend_ini_entry was declared.
The variable default_value in zend_register_ini_entries is poorly named and easily misunderstood. It does not hold the default value at all; it holds the value the user actually configured.
3. Summary
At this point, the three pieces of data, configuration_hash, EG(ini_directives), and PG, BG, PCRE_G, JSON_G, XXX_G..., have all been explained.
To summarize:
1. configuration_hash stores the configuration from the php.ini file without any validation; its values are strings.
2. EG(ini_directives) stores the zend_ini_entry items declared by each module. If an item has been configured by the user (that is, it exists in configuration_hash), its value is replaced with the value from configuration_hash; the type is still a string.
3. XXX_G is a macro used to access a module's global space. This memory can be used to store ini configuration, which is updated through the function specified by on_modify; the data type is determined by the field declaration in XXX_G.