PHP Kernel Introduction and Extension Development Guide: Basic Knowledge

1 Basic knowledge
This chapter briefly describes some internal mechanisms of the Zend engine. They are closely tied to extension development, and understanding them also helps in writing more efficient PHP code.
1.1 Storage of PHP variables
1.1.1 The zval structure
Zend uses the zval structure to store the value of a PHP variable. The structure is defined as follows:
    typedef union _zvalue_value {
        long lval;               /* long value */
        double dval;             /* double value */
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;           /* hash table value */
        zend_object_value obj;
    } zvalue_value;

    struct _zval_struct {
        /* Variable information */
        zvalue_value value;      /* value */
        zend_uint refcount;
        zend_uchar type;         /* active type */
        zend_uchar is_ref;
    };

    typedef struct _zval_struct zval;

Zend decides which member of value to access based on type. The possible values are:

IS_NULL: n/a
IS_LONG: value.lval
IS_DOUBLE: value.dval
IS_STRING: value.str
IS_ARRAY: value.ht
IS_OBJECT: value.obj
IS_BOOL: value.lval
IS_RESOURCE: value.lval

Two things are worth noting here. First, a PHP array is actually a HashTable, which explains why PHP supports associative arrays. Second, a resource is a long value, usually a pointer, an index into an internal array, or something else whose meaning only its creator knows; it can be regarded as a handle.
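
The type-driven access described above can be sketched in plain C. This is only an illustration under stated assumptions: demo_zval, demo_value, the DEMO_* constants and demo_describe are hypothetical stand-ins, not the real Zend definitions, and only a few of the types are covered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the Zend type tags (not the real constants). */
enum demo_type { DEMO_NULL, DEMO_LONG, DEMO_DOUBLE, DEMO_STRING, DEMO_BOOL };

/* Simplified value union: only one member is meaningful at a time. */
typedef union {
    long lval;
    double dval;
    struct {
        const char *val;
        int len;
    } str;
} demo_value;

typedef struct {
    demo_value value;
    unsigned char type;   /* selects the active union member */
} demo_zval;

/* Render a value into buf by switching on the type tag. */
static int demo_describe(const demo_zval *zv, char *buf, size_t size)
{
    assert(zv != NULL && buf != NULL);
    switch (zv->type) {
    case DEMO_NULL:
        return snprintf(buf, size, "null");
    case DEMO_LONG:
        return snprintf(buf, size, "long:%ld", zv->value.lval);
    case DEMO_DOUBLE:
        return snprintf(buf, size, "double:%.2f", zv->value.dval);
    case DEMO_STRING:
        return snprintf(buf, size, "string:%.*s",
                        zv->value.str.len, zv->value.str.val);
    case DEMO_BOOL:
        /* booleans reuse lval, as in the list above */
        return snprintf(buf, size, "bool:%s",
                        zv->value.lval ? "true" : "false");
    default:
        return snprintf(buf, size, "unknown");
    }
}
```

Reading a member other than the one selected by type yields meaningless data, which is why every access goes through the tag first.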

1.1.2 Reference counting

Reference counting is widely used in garbage collection, memory pools, strings and similar places, and Zend implements a typical reference count. Through it, multiple PHP variables can share the same zval. The remaining two members of zval, is_ref and refcount, support this sharing.

refcount is the counter: it is incremented when a reference to the zval is added and decremented when one is removed, and once it drops to zero Zend reclaims the zval.
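
A minimal sketch of this counting scheme, assuming a simplified rc_zval that holds only a long. The names rc_new, rc_addref, rc_release and the rc_live counter are hypothetical helpers for illustration, not Zend functions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long lval;
    unsigned int refcount;
    unsigned char is_ref;
} rc_zval;

static int rc_live = 0;   /* number of live rc_zval objects, for illustration */

/* Create a value owned by exactly one variable. */
static rc_zval *rc_new(long v)
{
    rc_zval *zv = malloc(sizeof *zv);
    if (zv == NULL) {
        return NULL;
    }
    zv->lval = v;
    zv->refcount = 1;
    zv->is_ref = 0;
    rc_live++;
    return zv;
}

/* Another variable starts pointing at the same value. */
static void rc_addref(rc_zval *zv)
{
    assert(zv != NULL);
    zv->refcount++;
}

/* A variable stops pointing at the value; reclaim it on the last release. */
static void rc_release(rc_zval *zv)
{
    assert(zv != NULL && zv->refcount > 0);
    if (--zv->refcount == 0) {
        free(zv);
        rc_live--;
    }
}
```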

What about is_ref?

1.1.3 Zval states

PHP has two kinds of variables, reference and non-reference, and Zend stores both using reference counts. Non-reference variables must be independent of one another: modifying one must not affect any other. Zend resolves the conflict with copy-on-write: when a write is attempted and Zend finds that the target zval is shared by several variables, it makes a copy with refcount 1 for the variable being written and decrements the refcount of the original zval. This is called "zval separation". Reference variables require the opposite: variables bound by reference assignment must stay tied together, so modifying one modifies all of them.

Clearly the current state of a zval has to be recorded so that both cases can be handled. That is the purpose of is_ref: it indicates whether the variables currently pointing to the zval were bound by reference assignment. Either all of them are references or none of them is. When a variable is modified, Zend performs copy-on-write only if it finds that the zval's is_ref is 0, that is, non-reference.
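
The write path described above can be sketched as follows. This is a simplified model, assuming a cow_zval that holds only a long and a variable represented as a pointer slot; cow_new and cow_write are hypothetical names, not Zend functions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long lval;
    unsigned int refcount;
    unsigned char is_ref;
} cow_zval;

static cow_zval *cow_new(long v)
{
    cow_zval *zv = malloc(sizeof *zv);
    if (zv == NULL) {
        return NULL;
    }
    zv->lval = v;
    zv->refcount = 1;
    zv->is_ref = 0;
    return zv;
}

/* Write v through the variable slot *slot. Returns 0 on success, -1 on
 * allocation failure. */
static int cow_write(cow_zval **slot, long v)
{
    cow_zval *zv;
    assert(slot != NULL && *slot != NULL);
    zv = *slot;
    if (!zv->is_ref && zv->refcount > 1) {
        /* Shared and non-reference: separate before writing. */
        cow_zval *copy = cow_new(zv->lval);
        if (copy == NULL) {
            return -1;
        }
        zv->refcount--;   /* this variable leaves the shared value */
        *slot = copy;     /* and now owns a private copy */
        zv = copy;
    }
    /* Unshared, or a reference set: write in place. */
    zv->lval = v;
    return 0;
}
```

With is_ref set, the write lands in the shared value, so every variable bound to it observes the change.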

1.1.4 Zval state switching

If every assignment involving a zval is a reference assignment, or every one is a non-reference assignment, a single is_ref flag is sufficient. The world is not that tidy, however. PHP cannot impose such a restriction on users, so mixing reference and non-reference assignments requires special handling.

Case 1: a non-reference assignment from a variable that is already bound by reference, for example:

    $a = 1;
    $b = &$a;
    $c = &$b;
    $d = $c;

The process is as follows:

The first three statements make $a, $b and $c point to one zval with is_ref=1 and refcount=3. The fourth statement is a non-reference assignment, which normally only needs to increment the reference count. But the target zval is a reference, so simply incrementing the count would be wrong. Zend's solution is to create a separate copy of the zval for $d.
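
The decision made for the fourth statement can be sketched like this. It is a simplified model under assumptions: asg_zval holds only a long, and asg_assign_by_value is a hypothetical helper that returns the zval the destination variable should point at after a non-reference assignment.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long lval;
    unsigned int refcount;
    unsigned char is_ref;
} asg_zval;

/* Non-reference assignment "dst = src": returns the zval that dst should
 * point at, or NULL on allocation failure. */
static asg_zval *asg_assign_by_value(asg_zval *src)
{
    asg_zval *copy;
    assert(src != NULL);
    if (!src->is_ref) {
        /* Ordinary case: share the value and count one more user. */
        src->refcount++;
        return src;
    }
    /* src belongs to a reference set, so dst must not join it:
     * give dst its own non-reference copy. */
    copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    copy->lval = src->lval;
    copy->refcount = 1;
    copy->is_ref = 0;
    return copy;
}
```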


1.1.5 Parameter passing

PHP function parameters are passed in the same way as variables are assigned: passing by value is equivalent to a non-reference assignment and passing by reference is equivalent to a reference assignment, so parameter passing can also trigger a zval state switch. This comes up again later.

1.2 HashTable structure

HashTable is the most important and most widely used data structure in the Zend engine; it is used to store almost everything.

1.2.1 Data structure

The HashTable data structure is defined as follows:
    typedef struct bucket {
        ulong h;                      /* stores the hash */
        uint nKeyLength;
        void *pData;                  /* points to the value, a copy of the user's data */
        void *pDataPtr;
        struct bucket *pListNext;     /* pListNext and pListLast form the doubly */
        struct bucket *pListLast;     /* linked list over the whole HashTable */
        struct bucket *pNext;         /* pNext and pLast form the doubly linked */
        struct bucket *pLast;         /* list of one hash slot */
        char arKey[1];                /* key */
    } Bucket;

    typedef struct _hashtable {
        uint nTableSize;
        uint nTableMask;
        uint nNumOfElements;
        ulong nNextFreeElement;
        Bucket *pInternalPointer;     /* used for element traversal */
        Bucket *pListHead;
        Bucket *pListTail;
        Bucket **arBuckets;           /* hash array */
        dtor_func_t pDestructor;      /* set when the HashTable is initialized; called when a Bucket is destroyed */
        zend_bool persistent;         /* whether to use C's memory allocation routines */
        unsigned char nApplyCount;
        zend_bool bApplyProtection;
    #if ZEND_DEBUG
        int inconsistent;
    #endif
    } HashTable;

In general, Zend's HashTable is a chained hash that is also optimized for linear traversal.

The HashTable contains two data structures, a chained hash and a doubly linked list. The former provides fast key-value lookups and the latter supports linear traversal and sorting; each Bucket belongs to both structures at once.
A few notes on this data structure:
Why is a doubly linked list used in the chained hash?
An ordinary chained hash only needs to operate by key, so a singly linked list is enough. However, Zend sometimes needs to remove a given Bucket from the chained hash, and a doubly linked list makes that very efficient.
What is nTableMask for?
This value is used to convert a hash value into an index into the arBuckets array. When initializing a HashTable, Zend first allocates an arBuckets array of nTableSize entries, where nTableSize is the smallest power of two, 2^n, that is not less than the size requested by the user; in binary it has the form 10...0. nTableMask = nTableSize - 1, which in binary is 01...1, so h & nTableMask always falls in [0, nTableSize - 1], and Zend uses it as the index into arBuckets.
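
The size and mask arithmetic can be sketched as follows. The function names are hypothetical, and the minimum size of 8 is an assumption of this sketch rather than something stated in the text.

```c
#include <assert.h>

/* Smallest power of two that is not less than requested.
 * The floor of 8 is an assumption made for this sketch. */
static unsigned int ht_table_size(unsigned int requested)
{
    unsigned int size = 8;
    while (size < requested && size < 0x80000000u) {
        size <<= 1;
    }
    return size;
}

/* Map a hash value to a slot index using the mask table_size - 1. */
static unsigned int ht_slot(unsigned long h, unsigned int table_size)
{
    /* table_size must be a power of two for the mask trick to work */
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    return (unsigned int)(h & (table_size - 1));
}
```

Because the size is a power of two, the mask keeps exactly the low n bits of the hash, which gives the same result as h % nTableSize without a division.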
What is pDataPtr for?
Normally, when a user inserts a key-value pair, Zend copies the value and points pData at the copy. The copy requires a call to Zend's internal routine emalloc to allocate memory, which is a time-consuming operation and consumes a block somewhat larger than the value itself (the extra memory holds the allocator's cookie). That is wasteful when the value is small. Since HashTable is often used to store pointer values, Zend introduces pDataPtr: when the value is as small as a pointer, Zend copies it directly into pDataPtr and points pData at pDataPtr. This avoids the emalloc call and also helps the cache hit rate.
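
A sketch of this small-value path, assuming a reduced bucket with only the two data members and using malloc in place of emalloc. ptr_bucket, ptr_bucket_store and ptr_bucket_clear are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *pData;      /* always points at the stored value */
    void *pDataPtr;   /* inline storage for pointer-sized values */
} ptr_bucket;

/* Store a copy of value (size bytes). Returns 0 on success, -1 on failure. */
static int ptr_bucket_store(ptr_bucket *b, const void *value, size_t size)
{
    assert(b != NULL && value != NULL && size > 0);
    if (size == sizeof(void *)) {
        /* Pointer-sized: keep it inside the bucket, no allocation. */
        memcpy(&b->pDataPtr, value, sizeof(void *));
        b->pData = &b->pDataPtr;
        return 0;
    }
    /* Anything else: allocate a separate copy. */
    b->pData = malloc(size);
    if (b->pData == NULL) {
        return -1;
    }
    memcpy(b->pData, value, size);
    b->pDataPtr = NULL;
    return 0;
}

/* Release the value; only the separately allocated case needs free(). */
static void ptr_bucket_clear(ptr_bucket *b)
{
    assert(b != NULL);
    if (b->pData != (void *)&b->pDataPtr) {
        free(b->pData);
    }
    b->pData = NULL;
    b->pDataPtr = NULL;
}
```

Callers read the value through pData in both cases, so the optimization stays invisible outside the bucket.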
Why is the size of arKey only 1? Why not manage the key through a pointer?
arKey is the array that holds the key, but its declared size is only 1, which is clearly not enough for a key. The following code can be found in the HashTable function that inserts an element:
    p = (Bucket *) pemalloc(sizeof(Bucket) - 1 + nKeyLength, ht->persistent);
As you can see, Zend allocates for each Bucket a block of memory large enough to hold both the Bucket itself and its key.
The upper part is the Bucket and the lower part is the key, and arKey "happens" to be the last member of the Bucket, so the key can be accessed through arKey. This technique is most common in memory management routines: when memory is allocated, the block is actually larger than the requested size, and the extra part at the top, often called the cookie, stores information about the block, such as its size and pointers to the previous and next blocks. Baidu's transmit program uses this method.
The key is not managed through a pointer in order to save an emalloc call and to improve the cache hit rate. Another essential reason is that a key is fixed in the vast majority of cases, so the whole Bucket never has to be reallocated because a key grew longer. This also explains why the value is not allocated inline as an array in the same way: values are mutable.
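
The single-allocation layout can be sketched as follows. As an assumption of this sketch, it uses a C99 flexible array member (arKey[]) instead of the arKey[1] declaration, which expresses the same layout in standard C; it also uses malloc in place of pemalloc, and key_len counts the terminating NUL. key_bucket and key_bucket_new are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct key_bucket {
    unsigned long h;
    unsigned int nKeyLength;
    char arKey[];            /* must stay the last member */
} key_bucket;

/* Allocate the bucket header and its key in one block. */
static key_bucket *key_bucket_new(const char *key, unsigned int key_len,
                                  unsigned long h)
{
    key_bucket *p;
    assert(key != NULL && key_len > 0);
    p = malloc(sizeof(key_bucket) + key_len);
    if (p == NULL) {
        return NULL;
    }
    p->h = h;
    p->nKeyLength = key_len;
    memcpy(p->arKey, key, key_len);   /* key bytes live right after the header */
    return p;
}
```

One call to free() releases both the header and the key, and a lookup that compares keys touches memory adjacent to the fields it has just read.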
1.2.2 PHP arrays
One question about HashTable is still unanswered: what is nNextFreeElement for?
Unlike an ordinary hash, Zend's HashTable lets the user specify the hash value directly, ignoring the key, or even without giving a key at all (in which case nKeyLength is 0). HashTable also supports an append operation in which the user supplies only the value, not even a hash value; Zend then uses nNextFreeElement as the hash and increments nNextFreeElement.
This behavior looks strange, because a value that cannot be looked up by key hardly belongs in a hash at all. The key to understanding it is that PHP arrays are implemented with HashTable. An associative array adds elements through the normal key-value mapping, where the key is the string given by the user. A non-associative array uses the array subscript directly as the hash value, and no key exists. When associative and non-associative entries are mixed in one array, or when array_push is used, nNextFreeElement is needed.
Now for the values. A PHP array uses the general-purpose zval structure for its values: pData points to a zval*, and, as described in the previous section, that zval* is stored directly in pDataPtr. Because zval is used directly, the elements of an array can be of any PHP type.
Array traversal operations such as foreach and each go through the HashTable's doubly linked list, with pInternalPointer serving as a cursor that records the current position.
1.2.3 Variable symbol tables
Besides arrays, HashTable is also used to store many other kinds of data, such as PHP functions, variable symbols, loaded modules and class members.
A variable symbol table is equivalent to an associative array whose keys are variable names (so very long variable names are not a good idea) and whose values are zval*.
At any moment, PHP code can see two variable symbol tables, symbol_table and active_symbol_table. The former stores global variables and is called the global symbol table. The latter is a pointer to the currently active variable symbol table, which is usually the global symbol table. However, each time a PHP function is entered (meaning a function the user wrote in PHP code), Zend creates a symbol table local to that function and points active_symbol_table at it. Zend always accesses variables through active_symbol_table, which is how local variable scope is enforced.
However, if a function accesses a variable it has marked as global, Zend handles it specially: it creates in active_symbol_table a reference to the variable of the same name in symbol_table, first creating that variable in symbol_table if it does not exist.
1.3 Memory and files
The resources a program owns generally include memory and files. For ordinary programs these resources are scoped to the process: when the process ends, the operating system or the C library automatically reclaims whatever was not released explicitly.
PHP programs are different because they are page-based. A page also acquires memory and files while it runs, but when the page finishes, the operating system or the C library may not know that those resources need to be reclaimed. For example, suppose PHP is compiled into Apache as a module and Apache runs in prefork or worker mode. The Apache process or thread is then reused, and memory allocated by a PHP page would stay resident until the process crashes.
To solve this problem, Zend provides a set of memory allocation APIs. They work in the same way as the corresponding C functions, except that they allocate from Zend's own memory pool and support automatic per-page reclamation. In a module, memory allocated for a page should come from these APIs rather than the C routines; otherwise Zend will try to efree that memory at the end of the page, which usually results in a crash.
emalloc()
efree()
estrdup()
estrndup()
ecalloc()
erealloc()
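
The idea of page-scoped allocation can be sketched with a tiny pool that tracks every block it hands out and frees them all at once. This is not how Zend's allocator is implemented; req_pool, req_alloc and req_shutdown are hypothetical names, and the sketch omits an efree-style call for releasing a single block early.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every block; the union keeps the user data
 * that follows it suitably aligned. */
typedef union req_block {
    union req_block *next;
    long long align_ll;
    long double align_ld;
} req_block;

typedef struct {
    req_block *head;   /* all blocks allocated for the current page */
    size_t live;       /* number of blocks not yet reclaimed */
} req_pool;

/* Allocate size bytes that belong to the current page. */
static void *req_alloc(req_pool *pool, size_t size)
{
    req_block *b;
    assert(pool != NULL);
    b = malloc(sizeof(req_block) + size);
    if (b == NULL) {
        return NULL;
    }
    b->next = pool->head;   /* remember the block so it can be reclaimed */
    pool->head = b;
    pool->live++;
    return b + 1;           /* user data starts right after the header */
}

/* End of page: reclaim everything the page forgot to release. */
static void req_shutdown(req_pool *pool)
{
    assert(pool != NULL);
    while (pool->head != NULL) {
        req_block *next = pool->head->next;
        free(pool->head);
        pool->head = next;
        pool->live--;
    }
}
```

The sketch also shows why the two families must not be mixed: a block obtained from plain malloc() has no header in front of it, so a pool that tried to reclaim it would corrupt memory.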
In addition, Zend provides a set of macros of the form VCWD_xxx that replace the corresponding file APIs of the C library and the operating system. These macros support PHP's virtual working directory and should always be used in module code. See "TSRM/tsrm_virtual_cwd.h" in the PHP source for their definitions. You may notice that none of these macros provides a close operation. That is because close operates on an already opened resource and involves no file path, so the C or operating system routine can be used directly. The same applies to operations such as read and write.
