Why you should understand PHP's internal HashTable structure
I. Understanding HashTable
1. Definition of HashTable
A hash table is an array in which a specified hash function, hash(key), maps each key to a record in the table. Here hash can be any function, for example MD5, CRC32, SHA1, or a custom function.
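As a concrete illustration of hash(key) mapping to a record position, here is a minimal sketch in C. It uses the well-known DJB "times 33" string hash; the Zend engine uses a hash from the same family, but the function names djb_hash and slot_for are hypothetical and this is not the engine's code.

```c
#include <stddef.h>

/* DJB "times 33" string hash: h = h * 33 + c, starting from 5381. */
unsigned long djb_hash(const char *key, size_t len)
{
    unsigned long h = 5381UL;
    for (size_t i = 0; i < len; i++) {
        h = h * 33UL + (unsigned char)key[i];
    }
    return h;
}

/* Map a key to a slot index in a table with table_size slots
 * (table_size must be greater than zero). */
size_t slot_for(const char *key, size_t len, size_t table_size)
{
    return (size_t)(djb_hash(key, len) % table_size);
}
```

Calling slot_for("a", 1, 8) hashes the one-byte key and reduces the result modulo 8, so the answer is always a valid index into an 8-slot array.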
2. Performance of HashTable
A hash table is a very high-performance data structure, and many languages implement one internally. Ideally, a lookup is O(1): the cost is dominated by computing hash(key), which leads directly to the record in the table. In practice, however, it often happens that key1 != key2 but hash(key1) == hash(key2). This is a hash collision; the lower the probability of collisions, the better the table performs. An overly complex hash function also hurts performance, because it is computed on every access.
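One common way to cope with collisions is separate chaining: every slot holds a linked list of the entries whose keys hash to it. The following is a minimal sketch under that assumption; the names Table, table_put, and table_get are hypothetical, and the table has a fixed size and never frees its entries, to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_SLOTS 8

typedef struct Entry {
    char *key;
    int value;
    struct Entry *next;   /* next entry in the same slot (collision chain) */
} Entry;

typedef struct {
    Entry *slots[NUM_SLOTS];
} Table;

static unsigned long hash_key(const char *key)
{
    unsigned long h = 5381UL;
    while (*key) {
        h = h * 33UL + (unsigned char)*key++;
    }
    return h;
}

/* Insert or update a key; returns 0 on success, -1 on allocation failure. */
int table_put(Table *t, const char *key, int value)
{
    size_t i = hash_key(key) % NUM_SLOTS;
    for (Entry *e = t->slots[i]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) {   /* same key: overwrite */
            e->value = value;
            return 0;
        }
    }
    Entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->key = malloc(strlen(key) + 1);
    if (!e->key) { free(e); return -1; }
    strcpy(e->key, key);
    e->value = value;
    e->next = t->slots[i];                /* push onto the slot's chain */
    t->slots[i] = e;
    return 0;
}

/* Look up a key; returns 1 and stores the value in *out if found, else 0. */
int table_get(const Table *t, const char *key, int *out)
{
    size_t i = hash_key(key) % NUM_SLOTS;
    for (const Entry *e = t->slots[i]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            *out = e->value;
            return 1;
        }
    }
    return 0;
}
```

With more keys than slots, some chains necessarily hold several entries, and each lookup in such a chain costs one string comparison per entry walked. That is why a low collision rate matters for performance.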
3. Applications of HashTable
The PHP kernel also implements a HashTable and uses it widely, for thread safety, variable storage, resource management, and more; it appears almost everywhere in the engine. Arrays and classes, which PHP scripts use heavily, are built on it as well. The following focuses on the application of HashTable to arrays, variables, functions, and classes.
II. Application of HashTable in arrays
Most of PHP's functionality is implemented through HashTable, arrays included. HashTable combines the advantages of a doubly linked list with operation performance comparable to that of a plain array. Variables defined in PHP are stored in a symbol table, which is itself a HashTable whose elements are of type zval*. The containers that hold user-defined functions, classes, resources, and so on are likewise implemented as HashTables in the kernel.
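To make the combination of hash lookup and doubly linked list concrete, here is a minimal sketch in C. Each node sits in a hash slot chain for fast lookup and also in a doubly linked list that records insertion order. All names (OrderedTable, ordered_add, ordered_find) are hypothetical, and the layout is a simplification rather than the engine's actual structure.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_SLOTS 8

typedef struct Node {
    const char *key;          /* not copied: the caller keeps the string alive */
    int value;
    struct Node *chain_next;  /* next node in the same hash slot */
    struct Node *list_prev;   /* previous node in insertion order */
    struct Node *list_next;   /* next node in insertion order */
} Node;

typedef struct {
    Node *slots[NUM_SLOTS];
    Node *head;               /* first inserted node */
    Node *tail;               /* last inserted node */
} OrderedTable;

static size_t slot_of(const char *key)
{
    unsigned long h = 5381UL;
    while (*key) {
        h = h * 33UL + (unsigned char)*key++;
    }
    return (size_t)(h % NUM_SLOTS);
}

/* Append a key that is assumed not to be present yet.
 * Returns 0 on success, -1 on allocation failure. */
int ordered_add(OrderedTable *t, const char *key, int value)
{
    Node *n = malloc(sizeof *n);
    if (!n) return -1;
    size_t i = slot_of(key);
    n->key = key;
    n->value = value;
    n->chain_next = t->slots[i];   /* link into the hash slot */
    t->slots[i] = n;
    n->list_prev = t->tail;        /* link at the end of the ordered list */
    n->list_next = NULL;
    if (t->tail) {
        t->tail->list_next = n;
    } else {
        t->head = n;
    }
    t->tail = n;
    return 0;
}

/* Find a node by key through its hash slot; returns NULL if absent. */
Node *ordered_find(const OrderedTable *t, const char *key)
{
    for (Node *n = t->slots[slot_of(key)]; n; n = n->chain_next) {
        if (strcmp(n->key, key) == 0) return n;
    }
    return NULL;
}
```

Walking from head through list_next visits the elements in the order they were added, which is the behavior a PHP foreach over an array shows, while ordered_find still goes straight to the right slot.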
The following defines an array in PHP:

    $array = array();
    $array["key"] = "value";
In the kernel, the same thing is done with macros:

    zval *array;
    array_init(array);
    add_assoc_string(array, "key", "value", 1);
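Expanding the macros in the preceding code gives:

    zval *array;
    ALLOC_INIT_ZVAL(array);
    Z_TYPE_P(array) = IS_ARRAY;

    HashTable *h;
    ALLOC_HASHTABLE(h);
    Z_ARRVAL_P(array) = h;
    /* N is the initial size hint for the table */
    zend_hash_init(h, N, NULL, ZVAL_PTR_DTOR, 0);

    zval *barZval;
    MAKE_STD_ZVAL(barZval);
    /* 1 = duplicate the string literal, matching add_assoc_string(..., 1) above */
    ZVAL_STRING(barZval, "value", 1);
    /* the key length 4 includes the terminating NUL of "key" */
    zend_hash_add(h, "key", 4, &barZval, sizeof(zval*), NULL);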
The code above shows how HashTable is used for arrays: an array in the PHP kernel is in fact implemented by a HashTable. Once the array is initialized, the next step is to add elements to it. Because PHP has many variable types, there are several families of functions, add_assoc_*(), add_index_*(), and add_next_index_*(), which correspond to the ways we add elements to an array when programming in PHP. add_assoc_*() adds an element under a given string key (key => value form), add_index_*() adds an element under a numeric key, and add_next_index_*() adds an element without specifying a key. Arrays can also hold PHP variables of compound types, such as resources, objects, and other arrays. The following example illustrates their usage:
    ZEND_FUNCTION(sample_array)
    {
        zval *subarray;

        array_init(return_value);

        /* Add some scalars */
        add_assoc_long(return_value, "Life", 42);
        add_index_bool(return_value, 123, 1);
        add_next_index_double(return_value, 3.1415926535);

        /* Toss in a static string, dup'd by PHP */
        add_next_index_string(return_value, "Foo", 1);

        /* Now a manually dup'd string */
        add_next_index_string(return_value, estrdup("Bar"), 0);

        /* Create a subarray */
        MAKE_STD_ZVAL(subarray);
        array_init(subarray);

        /* Populate it with some numbers */
        add_next_index_long(subarray, 1);
        add_next_index_long(subarray, 20);
        add_next_index_long(subarray, 300);

        /* Place the subarray in the parent */
        add_index_zval(return_value, 444, subarray);
    }
If user code now calls var_dump() on this function's return value, the output will be:

    array(6) {
      ["Life"]=> int(42)
      [123]=> bool(true)
      [124]=> float(3.1415926535)
      [125]=> string(3) "Foo"
      [126]=> string(3) "Bar"
      [444]=> array(3) {
        [0]=> int(1)
        [1]=> int(20)
        [2]=> int(300)
      }
    }
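The integer keys 124, 125, and 126 in this output follow from a simple rule: an append uses the smallest index above every integer key used so far, and string keys do not affect it. Below is a minimal sketch of that rule in C. The names IndexState, use_index, and use_next_index are hypothetical, and the code models only the key bookkeeping, not element storage.

```c
#include <limits.h>

typedef struct {
    long next_free;   /* integer key that the next append will use */
} IndexState;

/* Record that an explicit integer key was used (add_index_* style).
 * Returns the key that the element is stored under. */
long use_index(IndexState *s, long key)
{
    if (key >= s->next_free) {
        /* guard against overflow when key is already LONG_MAX */
        s->next_free = (key < LONG_MAX) ? key + 1 : LONG_MAX;
    }
    return key;
}

/* Pick the key for an append (add_next_index_* style).
 * This sketch does not handle running past LONG_MAX. */
long use_next_index(IndexState *s)
{
    return s->next_free++;
}
```

Starting from next_free = 0, using index 123 and then appending three times yields 124, 125, and 126; a fresh state used only for appends yields 0, 1, 2, as in the subarray.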
III. The variable symbol table (application to variables)
The previous section covered how HashTable is used for arrays; now let's look at how it is used for variables. Two questions need answering. First, a variable name always appears together with a variable value, so how is the pair stored? Second, a variable has a life cycle, so how is that implemented?
At any moment, PHP code can see two variable symbol tables: symbol_table and active_symbol_table. The former stores global variables and is called the global symbol table. The latter is a pointer to the currently active symbol table, which is usually the global one. However, each time execution enters a PHP function (meaning a function the user created in PHP code), Zend creates a symbol table local to that function and points active_symbol_table at it. Zend always accesses variables through active_symbol_table, which is how the scope of local variables is enforced.
However, if a function accesses a variable marked as global, Zend handles it specially: it creates in active_symbol_table a reference to the variable of the same name in symbol_table, first creating that variable in symbol_table if it does not exist yet.
    struct _zend_executor_globals {
        // ... omitted ...
        HashTable symbol_table;          // symbol table for global variables
        HashTable *active_symbol_table;  // symbol table for local variables
        // ... omitted ...
    };
You can access the variable symbol tables through the EG macro: EG(symbol_table) accesses the symbol table of the global scope, and EG(active_symbol_table) accesses the symbol table of the current scope.
Consider the PHP statement $foo = "bar";. It simply creates the variable foo and assigns it the value bar, after which $foo can be used in the PHP code that follows. Now let's look at how a variable defined in PHP is implemented in the kernel. Pseudocode:
    zval *foo;
    MAKE_STD_ZVAL(foo);
    ZVAL_STRING(foo, "bar", 1);
    ZEND_SET_SYMBOL(EG(active_symbol_table), "foo", foo);
Step 1: Create a zval structure and set its type.
Step 2: Assign it the value bar.
Step 3: Add it to the symbol table of the current scope so that the user can use this variable in PHP.
Note: when a PHP script runs, user global variables (variables explicitly defined in user space) are stored in a symbol table of type HashTable (symbol_table). PHP also has some special global variables, such as $_GET, $_POST, and $_SERVER. We never define these in our programs, yet they are stored in the symbol table too. From this it is not hard to conclude that PHP adds these special variables to the symbol table before the script starts running.
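The scope switching described in this section can be sketched in plain C. This is a simplified model under stated assumptions: a tiny array with linear search stands in for the HashTable, values are plain ints instead of zvals, and the names ExecutorGlobals, sym_lookup, sym_set, enter_function, and leave_function are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_VARS 16

/* Stand-in for a HashTable: a small name -> value list with linear search. */
typedef struct {
    const char *names[MAX_VARS];
    int values[MAX_VARS];
    int count;
} SymbolTable;

typedef struct {
    SymbolTable symbol_table;          /* global variables */
    SymbolTable *active_symbol_table;  /* table of the current scope */
} ExecutorGlobals;

/* Return a pointer to the variable's value, or NULL if it is not defined. */
int *sym_lookup(SymbolTable *t, const char *name)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) return &t->values[i];
    }
    return NULL;
}

/* Define or overwrite a variable; returns 0 on success, -1 if the table is full. */
int sym_set(SymbolTable *t, const char *name, int value)
{
    int *slot = sym_lookup(t, name);
    if (slot) {
        *slot = value;
        return 0;
    }
    if (t->count == MAX_VARS) return -1;
    t->names[t->count] = name;
    t->values[t->count] = value;
    t->count++;
    return 0;
}

/* Entering a function: make `local` the active table and return the previous one. */
SymbolTable *enter_function(ExecutorGlobals *eg, SymbolTable *local)
{
    SymbolTable *previous = eg->active_symbol_table;
    local->count = 0;                  /* a function starts with no locals */
    eg->active_symbol_table = local;
    return previous;
}

/* Leaving a function: restore the caller's table. */
void leave_function(ExecutorGlobals *eg, SymbolTable *previous)
{
    eg->active_symbol_table = previous;
}
```

Because every lookup goes through active_symbol_table, a variable x set inside the function is invisible once leave_function restores the previous table, and a global x is invisible inside the function unless it is explicitly linked in.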
IV. Application of HashTable in classes
Like functions, classes can be implemented internally by PHP built-ins and PHP extensions, or defined by the user in PHP code. When writing code, of course, we usually define our own.
We define a class with the class keyword followed by the class name, which can be any name that is not a PHP reserved word. The class name is followed by a pair of curly braces enclosing the body of the class. The body includes the properties of the class, which abstract the state of an object; they can hold any data type PHP supports, including objects, and are often called member variables. Besides properties, the body also includes the operations of the class, which abstract the behavior of an object; each consists of an operation name and the method that implements it, and they are commonly called member methods or member functions. Here is the code for an example class:
    class ParentClass {
    }

    interface Ifce {
        public function iMethod();
    }

    final class Tipi extends ParentClass implements Ifce {
        public static $sa = 'AAA';
        const CA = 'BBB';

        public function __construct() {
        }

        public function iMethod() {
        }

        private function _access() {
        }

        public static function access() {
        }
    }
This defines a parent class ParentClass, an interface Ifce, and a subclass Tipi. The subclass extends ParentClass, implements the interface Ifce, and has a static variable $sa, a class constant CA, a public method, a private method, and a public static method. How are these structures implemented inside the Zend engine? Let's look at the internal storage structure of a class:
    struct _zend_class_entry {
        char type;                          // type: ZEND_INTERNAL_CLASS / ZEND_USER_CLASS
        char *name;                         // class name
        zend_uint name_length;              // i.e. sizeof(name) - 1
        struct _zend_class_entry *parent;   // the inherited parent class
        int refcount;                       // reference count
        zend_bool constants_updated;

        zend_uint ce_flags;
        // ZEND_ACC_IMPLICIT_ABSTRACT_CLASS: the class contains an abstract method
        // ZEND_ACC_EXPLICIT_ABSTRACT_CLASS: the abstract keyword precedes the class name
        // ZEND_ACC_FINAL_CLASS
        // ZEND_ACC_INTERFACE

        HashTable function_table;           // methods
        HashTable default_properties;       // default properties
        HashTable properties_info;          // property information
        HashTable default_static_members;   // static variables of the class itself
        HashTable *static_members;
        // type == ZEND_USER_CLASS: points to &default_static_members
        // type == ZEND_INTERNAL_CLASS: set to NULL
        HashTable constants_table;          // constants
        struct _zend_function_entry *builtin_functions;  // method definition entries

        union _zend_function *constructor;
        union _zend_function *destructor;
        union _zend_function *clone;

        /* magic methods */
        union _zend_function *__get;
        union _zend_function *__set;
        union _zend_function *__unset;
        union _zend_function *__isset;
        union _zend_function *__call;
        union _zend_function *__tostring;
        union _zend_function *serialize_func;
        union _zend_function *unserialize_func;
        zend_class_iterator_funcs iterator_funcs;  // iteration

        /* class handlers */
        zend_object_value (*create_object)(zend_class_entry *class_type TSRMLS_DC);
        zend_object_iterator *(*get_iterator)(zend_class_entry *ce, zval *object,
            int by_ref TSRMLS_DC);

        /* interface hook for class declaration */
        int (*interface_gets_implemented)(zend_class_entry *iface,
            zend_class_entry *class_type TSRMLS_DC);

        /* serialization callback function pointers */
        int (*serialize)(zval *object, unsigned char **buffer, zend_uint *buf_len,
            zend_serialize_data *data TSRMLS_DC);
        int (*unserialize)(zval **object, zend_class_entry *ce, const unsigned char *buf,
            zend_uint buf_len, zend_unserialize_data *data TSRMLS_DC);

        zend_class_entry **interfaces;      // interfaces the class implements
        zend_uint num_interfaces;           // number of interfaces the class implements

        char *filename;                     // file where the class is stored (absolute path)
        zend_uint line_start;               // start line of the class definition
        zend_uint line_end;                 // end line of the class definition
        char *doc_comment;
        zend_uint doc_comment_len;

        struct _zend_module_entry *module;  // module the class belongs to: EG(current_module)
    };
As we can see, the class implementation uses HashTable to store information about the class: its methods, properties, static members, and constants are all kept in HashTables.
The sections above listed several places where HashTable is applied in PHP. It is used throughout the PHP kernel code, so understanding how HashTable is implemented is a great help in understanding PHP in depth.