52-Using the hash table API

Source: Internet
Author: User

Zend groups the HashTable-related APIs into several categories so that they are easier to find, and most of these APIs return one of the constants SUCCESS or FAILURE.

Create Hashtable

The function prototypes below use ht as the parameter name. When writing an extension, however, avoid this name for your own variables: some PHP macros expand to a declaration of a variable with that name, which causes a naming conflict.

Creating and initializing a HashTable is simple: use the zend_hash_init() function, which is defined as follows:

    int zend_hash_init(
        HashTable *ht,
        uint nSize,
        hash_func_t pHashFunction,
        dtor_func_t pDestructor,
        zend_bool persistent
    );

ht is a pointer to a HashTable. We can take the address of an existing HashTable variable with &, or request a block of memory directly with emalloc(), pemalloc() and similar functions, but the most common approach is the ALLOC_HASHTABLE(ht) macro, which lets the kernel do this for us. ALLOC_HASHTABLE(ht) is equivalent to ht = emalloc(sizeof(HashTable));

nSize is the number of elements this HashTable is expected to hold. A HashTable can contain any number of elements; this value only reserves memory in advance, which improves performance and avoids rehash operations. As elements are added, the table grows automatically when needed. Interestingly, this value is always a power of 2: if you pass a value that is not a power of 2, it is rounded up to the smallest power of 2 that is not less than it. It is calculated like this: nSize = pow(2, ceil(log(nSize, 2)));

pHashFunction is a parameter left over from earlier versions of the Zend engine. It was kept for compatibility but is no longer used, so we simply pass NULL. Originally it was a hook that let the user supply a hash function to replace PHP's default DJBX33A implementation.
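As an illustration of the DJBX33A ("times 33, add") algorithm mentioned above, here is a minimal sketch in its textbook form. The function name djbx33a is hypothetical and this is not the engine's source; the real implementation may differ in details such as loop unrolling.

```c
#include <stddef.h>

/* Textbook DJBX33A: start from 5381, then for every byte of the key
 * multiply the running hash by 33 and add the byte. */
unsigned long djbx33a(const char *key, size_t len)
{
    unsigned long hash = 5381UL;
    size_t i;

    for (i = 0; i < len; i++) {
        hash = hash * 33UL + (unsigned char)key[i];
    }
    return hash;
}
```

For an empty key the function returns the seed 5381 unchanged, and every additional byte changes the result, which is why the same key always lands in the same bucket.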

pDestructor is also a callback. It is called when an element of the HashTable is deleted or overwritten, and its prototype must be: void method_name(void *pElement); pElement points to the data being deleted or replaced in the HashTable, and that data is itself often a pointer.

persistent is the last parameter, and its meaning is simple. If it is true, the HashTable stays in memory and is not automatically freed during the RSHUTDOWN phase. In that case the memory that the first parameter ht points to must have been allocated with pemalloc().

For example, the PHP kernel calls this function at the start of each request to initialize symbol_table.

    /* #define ZVAL_PTR_DTOR (void (*)(void *)) zval_ptr_dtor_wrapper */
    zend_hash_init(&EG(symbol_table), 50, NULL, ZVAL_PTR_DTOR, 0);

Since 50 is not an integer power of 2, it is rounded up to 64 when the function executes.
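The rounding of nSize described above can be sketched as a small loop. This is an illustration only: round_up_pow2 is a hypothetical name, and the engine may additionally enforce a minimum table size that this sketch does not model.

```c
/* Smallest power of two that is not less than n (1 for n == 0).
 * The result is capped at 2^31 so the shift cannot overflow. */
unsigned int round_up_pow2(unsigned int n)
{
    unsigned int size = 1;

    while (size < n && size < 0x80000000U) {
        size <<= 1;
    }
    return size;
}
```

With this sketch, a requested size of 50 becomes 64, matching the symbol_table example, while a value that is already a power of 2 is returned unchanged.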

Add and Modify

We have four commonly used functions to do this, and their prototypes are as follows:

    int zend_hash_add(
        HashTable *ht,      /* the HashTable to operate on */
        char *arKey,        /* the key, e.g. "my_key" */
        uint nKeyLen,       /* length of the string key including the terminating NUL, e.g. sizeof("my_key") */
        void *pData,        /* pointer to the data to insert; to store a pointer such as int *p, pass &p */
        uint nDataSize,
        void **pDest        /* if not NULL, receives the address of the stored element on success */
    );
    int zend_hash_update(
        HashTable *ht,
        char *arKey,
        uint nKeyLen,
        void *pData,
        uint nDataSize,
        void **pDest
    );
    int zend_hash_index_update(
        HashTable *ht,
        ulong h,
        void *pData,
        uint nDataSize,
        void **pDest
    );
    int zend_hash_next_index_insert(
        HashTable *ht,
        void *pData,
        uint nDataSize,
        void **pDest
    );

The first two functions add data with a string index to a HashTable, as we do in PHP with $foo['bar'] = 'baz'; In C this is:

    zend_hash_add(fooHashTbl, "bar", sizeof("bar"), &barZval, sizeof(zval*), NULL);

The only difference between zend_hash_add() and zend_hash_update() is that if the key already exists, zend_hash_add() returns FAILURE without modifying the original data.
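The add-versus-update rule can be shown with a toy table that is unrelated to the Zend implementation. Everything here (toy_table, toy_add, toy_update, the TOY_ constants) is hypothetical and exists only to illustrate the semantics described above.

```c
#include <string.h>

#define TOY_SUCCESS  0
#define TOY_FAILURE  (-1)
#define TOY_CAPACITY 8

typedef struct {
    const char *keys[TOY_CAPACITY];
    int values[TOY_CAPACITY];
    int used;
} toy_table;

/* Return the slot that holds key, or -1 if the key is absent. */
static int toy_find_slot(const toy_table *t, const char *key)
{
    int i;
    for (i = 0; i < t->used; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            return i;
        }
    }
    return -1;
}

/* "add": fails and leaves the old value alone if the key exists. */
int toy_add(toy_table *t, const char *key, int value)
{
    if (toy_find_slot(t, key) >= 0 || t->used == TOY_CAPACITY) {
        return TOY_FAILURE;
    }
    t->keys[t->used] = key;
    t->values[t->used] = value;
    t->used++;
    return TOY_SUCCESS;
}

/* "update": overwrites an existing key, inserts a missing one. */
int toy_update(toy_table *t, const char *key, int value)
{
    int slot = toy_find_slot(t, key);
    if (slot >= 0) {
        t->values[slot] = value;
        return TOY_SUCCESS;
    }
    return toy_add(t, key, value);
}
```

Calling toy_add() twice with the same key fails the second time and keeps the first value, whereas toy_update() replaces it.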

The next two functions add data with a numeric index to ht. zend_hash_next_index_insert() takes no index parameter; it computes the next numeric index itself.

If we want to know the numeric index that the next element would receive, we can use the zend_hash_next_free_element() function:

    ulong nextid = zend_hash_next_free_element(ht);
    zend_hash_index_update(ht, nextid, &data, sizeof(data), NULL);

In all of these functions, if pDest is not NULL, the kernel stores in *pDest the address of the element that was written into the HashTable. This parameter has the same function in the code that follows.


Because a HashTable has two kinds of index, two functions are needed to perform a find operation.

    int zend_hash_find(HashTable *ht, char *arKey, uint nKeyLength, void **pData);
    int zend_hash_index_find(HashTable *ht, ulong h, void **pData);

The first is used for PHP arrays with string indexes, and the second for PHP arrays with numeric indexes.

    void hash_sample(HashTable *ht, sample_data *data1)
    {
        sample_data *data2;
        ulong targetID = zend_hash_next_free_element(ht);

        if (zend_hash_index_update(ht, targetID,
                data1, sizeof(sample_data), NULL) == FAILURE) {
            /* Should never happen */
            return;
        }
        if (zend_hash_index_find(ht, targetID, (void **)&data2) == FAILURE) {
            /* Very unlikely since we just added this element */
            return;
        }
        /* data1 != data2, however *data1 == *data2 */
    }

In addition to reading, we also need to detect whether a key exists:

    int zend_hash_exists(HashTable *ht, char *arKey, uint nKeyLen);
    int zend_hash_index_exists(HashTable *ht, ulong h);

Unlike the find functions, these two do not return SUCCESS or FAILURE. They return 1 if the key exists and 0 if it does not, so the result should be tested directly rather than compared with SUCCESS (which is defined as 0):

    if (zend_hash_exists(EG(active_symbol_table), "foo", sizeof("foo"))) {
        /* $foo is set */
    } else {
        /* $foo does not exist */
    }

    ulong zend_get_hash_value(char *arKey, uint nKeyLen);

When we need to perform many operations on the same string key, such as checking whether it exists, then inserting it, then modifying it, we can use zend_get_hash_value() to speed things up. Its return value can be passed to the quick family of functions, so the hash of the string is computed once and reused instead of being recomputed on every call.

    int zend_hash_quick_add(
        HashTable *ht,
        char *arKey,
        uint nKeyLen,
        ulong hashval,
        void *pData,
        uint nDataSize,
        void **pDest
    );
    int zend_hash_quick_update(
        HashTable *ht,
        char *arKey,
        uint nKeyLen,
        ulong hashval,
        void *pData,
        uint nDataSize,
        void **pDest
    );
    int zend_hash_quick_find(
        HashTable *ht,
        char *arKey,
        uint nKeyLen,
        ulong hashval,
        void **pData
    );
    int zend_hash_quick_exists(
        HashTable *ht,
        char *arKey,
        uint nKeyLen,
        ulong hashval
    );

Surprisingly, there is no zend_hash_quick_del() function. The quick functions are used in situations like the following:

    void php_sample_hash_copy(HashTable *hta, HashTable *htb,
                              char *arKey, uint nKeyLen TSRMLS_DC)
    {
        ulong hashval = zend_get_hash_value(arKey, nKeyLen);
        zval **copyval;

        if (zend_hash_quick_find(hta, arKey, nKeyLen,
                hashval, (void**)&copyval) == FAILURE) {
            /* The key does not exist */
            return;
        }
        /* This zval is now used by another HashTable as well,
         * so increment its reference count. */
        (*copyval)->refcount__gc++;
        zend_hash_quick_update(htb, arKey, nKeyLen, hashval,
            copyval, sizeof(zval*), NULL);
    }

Copy and Merge

In PHP we often copy and merge arrays, so the HashTable that implements PHP arrays in C runs into these operations all the time. To simplify them, the kernel already provides the corresponding APIs.

    void zend_hash_copy(
        HashTable *target,
        HashTable *source,
        copy_ctor_func_t pCopyConstructor,
        void *tmp,
        uint size
    );

    • All elements in *source are copied to *target through the pCopyConstructor function. Taking PHP arrays as the example again, the pCopyConstructor hook lets us perform an extra action on each value as it is copied, such as incrementing its refcount. Elements in target whose index also exists in source are replaced, while the other elements of target are left intact.
    • The tmp parameter exists for compatibility with versions before PHP 4.0.3 and should now be NULL.
    • The size parameter is the size of each element; for PHP arrays it is sizeof(zval*).

    void zend_hash_merge(
        HashTable *target,
        HashTable *source,
        copy_ctor_func_t pCopyConstructor,
        void *tmp,
        uint size,
        int overwrite
    );

The only difference between zend_hash_merge() and zend_hash_copy() is the int overwrite parameter. When its value is non-zero, the two functions behave exactly the same; when it is 0, zend_hash_merge() does not replace a value whose index already exists in target.
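The effect of the overwrite flag can be illustrated with a toy fixed-size array. The type toy_array and the function toy_merge are hypothetical and do not reflect how the engine stores elements; the sketch only models which values survive a merge.

```c
#define SLOTS 4

typedef struct {
    int present[SLOTS];   /* 1 if the index holds a value */
    int value[SLOTS];
} toy_array;

/* Copy every element of source into target. When overwrite is 0,
 * indexes that already exist in target keep their old value. */
void toy_merge(toy_array *target, const toy_array *source, int overwrite)
{
    int i;

    for (i = 0; i < SLOTS; i++) {
        if (!source->present[i]) {
            continue;               /* nothing to copy for this index */
        }
        if (target->present[i] && !overwrite) {
            continue;               /* keep the existing target value */
        }
        target->present[i] = 1;
        target->value[i] = source->value[i];
    }
}
```

Merging with overwrite set to 0 only fills in indexes missing from target, while a non-zero overwrite makes source win on every shared index, which is the copy behaviour.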

    typedef zend_bool (*merge_checker_func_t)(HashTable *target_ht,
        void *source_data, zend_hash_key *hash_key, void *pParam);

    void zend_hash_merge_ex(
        HashTable *target,
        HashTable *source,
        copy_ctor_func_t pCopyConstructor,
        uint size,
        merge_checker_func_t pMergeSource,
        void *pParam
    );

This function is more involved. Compared with zend_hash_copy() it takes two additional parameters, pMergeSource and pParam; the pMergeSource callback lets us merge selectively instead of merging everything.


In PHP there are many ways to iterate over an array, and likewise there are many ways to traverse the HashTable behind it. One of the simplest is zend_hash_apply(), which resembles PHP's foreach statement: it receives a callback function and passes each element of the HashTable to it.

    typedef int (*apply_func_t)(void *pDest TSRMLS_DC);
    void zend_hash_apply(HashTable *ht, apply_func_t apply_func TSRMLS_DC);

Here is another kind of traversal function:

    typedef int (*apply_func_arg_t)(void *pDest, void *argument TSRMLS_DC);
    void zend_hash_apply_with_argument(HashTable *ht,
        apply_func_arg_t apply_func, void *data TSRMLS_DC);

This function lets you pass an extra value of any type to the callback through the void * argument during traversal, which is useful for custom operations.

These functions share a convention for the return value of the callback passed to them, described in the following table:

Return value of the callback function

Name                      Notes
ZEND_HASH_APPLY_KEEP      Ends the current iteration and moves on to the next element, like reaching the end of the loop body or hitting the continue keyword in a PHP foreach statement.
ZEND_HASH_APPLY_STOP      Stops the traversal, the same as the break keyword in a PHP foreach statement.
ZEND_HASH_APPLY_REMOVE    Deletes the current element and then continues with the next one. Equivalent in PHP to: unset($foo[$key]); continue;
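The convention in the table can be sketched with a toy apply loop over a plain int array. The names toy_apply and TOY_APPLY_* and their numeric values are assumptions made for this example, not the engine's definitions.

```c
#define TOY_APPLY_KEEP   0
#define TOY_APPLY_REMOVE 1
#define TOY_APPLY_STOP   2

typedef int (*toy_apply_func)(int *element);

/* Visit the elements in order, compacting the array in place when the
 * callback asks for removal. Returns the new element count. */
int toy_apply(int *items, int count, toy_apply_func func)
{
    int read, write = 0;

    for (read = 0; read < count; read++) {
        int result = func(&items[read]);

        if (result != TOY_APPLY_REMOVE) {
            items[write++] = items[read];   /* element survives */
        }
        if (result == TOY_APPLY_STOP) {
            read++;                         /* skip past the current one */
            break;
        }
    }
    /* After a STOP the unvisited tail is kept untouched. */
    for (; read < count; read++) {
        items[write++] = items[read];
    }
    return write;
}

/* Drop negative numbers, stop at the first zero, keep everything else. */
int drop_negatives_until_zero(int *element)
{
    if (*element < 0) {
        return TOY_APPLY_REMOVE;
    }
    if (*element == 0) {
        return TOY_APPLY_STOP;
    }
    return TOY_APPLY_KEEP;
}
```

Applied to {3, -1, 4, 0, -5, 9}, the callback removes -1, stops at 0, and never visits -5 or 9, so they stay in place.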

Let's take a look at a foreach loop in PHP:

    <?php
    foreach ($arr as $val) {
        echo "The value is: $val\n";
    }
    ?>

The equivalent callback function written in C is:

    int php_sample_print_zval(zval **val TSRMLS_DC)
    {
        /* Copy the zval so the original data is not modified */
        zval tmpcopy = **val;
        zval_copy_ctor(&tmpcopy);
        /* Reset the refcount and convert to a string */
        INIT_PZVAL(&tmpcopy);
        convert_to_string(&tmpcopy);
        /* Output */
        php_printf("The value is: ");
        PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
        php_printf("\n");
        /* Clean up the copy */
        zval_dtor(&tmpcopy);
        /* Continue with the next element */
        return ZEND_HASH_APPLY_KEEP;
    }

Traverse our HashTable:

    /* Given a HashTable named arrht whose elements are of type zval* */
    zend_hash_apply(arrht, php_sample_print_zval TSRMLS_CC);

The elements stored in a HashTable are not the final variables themselves but pointers to them, which is why the traversal callback above receives a parameter of type zval**.

    typedef int (*apply_func_args_t)(void *pDest,
        int num_args, va_list args, zend_hash_key *hash_key);
    void zend_hash_apply_with_arguments(HashTable *ht,
        apply_func_args_t apply_func, int numargs, ...);

To receive the index as well as the value, we must use this third form of zend_hash_apply. It corresponds to the following PHP code:

    <?php
    foreach ($arr as $key => $val) {
        echo "The value of $key is: $val\n";
    }
    ?>

To work with zend_hash_apply_with_arguments(), we need to make a small change to our callback function so that it accepts the index as a parameter:

    int php_sample_print_zval_and_key(zval **val, int num_args,
        va_list args, zend_hash_key *hash_key)
    {
        /* Copy the zval so the original data is not modified */
        zval tmpcopy = **val;
        /* tsrm_ls is needed by output functions */
        TSRMLS_FETCH();
        zval_copy_ctor(&tmpcopy);
        INIT_PZVAL(&tmpcopy);
        /* Convert to a string */
        convert_to_string(&tmpcopy);
        /* Output */
        php_printf("The value of ");
        if (hash_key->nKeyLength) {
            /* String key */
            PHPWRITE(hash_key->arKey, hash_key->nKeyLength);
        } else {
            /* Numeric key */
            php_printf("%ld", hash_key->h);
        }
        php_printf(" is: ");
        PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
        php_printf("\n");
        /* Clean up the copy */
        zval_dtor(&tmpcopy);
        /* continue; */
        return ZEND_HASH_APPLY_KEEP;
    }

To perform the traversal:

    zend_hash_apply_with_arguments(arrht, php_sample_print_zval_and_key, 0);

This function receives its extra arguments through C's variable-argument mechanism. This particular example requires no extra arguments; for information on extracting values from va_list args, see the documentation for va_start(), va_arg(), and va_end().
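For readers unfamiliar with va_list, here is a minimal sketch of a callback that does read extra arguments. It mirrors the shape of the API above on a plain int array; toy_apply_with_arguments and scale_and_sum are hypothetical names, and the sketch is independent of the Zend headers.

```c
#include <stdarg.h>

typedef int (*toy_apply_args_func)(int *element, int num_args, va_list args);

/* Call func once per element, giving it a freshly started argument list
 * each time so every call reads the extras from the beginning. */
void toy_apply_with_arguments(int *items, int count,
                              toy_apply_args_func func, int num_args, ...)
{
    int i;

    for (i = 0; i < count; i++) {
        va_list args;
        va_start(args, num_args);
        func(&items[i], num_args, args);
        va_end(args);
    }
}

/* Expects two extras: an int multiplier and an int * running total. */
int scale_and_sum(int *element, int num_args, va_list args)
{
    int multiplier;
    int *total;

    if (num_args != 2) {
        return 0;                   /* wrong call shape: do nothing */
    }
    multiplier = va_arg(args, int);
    total = va_arg(args, int *);
    *element *= multiplier;
    *total += *element;
    return 0;
}
```

The callback must read the extras in the same order and with the same types as the caller passed them, because va_arg performs no type checking.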

When we check whether hash_key is a string key or a numeric key, we test the nKeyLength member, not the arKey member. This is because the kernel sometimes leaves stale data in arKey, whereas nKeyLength is safe to rely on. This works even for an empty string index. For example, in $foo[""] = "Bar", the index is an empty string, but its length includes the terminating NUL character, so nKeyLength is 1.
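The length convention used throughout this article (key lengths count the terminating NUL) can be checked with plain C. The macro and function names below are hypothetical helpers for illustration.

```c
#include <string.h>

/* For a string literal, sizeof counts the terminating NUL,
 * which is the length the functions in this article expect. */
#define KEY_LEN_WITH_NUL(literal) (sizeof(literal))

/* For a char pointer, sizeof would give the pointer size instead,
 * so the length has to be computed as strlen + 1. */
unsigned int key_len_from_pointer(const char *key)
{
    return (unsigned int)strlen(key) + 1;
}
```

This is why the examples pass sizeof("bar") rather than strlen("bar"), and why an empty string key still has length 1.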

Traverse Hashtable forward

Sometimes we want to traverse an array without a callback function. For this purpose the kernel gives every HashTable an extra property: the internal pointer. Taking PHP arrays as the example again, the following functions manipulate the internal pointer of the corresponding HashTable: reset(), key(), current(), next(), prev(), each(), and end().

    <?php
        $arr = array('a'=>1, 'b'=>2, 'c'=>3);
        reset($arr);
        while (list($key, $val) = each($arr)) {
            /* Do something with $key and $val */
        }
        reset($arr);
        $firstkey = key($arr);
        $firstval = current($arr);
        $bval = next($arr);
        $cval = next($arr);
    ?>

The Zend kernel has a set of functions that operate on a HashTable in a similar way to the PHP functions above:

    /* reset() */
    void zend_hash_internal_pointer_reset(HashTable *ht);
    /* key() */
    int zend_hash_get_current_key(HashTable *ht,
        char **strIdx, uint *strIdxLen,
        ulong *numIdx, zend_bool duplicate);
    /* current() */
    int zend_hash_get_current_data(HashTable *ht, void **pData);
    /* next()/each() */
    int zend_hash_move_forward(HashTable *ht);
    /* prev() */
    int zend_hash_move_backwards(HashTable *ht);
    /* end() */
    void zend_hash_internal_pointer_end(HashTable *ht);
    /* Others */
    int zend_hash_get_current_key_type(HashTable *ht);
    int zend_hash_has_more_elements(HashTable *ht);

PHP's next(), prev(), and end() functions move the pointer and then return the current element by calling zend_hash_get_current_data(). each() is similar to next(), but it also uses the result of zend_hash_get_current_key() as part of its return value.

Now we implement the foreach above in another way:

    void php_sample_print_var_hash(HashTable *arrht)
    {
        for (zend_hash_internal_pointer_reset(arrht);
             zend_hash_has_more_elements(arrht) == SUCCESS;
             zend_hash_move_forward(arrht)) {
            char *key;
            uint keylen;
            ulong idx;
            int type;
            zval **ppzval, tmpcopy;

            type = zend_hash_get_current_key_ex(arrht, &key, &keylen,
                                                &idx, 0, NULL);
            if (zend_hash_get_current_data(arrht, (void**)&ppzval) == FAILURE) {
                /* Should never actually fail
                 * since the key is known to exist. */
                continue;
            }
            /* Copy the zval so the original data is not modified */
            tmpcopy = **ppzval;
            zval_copy_ctor(&tmpcopy);
            INIT_PZVAL(&tmpcopy);
            convert_to_string(&tmpcopy);
            /* Output */
            php_printf("The value of ");
            if (type == HASH_KEY_IS_STRING) {
                /* String key / associative */
                PHPWRITE(key, keylen);
            } else {
                /* Numeric key */
                php_printf("%ld", idx);
            }
            php_printf(" is: ");
            PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
            php_printf("\n");
            /* Toss out old copy */
            zval_dtor(&tmpcopy);
        }
    }

You should be able to follow the code above. The only thing not covered so far is the return value of the zend_hash_get_current_key() function, which is the type of the current key: HASH_KEY_IS_STRING, HASH_KEY_IS_LONG, or HASH_KEY_NON_EXISTANT when the pointer is past the end of the table.

When we traverse a HashTable this way, every traversal of the same table shares its single internal pointer, so nested or overlapping traversals can disturb one another. To avoid this, the _ex variants of these functions take an external position of type HashPosition:

    void php_sample_print_var_hash(HashTable *arrht)
    {
        HashPosition pos;

        for (zend_hash_internal_pointer_reset_ex(arrht, &pos);
             zend_hash_has_more_elements_ex(arrht, &pos) == SUCCESS;
             zend_hash_move_forward_ex(arrht, &pos)) {
            char *key;
            uint keylen;
            ulong idx;
            int type;
            zval **ppzval, tmpcopy;

            type = zend_hash_get_current_key_ex(arrht, &key, &keylen,
                                                &idx, 0, &pos);
            if (zend_hash_get_current_data_ex(arrht,
                    (void**)&ppzval, &pos) == FAILURE) {
                /* Should never actually fail
                 * since the key is known to exist. */
                continue;
            }
            /* Duplicate the zval so that
             * the original's contents are not destroyed */
            tmpcopy = **ppzval;
            zval_copy_ctor(&tmpcopy);
            /* Reset refcount & convert */
            INIT_PZVAL(&tmpcopy);
            convert_to_string(&tmpcopy);
            /* Output */
            php_printf("The value of ");
            if (type == HASH_KEY_IS_STRING) {
                /* String key / associative */
                PHPWRITE(key, keylen);
            } else {
                /* Numeric key */
                php_printf("%ld", idx);
            }
            php_printf(" is: ");
            PHPWRITE(Z_STRVAL(tmpcopy), Z_STRLEN(tmpcopy));
            php_printf("\n");
            /* Toss out old copy */
            zval_dtor(&tmpcopy);
        }
    }

With these very slight additions, the HashTable's true internal pointer is preserved in whatever state it was initially in on entering the function. When it comes to working with internal pointers of userspace variable HashTables (that is, arrays), this extra step will very likely make the difference between whether the scripter's code works as expected.
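The difference between the two traversal styles can be reduced to a toy list with a shared cursor. The type toy_list and both functions are hypothetical; the sketch only shows why an external position leaves the shared state untouched.

```c
typedef struct {
    int items[4];
    int count;
    int internal_pos;   /* shared cursor, like a table's internal pointer */
} toy_list;

/* Sums the items using the shared cursor, so the caller's cursor
 * position is lost once the function returns. */
int sum_with_internal_pointer(toy_list *list)
{
    int sum = 0;

    for (list->internal_pos = 0;
         list->internal_pos < list->count;
         list->internal_pos++) {
        sum += list->items[list->internal_pos];
    }
    return sum;
}

/* Sums the items using a position owned by this function, so the
 * list can even be passed as const. */
int sum_with_external_position(const toy_list *list)
{
    int pos, sum = 0;

    for (pos = 0; pos < list->count; pos++) {
        sum += list->items[pos];
    }
    return sum;
}
```

If a script had advanced the cursor to the third item before either function ran, only the external-position version would leave it there.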

zend_hash_del() and zend_hash_index_del() delete an element by string index and by numeric index respectively, and return SUCCESS or FAILURE to indicate whether the deletion succeeded. As described earlier, when an element is deleted, the HashTable's destructor callback is invoked.

    void zend_hash_clean(HashTable *ht);
    void zend_hash_destroy(HashTable *ht);

The former removes all elements from the HashTable, while the latter destroys the HashTable itself.

Now let's take a complete look at creating, populating, and destroying a HashTable.

    int sample_strvec_handler(int argc, char **argv TSRMLS_DC)
    {
        HashTable *ht;
        int i;

        /* Allocate memory */
        ALLOC_HASHTABLE(ht);
        /* Initialize */
        if (zend_hash_init(ht, argc, NULL, ZVAL_PTR_DTOR, 0) == FAILURE) {
            FREE_HASHTABLE(ht);
            return FAILURE;
        }
        /* Populate: store each string as a zval* */
        for (i = 0; i < argc; i++) {
            zval *value;
            MAKE_STD_ZVAL(value);
            ZVAL_STRING(value, argv[i], 1);
            if (zend_hash_next_index_insert(ht, (void**)&value,
                    sizeof(zval*), NULL) == FAILURE) {
                /* Silently skip failed additions */
                zval_ptr_dtor(&value);
            }
        }
        /* Do the work */
        process_hashtable(ht);
        /* Destroy the elements and the table's internal storage */
        zend_hash_destroy(ht);
        /* Free ht itself. zend_hash_destroy() does not do this because
         * ht need not have been allocated with ALLOC_HASHTABLE(); it may
         * be the address of an existing HashTable variable. */
        FREE_HASHTABLE(ht);
        return SUCCESS;
    }

Sort, compare

Many of the Zend APIs for HashTable operations require a callback function. First let's deal with comparing the elements in a HashTable:

    typedef int (*compare_func_t)(void *a, void *b TSRMLS_DC);

This is much like the callback required by PHP's usort() function: it compares the two values *a and *b and returns 1 if *a > *b, 0 if they are equal, and -1 otherwise. Here is the declaration of zend_hash_minmax(), which takes a callback of the type declared above:

    int zend_hash_minmax(HashTable *ht, compare_func_t compar,
                         int flag, void **pData TSRMLS_DC);

As its name suggests, this function finds the smallest or largest element in the HashTable. If flag == 0 the minimum is returned; otherwise the maximum is returned.

Let's use this function to find, among all functions available to the script, the ones whose names sort first and last (case-insensitively).

    /* First define a comparison function to serve as the
     * callback for zend_hash_minmax(). */
    int fname_compare(zend_function *a, zend_function *b TSRMLS_DC)
    {
        return strcasecmp(a->common.function_name, b->common.function_name);
    }

    void php_sample_funcname_sort(TSRMLS_D)
    {
        zend_function *fe;

        if (zend_hash_minmax(EG(function_table), fname_compare,
                0, (void **)&fe) == SUCCESS) {
            php_printf("Min function: %s\n", fe->common.function_name);
        }
        if (zend_hash_minmax(EG(function_table), fname_compare,
                1, (void **)&fe) == SUCCESS) {
            php_printf("Max function: %s\n", fe->common.function_name);
        }
    }
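The flag convention of a min/max search driven by a comparison callback can be sketched over a plain C array. toy_minmax and compare_ints are hypothetical names, and the sketch uses the standard qsort-style comparator signature rather than the Zend one.

```c
#include <stddef.h>

typedef int (*toy_compare_func)(const void *a, const void *b);

/* Return a pointer to the smallest element (flag == 0) or the largest
 * element (flag != 0) of an array, or NULL if the array is empty. */
const void *toy_minmax(const void *base, int count, size_t size,
                       toy_compare_func compar, int flag)
{
    const char *first = base;
    const char *best = first;
    int i;

    if (count <= 0) {
        return NULL;
    }
    for (i = 1; i < count; i++) {
        const char *candidate = first + (size_t)i * size;
        int c = compar(candidate, best);

        if (flag ? c > 0 : c < 0) {
            best = candidate;       /* candidate beats the current best */
        }
    }
    return best;
}

/* Three-way integer comparison: 1, 0 or -1. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}
```

The same comparator serves both directions; only the flag decides whether a positive or a negative comparison result replaces the current best.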

zend_hash_compare() also takes a comparison callback. It compares one HashTable as a whole with another: it returns 1 if the former is greater than the latter, 0 if they are equal, and -1 otherwise.

    int zend_hash_compare(HashTable *hta, HashTable *htb,
        compare_func_t compar, zend_bool ordered TSRMLS_DC);

By default it first compares the number of elements in each HashTable, and the one with more elements is the greater. If both have the same number of elements, the elements are then compared one by one, starting from the first. Another important API that needs a callback function is the sort function, whose callback has the following form:

    typedef void (*sort_func_t)(void **Buckets, size_t numBuckets,
        size_t sizBucket, compare_func_t comp TSRMLS_DC);

The last parameter of zend_hash_sort(), when non-zero, discards the existing key-to-value associations in the HashTable and assigns new numeric keys according to the sorted order. PHP's sort() function is implemented as follows:

    zend_hash_sort(target_hash, zend_qsort, array_data_compare, 1 TSRMLS_CC);

array_data_compare is a function of type compare_func_t that orders the elements by the value of the zval* stored in the HashTable.
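The effect of the renumber flag can be sketched with the standard qsort() over key/value pairs. toy_pair and toy_sort are hypothetical names; the sketch models only what happens to the keys, not how the engine reorders its buckets.

```c
#include <stdlib.h>

typedef struct {
    long key;
    int value;
} toy_pair;

/* Order pairs by value, ignoring the key. */
static int compare_pair_values(const void *a, const void *b)
{
    const toy_pair *x = a;
    const toy_pair *y = b;
    return (x->value > y->value) - (x->value < y->value);
}

/* Sort by value. With renumber set, the old keys are discarded and
 * replaced by 0, 1, 2, ... in the new order. */
void toy_sort(toy_pair *pairs, size_t count, int renumber)
{
    size_t i;

    qsort(pairs, count, sizeof(toy_pair), compare_pair_values);
    if (renumber) {
        for (i = 0; i < count; i++) {
            pairs[i].key = (long)i;
        }
    }
}
```

Without renumbering, each value keeps its original key after the sort; with renumbering, the result looks like a freshly built list indexed from 0.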
