Introduction to the hash algorithm of PHP

Source: Internet
Author: User
Tags: arrays, hash

It is no exaggeration to say that the hash table is the core of PHP.

PHP arrays, associative arrays, object properties, function tables, symbol tables, and so on all use HashTable as their container.

PHP's HashTable resolves collisions by separate chaining (the "zipper" method). That needs no further explanation, so my main concern today is PHP's hash algorithm and some of the ideas the algorithm itself reveals.

PHP's hash uses the very common DJBX33A (Daniel J. Bernstein, Times 33 with Addition). This algorithm is widely used in many software projects, including Apache, Perl, and Berkeley DB. It is one of the best known hash algorithms for strings, because it is very fast to compute and distributes keys very well (few collisions, even distribution).

The core idea of the algorithm is:

The code is as follows:

    hash(i) = hash(i-1) * 33 + str[i]

In zend_hash.h, we can find this algorithm in PHP:

The code is as follows:

    static inline ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
    {
        register ulong hash = 5381;

        /* variant with the hash unrolled eight times */
        for (; nKeyLength >= 8; nKeyLength -= 8) {
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
        }
        switch (nKeyLength) {
            case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 1: hash = ((hash << 5) + hash) + *arKey++; break;
            case 0: break;
            EMPTY_SWITCH_DEFAULT_CASE()
        }
        return hash;
    }

Compare the classic Times 33 algorithm used directly in Apache and Perl:

The code is as follows:

    # Hashing function used in Perl 5.005:
    # Return the hashed value of a string: $hash = perlhash("key")
    # (Defined by the PERL_HASH macro in hv.h)
    sub perlhash
    {
        $hash = 0;
        foreach (split //, shift) {
            $hash = $hash*33 + ord($_);
        }
        return $hash;
    }

In the PHP hash algorithm, we can see some very careful differences.

First, the biggest difference is that PHP does not multiply by 33 directly, but uses:

The code is as follows:

    (hash << 5) + hash

Shifting left by 5 multiplies by 32, so adding hash once more gives hash * 33. This is intended to be faster than a multiplication, although an optimizing compiler may well generate the same instructions for both forms.

Second, and what I find particularly thoughtful, is the use of loop unrolling. A few days ago I read about the caching mechanism in Discuz: it applies different caching strategies depending on how popular a post is and, following user habits, caches only the first page of a post (because very few people page through a thread).

In a similar spirit, PHP appears to favor string keys of about 8 characters: it unrolls the loop in units of 8 to improve efficiency, which has to be called a very careful, very detailed touch.

There are also inline, the register variable, and so on. You can see that the PHP developers took great pains over optimizing the hash.

Finally, the initial hash is set to 5381, whereas the Times 33 algorithm in Apache and the hash algorithm in Perl both use an initial hash of 0. Why choose 5381? I don't know the exact reason, but I found some properties of 5381:

The code is as follows:

    Magic Constant 5381:
    1. Odd number
    2. Prime number
    3. Deficient number

(Every prime is deficient, so the third property follows from the second.) Seeing these, I have reason to believe that this initial value was chosen to give a better distribution.
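To make the points above concrete, here is a minimal sketch in standalone C. It is not PHP's source. The names djbx33a_simple, djbx33a_unrolled, DJB_STEP, proper_divisor_sum and djbx33a_self_check are hypothetical, bytes are read as unsigned char (an assumption; the PHP function adds plain char), and the leftover bytes are handled with a plain loop instead of the fallthrough switch. It shows that the shift-and-add form agrees with multiplying by 33, that an 8-way unrolled loop gives the same result as the simple loop, and that 5381 has 1 as its only proper divisor, which makes it prime and therefore deficient.

```c
#include <assert.h>
#include <stddef.h>

/* Plain DJBX33A: hash(i) = hash(i-1) * 33 + str[i], seeded with 5381. */
static unsigned long djbx33a_simple(const char *key, size_t len)
{
    unsigned long hash = 5381;
    for (size_t i = 0; i < len; i++) {
        hash = hash * 33 + (unsigned char)key[i];
    }
    return hash;
}

/* One hashing step written as shift-and-add: (h << 5) + h equals h * 33. */
#define DJB_STEP(h, p) ((h) = (((h) << 5) + (h)) + (unsigned char)*(p)++)

/* Same hash, consuming 8 bytes per loop iteration (hypothetical variant). */
static unsigned long djbx33a_unrolled(const char *key, size_t len)
{
    unsigned long hash = 5381;
    for (; len >= 8; len -= 8) {
        DJB_STEP(hash, key); DJB_STEP(hash, key);
        DJB_STEP(hash, key); DJB_STEP(hash, key);
        DJB_STEP(hash, key); DJB_STEP(hash, key);
        DJB_STEP(hash, key); DJB_STEP(hash, key);
    }
    /* Fewer than 8 bytes remain; a simple loop stands in for the switch. */
    while (len-- > 0) {
        DJB_STEP(hash, key);
    }
    return hash;
}

/* Sum of the proper divisors of n (all divisors smaller than n). */
static unsigned long proper_divisor_sum(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long d = 1; d <= n / 2; d++) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum;
}

/* Small self-check; call it from main() to exercise the functions above. */
static void djbx33a_self_check(void)
{
    /* An empty key leaves the seed untouched. */
    assert(djbx33a_simple("", 0) == 5381UL);
    /* One byte: 5381 * 33 + 97 ('a'). */
    assert(djbx33a_simple("a", 1) == 177670UL);
    /* 14 bytes: one unrolled pass of 8 plus a tail of 6. */
    assert(djbx33a_unrolled("hash table key", 14) ==
           djbx33a_simple("hash table key", 14));
    /* 5381 is prime: its only proper divisor is 1, so it is also deficient. */
    assert(proper_divisor_sum(5381UL) == 1UL);
}
```

Calling djbx33a_self_check() from a main() function runs the checks. Whether the unrolled form is actually faster depends on the compiler and the key lengths, so it is worth measuring before relying on it.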