Introduction to the hash algorithm of PHP

Source: Internet
Author: User


The hash table is at the core of PHP.

PHP arrays, associative arrays, object properties, function tables, symbol tables, and so on all use HashTable as their container.

PHP's HashTable resolves collisions by separate chaining (the "zipper method"), which I won't go into here. Today I am mainly concerned with PHP's hash algorithm and some of the ideas it reveals.

PHP currently uses the most common variant, DJBX33A (Daniel J. Bernstein, Times 33 with Addition). This algorithm is widely used in many software projects, such as Apache, Perl, and Berkeley DB. It is often regarded as one of the best known hash algorithms for strings, because it is fast and distributes keys well (few collisions and an even spread).

 

The core idea of the algorithm is:

The code is as follows:

hash(i) = hash(i-1) * 33 + str[i]
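The recurrence above can be sketched as a plain loop. This is a minimal illustration, not PHP's code: the function name `times33`, the `init` parameter, and the cast to `unsigned char` are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PHP: plain Times 33 over len bytes.
 * init is the starting value (DJBX33A uses 5381). */
unsigned long times33(const char *str, size_t len, unsigned long init)
{
    unsigned long hash = init;
    for (size_t i = 0; i < len; i++) {
        /* hash(i) = hash(i-1) * 33 + str[i] */
        hash = hash * 33 + (unsigned char)str[i];
    }
    return hash;
}
```

For example, `times33("a", 1, 5381)` evaluates to 5381 * 33 + 97 = 177670.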

 

 

In zend_hash.h, we can find this algorithm in PHP:

 

The Code is as follows:

static inline ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
{
    register ulong hash = 5381;

    /* variant with the hash unrolled eight times */
    for (; nKeyLength >= 8; nKeyLength -= 8) {
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
    }
    switch (nKeyLength) {
        case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 1: hash = ((hash << 5) + hash) + *arKey++; break;
        case 0: break;
        EMPTY_SWITCH_DEFAULT_CASE()
    }
    return hash;
}
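To see that the unrolling changes only the shape of the loop and not the result, here is a small sketch that compares a one-character-per-iteration version with an unrolled one. The names `djb_simple`, `djb_unrolled4` and `djb_variants_agree` are hypothetical, and the loop is unrolled four times instead of eight only to keep the example short.

```c
#include <stddef.h>

/* Reference version: one character per iteration. */
unsigned long djb_simple(const char *key, size_t len)
{
    unsigned long hash = 5381;
    while (len-- > 0) {
        hash = ((hash << 5) + hash) + *key++;
    }
    return hash;
}

/* Same computation, unrolled four times (PHP unrolls eight times). */
unsigned long djb_unrolled4(const char *key, size_t len)
{
    unsigned long hash = 5381;
    for (; len >= 4; len -= 4) {
        hash = ((hash << 5) + hash) + *key++;
        hash = ((hash << 5) + hash) + *key++;
        hash = ((hash << 5) + hash) + *key++;
        hash = ((hash << 5) + hash) + *key++;
    }
    /* Handle the 0 to 3 characters left over after the unrolled loop. */
    switch (len) {
        case 3: hash = ((hash << 5) + hash) + *key++; /* fallthrough */
        case 2: hash = ((hash << 5) + hash) + *key++; /* fallthrough */
        case 1: hash = ((hash << 5) + hash) + *key++; break;
        case 0: break;
    }
    return hash;
}

/* Returns 1 if both versions agree on every prefix of key, 0 otherwise. */
int djb_variants_agree(const char *key, size_t len)
{
    for (size_t n = 0; n <= len; n++) {
        if (djb_simple(key, n) != djb_unrolled4(key, n)) {
            return 0;
        }
    }
    return 1;
}
```

The switch with deliberate fallthrough plays the same role as in the PHP function: it consumes the remainder that does not fill a whole unrolled group.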

 

 

Compare this with the classic Times 33 algorithm as used directly in Apache and Perl:

 

The Code is as follows:

Hashing function used in Perl 5.005:

# Return the hashed value of a string: $hash = perlhash("key")
# (Defined by the PERL_HASH macro in hv.h)
sub perlhash
{
    $hash = 0;
    foreach (split //, shift) {
        $hash = $hash * 33 + ord($_);
    }
    return $hash;
}

 

 

Looking at PHP's version of the algorithm, the differences lie in the fine details.

First of all, the most obvious difference is that PHP does not multiply by 33 directly, but uses:

 

 

The Code is as follows:

(hash << 5) + hash

 

 

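A shift and an addition have traditionally been cheaper than a multiplication, although modern optimizing compilers typically generate equivalent code for both forms.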
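Shifting left by 5 multiplies by 32, so adding the original value gives a multiplication by 33. The parentheses matter, because in C `+` binds more tightly than `<<`. The sketch below checks the equivalence on a range of values, including ones that overflow. The function names are hypothetical.

```c
/* (h << 5) + h is h * 32 + h, that is h * 33. Unsigned arithmetic wraps
 * around the same way in both forms. */
unsigned long times33_shift(unsigned long h) { return (h << 5) + h; }
unsigned long times33_mul(unsigned long h)   { return h * 33; }

/* Returns 1 if the two forms agree on 1000 sample values, 0 otherwise. */
int shift_matches_multiply(void)
{
    unsigned long h = 5381;
    for (int i = 0; i < 1000; i++) {
        if (times33_shift(h) != times33_mul(h)) {
            return 0;
        }
        /* Arbitrary step to vary the input; overflow is well defined here. */
        h = h * 2654435761UL + 12345UL;
    }
    return 1;
}
```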

 

The next notable idea is loop unrolling. A few days ago I read an article about the Discuz cache mechanism. One point it made was that Discuz adopts different caching policies depending on the popularity of a post and on user habits: only the first page of a post is cached, because few people page any further.

In a similar spirit, PHP optimizes for the keys it expects to see most often: the loop is unrolled in units of 8 characters to improve efficiency, which is again a very careful touch.

There are also the inline function and the register variable. Clearly the PHP developers took great pains over hash optimization.

 

Finally, the initial hash value is set to 5381. Why 5381, when the Times 33 algorithm in Apache and the hash algorithm in Perl both use an initial value of 0? I don't know the specific reason, but I found some properties of 5381:

 

The Code is as follows:

Magic Constant 5381:

1. odd number

2. prime number

3. deficient number
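The three properties listed above can be checked mechanically. The helpers below are hypothetical names written for this illustration, using simple trial division, which is fast enough for a number this small.

```c
/* Hypothetical helpers for checking the listed properties of 5381. */
int is_odd(unsigned long n) { return n % 2 == 1; }

/* Trial division up to the square root of n. */
int is_prime(unsigned long n)
{
    if (n < 2) {
        return 0;
    }
    for (unsigned long d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            return 0;
        }
    }
    return 1;
}

/* Sum of the proper divisors of n (every divisor except n itself). */
unsigned long proper_divisor_sum(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long d = 1; d < n; d++) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum;
}

/* A number is deficient when its proper divisors sum to less than itself. */
int is_deficient(unsigned long n) { return proper_divisor_sum(n) < n; }
```

Note that every prime is deficient, since its only proper divisor is 1, and every prime other than 2 is odd, so properties 1 and 3 both follow from property 2.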

 

Given these properties, I have reason to believe that this choice of initial value helps produce a better distribution.
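One concrete effect of a non-zero initial value can be shown directly. It is offered here as an observation, not as the documented reason for choosing 5381. With an initial value of 0, a key made only of NUL bytes hashes to 0 whatever its length, while a non-zero start keeps keys of different lengths apart. The names `hash33` and `nul_keys_collide` are hypothetical.

```c
#include <stddef.h>

/* Times 33 with a caller-supplied initial value (assumption for this sketch). */
unsigned long hash33(const char *key, size_t len, unsigned long init)
{
    unsigned long hash = init;
    while (len-- > 0) {
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
    }
    return hash;
}

/* Returns 1 if NUL-only keys of length 1 and 2 hash to the same value. */
int nul_keys_collide(unsigned long init)
{
    const char zeros[2] = {0, 0};
    return hash33(zeros, 1, init) == hash33(zeros, 2, init);
}
```

Here `nul_keys_collide(0)` returns 1, because 0 * 33 + 0 stays 0 at every step, while `nul_keys_collide(5381)` returns 0.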

 
