Introduction to the PHP Hash Algorithm

Source: Internet
Author: User

It is no exaggeration to say that the hash table is the core of PHP.

PHP arrays, associative arrays, object properties, function tables, symbol tables, and so on all use HashTable as their container.

PHP's HashTable resolves collisions by separate chaining (the "zipper" method), which needs no further discussion here. My main concern today is PHP's hash algorithm, and some of the ideas the algorithm itself reveals.

PHP's hash uses the very common DJBX33A (Daniel J. Bernstein, Times 33 with Addition). This algorithm is widely used in many software projects, such as Apache, Perl, and Berkeley DB. It is regarded as one of the best known hash algorithms for strings, because it is very fast and distributes keys well (few collisions, even distribution).

The core idea of the algorithm is:

hash(i) = hash(i-1) * 33 + str[i]
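To make the recurrence concrete, here is a minimal sketch of it in C. The function name times33 and the init parameter (the starting value, hash(-1)) are my own assumptions for illustration, not names from any library. Characters are added as unsigned bytes, which is also a choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Times-33 recurrence: hash(i) = hash(i-1) * 33 + str[i].
 * `init` is the starting value hash(-1); it is a parameter here only
 * so that different starting values can be tried (hypothetical API). */
unsigned long times33(const char *str, size_t len, unsigned long init)
{
    assert(str != NULL);
    unsigned long hash = init;
    for (size_t i = 0; i < len; i++) {
        /* Unsigned arithmetic wraps around on overflow, which is well
         * defined in C and is what a hash function wants. */
        hash = hash * 33 + (unsigned char)str[i];
    }
    return hash;
}
```

For example, times33("ab", 2, 0) computes (0 * 33 + 97) * 33 + 98 = 3299.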

In zend_hash.h, we can find PHP's implementation of this algorithm:

static inline ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
{
    register ulong hash = 5381;

    /* variant with the hash unrolled eight times */
    for (; nKeyLength >= 8; nKeyLength -= 8) {
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
    }
    /* hash the remaining 0 to 7 characters */
    switch (nKeyLength) {
        case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 1: hash = ((hash << 5) + hash) + *arKey++; break;
        case 0: break;
EMPTY_SWITCH_DEFAULT_CASE()
    }
    return hash;
}

Compare this with the classic Times 33 algorithm as it is used directly in Apache and Perl:

# hashing function used in Perl 5.005:
# return the hashed value of a string: $hash = perlhash('key')
# (defined by the PERL_HASH macro in hv.h)
sub perlhash
{
    $hash = 0;
    foreach (split //, shift) {
        $hash = $hash*33 + ord($_);
    }
    return $hash;
}

In PHP's hash algorithm, we can see some very deliberate differences.

First, the biggest difference is that PHP does not multiply by 33 directly, but uses:

(hash << 5) + hash

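This shift-and-add form is meant to be faster than a multiplication, although a modern optimizing compiler will usually emit equivalent code for either.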
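As a quick sanity check that the two forms really compute the same thing, here is a small sketch in C. The function names step_mul, step_shift and check_steps_agree are hypothetical helpers of mine, not part of PHP.

```c
#include <assert.h>
#include <stddef.h>

/* One hashing step written with a multiplication. */
unsigned long step_mul(unsigned long hash, unsigned char c)
{
    return hash * 33 + c;
}

/* The same step as shift-and-add: (hash << 5) is hash * 32,
 * so (hash << 5) + hash is hash * 33. The parentheses matter:
 * in C, `hash << 5 + hash` parses as `hash << (5 + hash)`. */
unsigned long step_shift(unsigned long hash, unsigned char c)
{
    return ((hash << 5) + hash) + c;
}

/* Asserts that both forms agree on a few sample values, including
 * the largest unsigned long, where the arithmetic wraps around. */
void check_steps_agree(void)
{
    unsigned long samples[] = {0UL, 1UL, 5381UL, 177670UL, ~0UL};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        assert(step_mul(samples[i], 'a') == step_shift(samples[i], 'a'));
    }
}
```

Because unsigned arithmetic in C is modular, the two forms agree even when the intermediate value overflows.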

Next, the notable idea is loop unrolling. A few days ago I read an article about the Discuz caching mechanism. One of its points was that Discuz applies different caching strategies depending on how popular a thread is and on user habits, and caches only the first page of a thread (because very few people page through a thread).

In a similar spirit, PHP favors short string keys of up to about 8 characters: it unrolls the loop in blocks of 8 to improve efficiency, which is a very detailed and meticulous touch.
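To show that unrolling changes only the loop structure and not the result, here is a standalone sketch in C that compares a one-character-per-iteration loop with an 8-way unrolled variant. The names hash_simple, hash_unrolled and the STEP macro are my own, and standard C types are used in place of PHP's ulong and uint so the code compiles outside the PHP source tree.

```c
#include <assert.h>
#include <stddef.h>

/* One DJBX33A step; expects `hash` and `key` in the enclosing scope. */
#define STEP() (hash = ((hash << 5) + hash) + *key++)

/* Reference version: one loop test per character. */
unsigned long hash_simple(const char *key, size_t len)
{
    assert(key != NULL);
    unsigned long hash = 5381;
    while (len-- > 0) {
        STEP();
    }
    return hash;
}

/* Unrolled version: one loop test per block of 8 characters. */
unsigned long hash_unrolled(const char *key, size_t len)
{
    assert(key != NULL);
    unsigned long hash = 5381;

    for (; len >= 8; len -= 8) {
        STEP(); STEP(); STEP(); STEP();
        STEP(); STEP(); STEP(); STEP();
    }
    /* Remaining 0..7 characters: jump in and fall through. */
    switch (len) {
        case 7: STEP(); /* fallthrough */
        case 6: STEP(); /* fallthrough */
        case 5: STEP(); /* fallthrough */
        case 4: STEP(); /* fallthrough */
        case 3: STEP(); /* fallthrough */
        case 2: STEP(); /* fallthrough */
        case 1: STEP(); break;
        default: break;
    }
    return hash;
}
```

For a key shorter than 8 characters the for loop body never runs, so the whole hash is computed by a single jump into the switch with no per-character loop test.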

In addition, there are the inline and register keywords... Clearly, the PHP developers took great pains to optimize the hash function.

Finally, the hash is initialized to 5381, whereas the Times 33 algorithm in Apache and the hash algorithm in Perl both start from 0. Why choose 5381? I don't know the exact reason, but I found some properties of 5381:

Magic Constant 5381:
1. Odd number
2. Prime number
3. Deficient number

Having read these, I have reason to believe that this choice of initial value can give a better distribution.
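The three properties listed for 5381 are easy to check mechanically. The sketch below, in C, uses brute-force helpers that I named is_prime, proper_divisor_sum and is_deficient for illustration. It verifies the arithmetic facts only and says nothing about how well the resulting hash distributes keys.

```c
#include <assert.h>
#include <stdbool.h>

/* Trial division; sufficient for small values such as 5381. */
bool is_prime(unsigned long n)
{
    if (n < 2) return false;
    for (unsigned long d = 2; d * d <= n; d++) {
        if (n % d == 0) return false;
    }
    return true;
}

/* Sum of the proper divisors of n (every divisor except n itself). */
unsigned long proper_divisor_sum(unsigned long n)
{
    assert(n > 0);
    unsigned long sum = 0;
    for (unsigned long d = 1; d <= n / 2; d++) {
        if (n % d == 0) sum += d;
    }
    return sum;
}

/* A number is deficient when its proper divisors sum to less than itself. */
bool is_deficient(unsigned long n)
{
    return proper_divisor_sum(n) < n;
}
```

Note that the third property follows from the second: the only proper divisor of a prime is 1, so every prime is deficient.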
