PHP-Perl Hash Algorithm Implementation (times33 hash algorithm)

Source: Internet
Author: User

The code is as follows:
APR_DECLARE_NONSTD(unsigned int) apr_hashfunc_default(const char *char_key,
                                                      apr_ssize_t *klen)
{
    unsigned int hash = 0;
    const unsigned char *key = (const unsigned char *)char_key;
    const unsigned char *p;
    apr_ssize_t i;

    /*
     * This is the popular 'times 33' hash algorithm which is used by
     * perl and also appears in Berkeley DB. This is one of the best
     * known hash functions for strings because it is both computed
     * very fast and distributes very well.
     *
     * The originator may be Dan Bernstein but the code in Berkeley DB
     * cites Chris Torek as the source. The best citation I have found
     * is "Chris Torek, Hash function for text in C, Usenet message
     * <27038@mimsy.umd.edu> in comp.lang.c, October, 1990." in Rich
     * Salz's USENIX 1992 paper about INN which can be found
     *.
     *
     * The magic of number 33, i.e. why it works better than many other
     * constants, prime or not, has never been adequately explained by
     * anyone. So I try an explanation: if one experimentally tests all
     * multipliers between 1 and 256 (as I did while writing a low-level
     * data structure library some time ago) one detects that even
     * numbers are not useable at all. The remaining 128 odd numbers
     * (except for the number 1) work more or less all equally well.
     * They all distribute in an acceptable way and this way fill a hash
     * table with an average percent of approx. 86%.
     *
     * If one compares the chi^2 values of the variants (see
     * Bob Jenkins' "Hashing Frequently Asked Questions" at
     * http://burtleburtle.net/bob/hash/hashfaq.html for a description
     * of chi^2), the number 33 not even has the best value. But the
     * number 33 and a few other equally good numbers like 17, 31, 63,
     * 127 and 129 have nevertheless a great advantage to the remaining
     * numbers in the large set of possible multipliers: their multiply
     * operation can be replaced by a faster operation based on just one
     * shift plus either a single addition or subtraction operation. And
     * because a hash function has to both distribute good _and_ has to
     * be very fast to compute, those few numbers should be preferred.
     *
     *                  -- Ralf S. Engelschall
     */

    if (*klen == APR_HASH_KEY_STRING) {
        /* NUL-terminated key: hash up to the terminator, then report the length. */
        for (p = key; *p; p++) {
            hash = hash * 33 + *p;
        }
        *klen = p - key;
    }
    else {
        /* Binary key: hash exactly *klen bytes. */
        for (p = key, i = *klen; i; i--, p++) {
            hash = hash * 33 + *p;
        }
    }
    return hash;
}
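The comment's point that multiplying by 33 can be replaced by one shift plus one addition can be checked with a small standalone sketch. It does not depend on APR; the names times33_mul and times33_shift are hypothetical and only cover the NUL-terminated string case.

```c
#include <assert.h>
#include <stddef.h>

/* Plain multiply form: hash = hash * 33 + byte.
 * Hypothetical helper, not part of APR. */
unsigned int times33_mul(const char *key)
{
    const unsigned char *p;
    unsigned int hash = 0;

    assert(key != NULL);
    for (p = (const unsigned char *)key; *p; p++) {
        hash = hash * 33 + *p;
    }
    return hash;
}

/* Shift-and-add form: (hash << 5) + hash is hash * 32 + hash,
 * which equals hash * 33 in unsigned (wrap-around) arithmetic. */
unsigned int times33_shift(const char *key)
{
    const unsigned char *p;
    unsigned int hash = 0;

    assert(key != NULL);
    for (p = (const unsigned char *)key; *p; p++) {
        hash = (hash << 5) + hash + *p;
    }
    return hash;
}
```

Both functions should return the same value for any string, since unsigned overflow wraps identically in the two forms. In practice a modern compiler will likely perform this rewrite on its own, so the multiply form is usually the more readable choice.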

Translation of the function comment: This is the well-known times 33 hash algorithm, which is used by Perl and also appears in Berkeley DB. It is one of the best known hash functions for strings because it is both very fast to compute and distributes very well.

The originator may be Dan Bernstein, but the code in Berkeley DB cites Chris Torek as the source. The best citation I have found is "Chris Torek, Hash function for text in C, Usenet message <27038@mimsy.umd.edu> in comp.lang.c, October, 1990." in Rich Salz's USENIX 1992 paper about INN.

Why does 33 work better than many other constants, prime or not? Nobody has ever adequately explained it, so here I will try. If one experimentally tests every multiplier between 1 and 256 (as I did while writing a low-level data structure library some time ago), one finds that even numbers are not usable at all. The remaining 128 odd numbers (except 1) all work more or less equally well: they all distribute in an acceptable way and fill a hash table to about 86% on average.

If one compares the chi^2 values of these variants (translator's note: a statistical term indicating how far a random variable deviates on average from its expected value; see Bob Jenkins's "Hashing Frequently Asked Questions" at http://burtleburtle.net/bob/hash/hashfaq.html for a description of chi^2), the number 33 does not even have the best value. But 33 and a few other equally good numbers like 17, 31, 63, 127 and 129 have a great advantage over the remaining multipliers: their multiplication can be replaced by a faster operation based on one shift plus a single addition or subtraction. Because a hash function has to both distribute well and be very fast to compute, those few numbers should be preferred.
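The claim about even multipliers can be explored with a toy experiment. The sketch below is an assumption-laden illustration, not the test Engelschall ran: the key set ("key0" to "key999"), the table size of 32 buckets, and all function names are hypothetical. It counts how many buckets get used for a given multiplier and computes a chi^2 statistic, the sum over buckets of (observed - expected)^2 / expected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NBUCKETS 32    /* hypothetical table size (a power of two) */
#define NKEYS    1000  /* hypothetical number of test keys */

/* Same loop as the string branch of apr_hashfunc_default,
 * but with the multiplier passed in. */
static unsigned int times_n(const char *key, unsigned int multiplier)
{
    const unsigned char *p = (const unsigned char *)key;
    unsigned int hash = 0;

    for (; *p; p++) {
        hash = hash * multiplier + *p;
    }
    return hash;
}

/* Hash "key0" .. "key999" and count how many keys land in each bucket. */
static void fill_counts(unsigned int multiplier, unsigned int counts[NBUCKETS])
{
    char key[32];
    int i;

    memset(counts, 0, NBUCKETS * sizeof counts[0]);
    for (i = 0; i < NKEYS; i++) {
        snprintf(key, sizeof key, "key%d", i);
        counts[times_n(key, multiplier) % NBUCKETS]++;
    }
}

/* Number of buckets that received at least one key. */
int occupied_buckets(unsigned int multiplier)
{
    unsigned int counts[NBUCKETS];
    int b, used = 0;

    assert(multiplier > 0);
    fill_counts(multiplier, counts);
    for (b = 0; b < NBUCKETS; b++) {
        if (counts[b] != 0) {
            used++;
        }
    }
    return used;
}

/* chi^2 = sum over buckets of (observed - expected)^2 / expected.
 * Lower values mean the keys are spread more evenly. */
double chi_squared(unsigned int multiplier)
{
    unsigned int counts[NBUCKETS];
    double expected = (double)NKEYS / NBUCKETS;
    double chi2 = 0.0;
    int b;

    assert(multiplier > 0);
    fill_counts(multiplier, counts);
    for (b = 0; b < NBUCKETS; b++) {
        double diff = (double)counts[b] - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

With multiplier 32 and 32 buckets, every earlier byte is shifted out of the low five bits, so the bucket index depends only on the last character of the key. These keys all end in one of ten digits, so occupied_buckets(32) is 10, while an odd multiplier such as 33 reaches more buckets. This is one plausible reading of why even multipliers fare badly on power-of-two tables; a different table size or key set would give different numbers.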
