PHP string hash functions: implementation code

Source: Internet
Author: User
function Djbhash($str) // 0.22
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        // Equivalent to hash = hash * 33 + byte
        $hash += ($hash << 5) + ord($str[$i]);
    }
    return $hash % 701819;
}
function Elfhash($str) // 0.35
{
    $hash = $x = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        $hash = ($hash << 4) + ord($str[$i]);
        // If any of bits 28-31 are set, fold them back into the low bits and clear them
        if (($x = $hash & 0xf0000000) != 0)
        {
            $hash ^= ($x >> 24);
            $hash &= ~$x;
        }
    }
    return $hash % 701819;
}
function Jshash($str) // 0.23
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        $hash ^= (($hash << 5) + ord($str[$i]) + ($hash >> 2));
    }
    return $hash % 701819;
}
function Sdbmhash($str) // 0.23
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        // Equivalent to hash = hash * 65599 + byte
        $hash = ord($str[$i]) + ($hash << 6) + ($hash << 16) - $hash;
    }
    return $hash % 701819;
}
function Aphash($str) // 0.30
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        // Alternate between two mixing steps on even and odd positions
        if (($i & 1) == 0)
        {
            $hash ^= (($hash << 7) ^ ord($str[$i]) ^ ($hash >> 3));
        }
        else
        {
            $hash ^= (($hash << 11) ^ ord($str[$i]) ^ ($hash >> 5));
        }
    }
    return $hash % 701819;
}
function Dekhash($str) // 0.23
{
    $n = strlen($str);
    $hash = $n;
    for ($i = 0; $i < $n; $i++)
    {
        // Shift left by 5 and fold in the bits shifted right by 27, as in the standard DEK hash
        $hash = (($hash << 5) ^ ($hash >> 27)) ^ ord($str[$i]);
    }
    return $hash % 701819;
}
function Fnvhash($str) // 0.31
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        $hash *= 0x811C9DC5;
        $hash ^= ord($str[$i]);
    }
    return $hash % 701819;
}
function Pjwhash($str) // 0.33
{
    $hash = $test = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        $hash = ($hash << 4) + ord($str[$i]);
        // -268435456 is 0xF0000000 read as a signed 32-bit integer
        if (($test = $hash & -268435456) != 0)
        {
            // Shift by 24 (three quarters of 32 bits), as in the standard 32-bit PJW hash
            $hash = (($hash ^ ($test >> 24)) & (~-268435456));
        }
    }
    return $hash % 701819;
}
function Phphash($str) // 0.34
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        $hash = ($hash << 4) + ord($str[$i]);
        // Fold bits 28-31 into the low bits, then clear them
        if ($g = ($hash & 0xf0000000))
        {
            $hash = $hash ^ ($g >> 24);
            $hash = $hash ^ $g;
        }
    }
    return $hash % 701819;
}
function Opensslhash($str) // 0.22
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++)
    {
        // Shift each byte by its position modulo 16 before mixing it in
        $hash ^= (ord($str[$i]) << ($i & 0x0f));
    }
    return $hash % 701819;
}
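function Md5hash($str) // 0.050
{
    $hash = md5($str);
    // Pack the byte values of the first 8 hex characters of the digest into one integer.
    // ord() makes the shifts operate on integers, so letters a-f are handled as well.
    $hash = ord($hash[0]) | (ord($hash[1]) << 8) | (ord($hash[2]) << 16) | (ord($hash[3]) << 24)
          | (ord($hash[4]) << 32) | (ord($hash[5]) << 40) | (ord($hash[6]) << 48) | (ord($hash[7]) << 56);
    return $hash % 701819;
}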

Algorithm description: the comment after each function name above is the time, in seconds, that my local test took to run the function 1000 times. Md5hash is clearly the fastest, and by a wide margin. But the code also shows that it depends only on the first 8 characters of the MD5 digest (indices 0 to 7): if those characters are the same, the resulting hash value is exactly the same, so the actual distribution is not very trustworthy. If all 32 characters of the digest were used, it would be much slower than the other algorithms.
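The timing method described above can be approximated with a small harness like the one below. This is only a sketch: `benchmark()` and `djb_hash_demo()` are hypothetical names, not the author's original test code, and the absolute numbers will vary with the machine and PHP version.

```php
<?php
// Hypothetical helper: run $fn on $input $iterations times and return the elapsed seconds.
function benchmark(callable $fn, string $input, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// DJB-style hash as in the listing, repeated here so the sketch runs on its own.
function djb_hash_demo(string $str): int
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++) {
        $hash += ($hash << 5) + ord($str[$i]);
    }
    return $hash % 701819;
}

// Time 1000 calls on a 10-character string, matching the test setup the article describes.
$elapsed = benchmark('djb_hash_demo', 'abcdefghij');
printf("djb_hash_demo: %.6f s per 1000 calls\n", $elapsed);
```

Passing any of the other functions by name in place of `'djb_hash_demo'` would time it under the same conditions.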

Apart from Md5hash, all the algorithms are affected by the length of the string: the longer the string, the slower they run. I tested with 10 English characters. Each function ends with return $hash % 701819; where 701819 is the capacity of the hash space, meaning that the results of these functions fall in the range 0 to 701818. The number can be changed; it is generally believed that a large prime gives a more uniform distribution. Recommended values in the vicinity of 701819 are: 175447, 350899, 1403641, 2807303, 5614657.
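One detail worth checking when relying on that range: PHP's `%` operator keeps the sign of its left operand, and shifted hash values can become negative once they wrap around. The sketch below shows one way to force the result into the bucket range; `bucket_of()` is a hypothetical helper, not part of the original code, and it assumes the hash is passed in as an integer.

```php
<?php
// Hypothetical helper: map an integer hash to a bucket index in [0, $buckets - 1].
function bucket_of(int $hash, int $buckets = 701819): int
{
    // The inner % may be negative for a negative $hash; adding $buckets
    // and reducing again brings it back into the non-negative range.
    return (($hash % $buckets) + $buckets) % $buckets;
}

var_dump(-5 % 701819);        // int(-5): plain % keeps the sign
var_dump(bucket_of(-5));      // int(701814)
var_dump(bucket_of(701819));  // int(0): the modulus itself wraps to bucket 0
```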

What can this be used for?

Why collect and test these hashing algorithms? I am writing a multi-user blog, as mentioned in earlier posts. A multi-user blog generally has a feature where a user name made of English letters and digits serves as the address of the blog (as a subdomain or a directory). That raises a question: how do you get the user ID from the user name? With one more query? With a hash function that is not necessary. Run the user name through the hash function to get a number, then process the number (I split it into directory levels of two digits each, so that no single directory holds too many files and slows down disk lookups) to form a path. The corresponding ID is saved in a file under that path (I personally recommend using the user name as the file name). This way the user ID can be read directly from the user name without a query. Because the user name is the file name, two user names that hash to the same value still end up in different files, so collisions are not a concern.
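The path scheme described above might look roughly like this. It is a sketch under several assumptions that the article does not spell out: two directory levels of two digits each, a zero-padded six-digit hash, and the base directory `/data/users`. The names `djb_bucket()` and `user_id_path()` are hypothetical.

```php
<?php
// DJB-style hash from the listing, repeated so the sketch runs on its own.
function djb_bucket(string $str, int $buckets = 701819): int
{
    $hash = 0;
    $n = strlen($str);
    for ($i = 0; $i < $n; $i++) {
        $hash += ($hash << 5) + ord($str[$i]);
    }
    return $hash % $buckets;
}

// Hypothetical layout: <base>/<2 digits>/<2 digits>/<username>
function user_id_path(string $username, string $baseDir = '/data/users'): string
{
    // The article assumes user names of English letters and digits only;
    // rejecting anything else also keeps "/" and ".." out of the path.
    if (!preg_match('/^[A-Za-z0-9]+$/', $username)) {
        throw new InvalidArgumentException('User name must be alphanumeric');
    }
    // Zero-pad to six digits so there are always two full directory levels.
    $padded = str_pad((string) djb_bucket($username), 6, '0', STR_PAD_LEFT);
    return sprintf('%s/%s/%s/%s', $baseDir, substr($padded, 0, 2), substr($padded, 2, 2), $username);
}

echo user_id_path('ab'), "\n"; // /data/users/00/32/ab
```

Reading the ID would then be a single file read on that path. Since the file name is the user name itself, two names that share a hash value only share a directory, not a file.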

Of course, if your system works entirely from the user name, then consider this unsaid. = =b As a quiet aside, a SELECT on a number is also faster than one on a string.

I chose the DJB algorithm. After launch, if testing shows that the MD5 distribution is acceptable, I will consider switching.

This also shows that hashing is very useful for distributing data, hehe: it can be used for caching, static page generation, or anything else that needs distributed storage.
