hash function comparison

Source: Internet
Author: User

http://blog.csdn.net/kingstar158/article/details/8028635

For work, I needed to handle data on the scale of tens of millions of records. At that size std::map had a real efficiency problem, so I replaced it with boost::unordered_map. That improved performance considerably, but it still did not meet our needs.

std::map underlying implementation: balanced binary search tree (typically a red-black tree)

boost::unordered_map underlying implementation: hash table

So it can be worth writing a hash function tailored to each data type to speed up lookup and insertion. I do not have the skill to design one myself, so I searched Google for the well-known algorithms that others have already devised:

Commonly used string hash functions include BKDRHash, APHash, DJBHash, JSHash, RSHash, SDBMHash, PJWHash, ELFHash, and so on.

Someone has already evaluated them, with the following results:

Hash function  Data 1  Data 2  Data 3  Data 4  Data 1 score  Data 2 score  Data 3 score  Data 4 score  Average score
BKDRHash       2       0       4774    481     96.55         100           90.95         82.05         92.64
APHash         2       3       4754    493     96.55         88.46         100           51.28         86.28
DJBHash        2       2       4975    474     96.55         92.31         0             100           83.43
JSHash         1       4       4761    506     100           84.62         96.83         17.95         81.94
RSHash         1       0       4861    505     100           100           51.58         20.51         75.96
SDBMHash       3       2       4849    504     93.1          92.31         57.01         23.08         72.41
PJWHash        30      26      4878    513     0             0             43.89         0             21.95
ELFHash        30      26      4878    513     0             0             43.89         0             21.95


Data 1 is the number of hash collisions among 100,000 random strings made up of letters and digits. Data 2 is the number of hash collisions among 100,000 meaningful English sentences. Data 3 is the number of collisions when the hash values from Data 1 are taken modulo 1000003 (a large prime) and stored in a linear table. Data 4 is the number of collisions when the hash values from Data 1 are taken modulo 10000019 (a larger prime) and stored in a linear table.
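
As a rough illustration of how a figure like Data 3 could be measured, the sketch below counts how many keys land in an already occupied slot after the hash value is reduced modulo a prime. The helper name count_slot_collisions, the bkdr_hash wrapper, and the definition of a collision as "a key mapped to a slot that is already taken" are assumptions made for this example, not details taken from the original evaluation.

```c
#include <stddef.h>
#include <stdlib.h>

/* BKDR hash, used here only as the function under test.
   Bytes are read as unsigned char (an assumption of this sketch). */
static unsigned int bkdr_hash(const char *str)
{
    unsigned int hash = 0;
    while (*str)
        hash = hash * 131u + (unsigned char)*str++;
    return hash & 0x7FFFFFFFu;
}

/* Hypothetical helper: reduce each key's hash modulo `modulus` and count
   the keys that map to a slot some earlier key already occupies.
   Returns -1 if modulus is 0 or the allocation fails. */
static long count_slot_collisions(const char *const *keys, size_t n,
                                  unsigned int (*hash)(const char *),
                                  unsigned int modulus)
{
    unsigned char *occupied;
    long collisions = 0;
    size_t i;

    if (modulus == 0)
        return -1;
    occupied = calloc(modulus, 1);   /* one flag byte per slot, all zero */
    if (occupied == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        unsigned int slot = hash(keys[i]) % modulus;
        if (occupied[slot])
            collisions++;            /* slot already taken by an earlier key */
        else
            occupied[slot] = 1;
    }
    free(occupied);
    return collisions;
}
```

Calling count_slot_collisions(keys, n, bkdr_hash, 1000003u) on a key set would give a count comparable in spirit to the Data 3 column, and passing 10000019u would correspond to Data 4.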

C implementations of the various hash functions:

    // SDBM Hash Function
    unsigned int sdbmhash(char *str)
    {
        unsigned int hash = 0;
        while (*str)
        {
            // Equivalent to: hash = 65599 * hash + (*str++);
            hash = (*str++) + (hash << 6) + (hash << 16) - hash;
        }
        return (hash & 0x7FFFFFFF);
    }
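
The equivalence noted in the SDBM comment follows from 64 + 65536 - 1 = 65599: shifting left by 6 and by 16 multiplies by 64 and 65536, and subtracting the hash once removes the extra 1. The sketch below writes both forms side by side so they can be compared. The names sdbm_hash_shift and sdbm_hash_mul are made up for this example, and bytes are read as unsigned char, which is an assumption of the sketch.

```c
/* Shift-and-subtract form, as in the SDBM listing. */
static unsigned int sdbm_hash_shift(const char *str)
{
    unsigned int hash = 0;
    while (*str)
        hash = (unsigned char)*str++ + (hash << 6) + (hash << 16) - hash;
    return hash & 0x7FFFFFFFu;
}

/* Multiplication form: (1 << 6) + (1 << 16) - 1 == 65599. */
static unsigned int sdbm_hash_mul(const char *str)
{
    unsigned int hash = 0;
    while (*str)
        hash = 65599u * hash + (unsigned char)*str++;
    return hash & 0x7FFFFFFFu;
}
```

Because unsigned arithmetic in C wraps around, the two functions should return the same value for every input string, including long ones that overflow.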

    // RS Hash Function
    unsigned int rshash(char *str)
    {
        unsigned int b = 378551;
        unsigned int a = 63689;
        unsigned int hash = 0;
        while (*str)
        {
            hash = hash * a + (*str++);
            a *= b;   // the multiplier changes on every character
        }
        return (hash & 0x7FFFFFFF);
    }

    // JS Hash Function
    unsigned int jshash(char *str)
    {
        unsigned int hash = 1315423911;
        while (*str)
        {
            hash ^= ((hash << 5) + (*str++) + (hash >> 2));
        }
        return (hash & 0x7FFFFFFF);
    }

    // P. J. Weinberger Hash Function
    unsigned int pjwhash(char *str)
    {
        unsigned int bitsinunignedint = (unsigned int)(sizeof(unsigned int) * 8);
        unsigned int threequarters = (unsigned int)((bitsinunignedint * 3) / 4);
        unsigned int oneeighth = (unsigned int)(bitsinunignedint / 8);
        unsigned int highbits = (unsigned int)(0xFFFFFFFF) << (bitsinunignedint - oneeighth);
        unsigned int hash = 0;
        unsigned int test = 0;
        while (*str)
        {
            hash = (hash << oneeighth) + (*str++);
            if ((test = hash & highbits) != 0)
            {
                // Fold the top bits back into the low part, then clear them
                hash = ((hash ^ (test >> threequarters)) & (~highbits));
            }
        }
        return (hash & 0x7FFFFFFF);
    }

    // ELF Hash Function
    unsigned int elfhash(char *str)
    {
        unsigned int hash = 0;
        unsigned int x = 0;
        while (*str)
        {
            hash = (hash << 4) + (*str++);
            if ((x = hash & 0xF0000000L) != 0)
            {
                hash ^= (x >> 24);
                hash &= ~x;
            }
        }
        return (hash & 0x7FFFFFFF);
    }
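
PJWHash and ELFHash have identical rows in the results table. That is what one would expect when unsigned int is 32 bits wide: the PJW parameters then work out to a shift of 4, a fold shift of 24, and a high-bit mask of 0xF0000000, which are the constants hard-coded in the ELF version. The sketch below fixes the width with uint32_t so the two can be compared directly. The names pjw_hash32 and elf_hash32 are made up for this example, and bytes are read as unsigned char, which is an assumption of the sketch.

```c
#include <stdint.h>

/* PJW hash with the word size fixed at 32 bits. */
static uint32_t pjw_hash32(const char *str)
{
    const uint32_t bits = 32;
    const uint32_t three_quarters = bits * 3 / 4;              /* 24 */
    const uint32_t one_eighth = bits / 8;                      /* 4  */
    const uint32_t high_bits =
        UINT32_C(0xFFFFFFFF) << (bits - one_eighth);           /* 0xF0000000 */
    uint32_t hash = 0;
    uint32_t test;

    while (*str) {
        hash = (hash << one_eighth) + (unsigned char)*str++;
        if ((test = hash & high_bits) != 0)
            hash = (hash ^ (test >> three_quarters)) & ~high_bits;
    }
    return hash & UINT32_C(0x7FFFFFFF);
}

/* ELF hash: the same steps with the 32-bit constants written out. */
static uint32_t elf_hash32(const char *str)
{
    uint32_t hash = 0;
    uint32_t x;

    while (*str) {
        hash = (hash << 4) + (unsigned char)*str++;
        if ((x = hash & UINT32_C(0xF0000000)) != 0) {
            hash ^= x >> 24;
            hash &= ~x;   /* x holds exactly the set high bits, so this clears the top nibble */
        }
    }
    return hash & UINT32_C(0x7FFFFFFF);
}
```

On a platform where unsigned int has a different width, the PJW listing would derive different constants and the two functions would no longer be interchangeable.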

    // BKDR Hash Function
    unsigned int bkdrhash(char *str)
    {
        unsigned int seed = 131;  // 131 1313 13131 131313 etc.
        unsigned int hash = 0;
        while (*str)
        {
            hash = hash * seed + (*str++);
        }
        return (hash & 0x7FFFFFFF);
    }
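
The seed comment in the BKDR listing suggests that other multipliers of the same pattern can be substituted. One way to experiment with that is to pass the seed as a parameter, as in the sketch below. The name bkdr_hash_seeded is hypothetical, and bytes are read as unsigned char, which is an assumption of the sketch.

```c
/* BKDR-style polynomial hash with a caller-supplied multiplier. */
static unsigned int bkdr_hash_seeded(const char *str, unsigned int seed)
{
    unsigned int hash = 0;
    while (*str)
        hash = hash * seed + (unsigned char)*str++;
    return hash & 0x7FFFFFFFu;
}
```

For example, bkdr_hash_seeded("ab", 131) computes 97 * 131 + 98 = 12805, while a seed of 31 gives 97 * 31 + 98 = 3105.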

    // DJB Hash Function
    unsigned int djbhash(char *str)
    {
        unsigned int hash = 5381;
        while (*str)
        {
            hash += (hash << 5) + (*str++);
        }
        return (hash & 0x7FFFFFFF);
    }
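
The DJB update adds the hash shifted left by 5 (that is, 32 times the hash) to the hash itself, so each step amounts to multiplying by 33 and adding the next character. The sketch below writes both forms for comparison. The names djb_hash_shift and djb_hash_mul are made up for this example, and bytes are read as unsigned char, which is an assumption of the sketch.

```c
/* Shift-and-add form, as in the DJB listing. */
static unsigned int djb_hash_shift(const char *str)
{
    unsigned int hash = 5381;
    while (*str)
        hash += (hash << 5) + (unsigned char)*str++;
    return hash & 0x7FFFFFFFu;
}

/* Multiplication form: hash + (hash << 5) == hash * 33. */
static unsigned int djb_hash_mul(const char *str)
{
    unsigned int hash = 5381;
    while (*str)
        hash = hash * 33u + (unsigned char)*str++;
    return hash & 0x7FFFFFFFu;
}
```

For the one-character string "a" both forms give 5381 * 33 + 97 = 177670.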

    // AP Hash Function
    unsigned int aphash(char *str)
    {
        unsigned int hash = 0;
        int i;
        for (i = 0; *str; i++)
        {
            if ((i & 1) == 0)
            {
                // Even-indexed characters
                hash ^= ((hash << 7) ^ (*str++) ^ (hash >> 3));
            }
            else
            {
                // Odd-indexed characters
                hash ^= (~((hash << 11) ^ (*str++) ^ (hash >> 5)));
            }
        }
        return (hash & 0x7FFFFFFF);
    }

https://www.byvoid.com/blog/string-hash-compare/
