Hash Algorithm Summary Collection


The purpose of a hash algorithm is to provide fast access to data. It uses an algorithm to establish a correspondence between a key and the stored value (each stored value has exactly one key, but one key may correspond to several stored values), so that the data can be located quickly in a structure such as an array.
I have read a lot of material about hashing on the Internet, so this post summarizes and collects the relevant information.
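HashTable.h

    template <typename T>
    class HashTable {
    public:
        explicit HashTable(int count);
        void put(T* t, int key);
        T* get(int key);
    private:
        T** tArray;   // one slot per key; no bounds checking is done
    };

HashTable.cpp

    // Template member definitions must be visible wherever the template
    // is instantiated, so in practice this file is included from the header.
    template <typename T>
    HashTable<T>::HashTable(int count) { tArray = new T*[count]; }

    template <typename T>
    void HashTable<T>::put(T* t, int key) { this->tArray[key] = t; }

    template <typename T>
    T* HashTable<T>::get(int key) { return this->tArray[key]; }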
In this way, as long as we know the key, we can access the data of type T directly instead of searching for it in a data structure such as a linked list. The key itself is usually computed by some algorithm (the so-called hash algorithm). For example, a hash of a string:

    const char* value = "hello";
    int key = (((((((27 * (int)'h' + 27) * (int)'e') + 27) * (int)'l') + 27) * (int)'l' + 27) * 27) + (int)'o';

Note that for "hello" the intermediate products already exceed the range of a 32-bit int, so practical implementations use unsigned arithmetic or reduce the value modulo the table size.

How a hash function processes its input

A hash function takes an input of arbitrary length (also called the pre-image) and, through a hash algorithm, transforms it into a fixed-length output; that output is the hash value. This conversion is a compressing map: the space of hash values is usually much smaller than the space of inputs, so different inputs may hash to the same output, and the input cannot be uniquely determined from the hash value. Simply put, a hash function compresses a message of any length into a fixed-length message digest. It is not encryption, because the input cannot be recovered from the output.
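The idea of turning a string into an array index can be sketched as follows. This is a minimal illustration, not the exact expression above: it swaps in a conventional base-27 polynomial hash evaluated with Horner's rule, uses unsigned arithmetic so overflow wraps instead of being undefined, and reduces the result modulo the table size. The names `poly_hash` and `slot_for` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: base-27 polynomial hash, h = h * 27 + c per character.
// Unsigned arithmetic wraps modulo 2^32, so overflow is well defined.
std::uint32_t poly_hash(const std::string& s, std::uint32_t base = 27) {
    std::uint32_t h = 0;
    for (unsigned char c : s) {
        h = h * base + c;
    }
    return h;
}

// Hypothetical helper: map a string to a slot of a table with table_size
// slots. Assumes table_size > 0.
std::size_t slot_for(const std::string& s, std::size_t table_size) {
    return poly_hash(s) % table_size;
}
```

With a table of 100 slots, `slot_for("ab", 100)` selects slot 17, because 97 * 27 + 98 = 2717.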

Let me use a metaphor.
We have a lot of piglets, each with a different weight. Assuming the weights are spread fairly evenly (we only consider whole kilograms), we divide the piglets by weight among 100 small pigsties.
Each pig is then driven into the pigsty that matches its weight, and this is recorded.
Now, what if we want to find a particular piglet? Do we have to search every pigsty and every piglet?
Of course not.
We look at the weight of the piglet we are looking for and go to the corresponding pigsty.
The number of piglets in that pigsty is relatively small,
so we can find the piglet we want fairly quickly.
This corresponds to the hash algorithm:
pigs are assigned to pigsties according to their hashCode, and pigs with the same hashCode are put in the same pigsty.
When searching, we first find the pigsty that corresponds to the hashCode, and then compare the pigs inside it one by one.
So the crux of the matter is how many pigsties it is appropriate to build.
If every pig's weight is different (measured down to the milligram) and each weight gets its own pigsty, we can find any pig at the fastest possible speed. The disadvantage is that building so many pigsties costs too much.
If we divide by 10-kilogram ranges instead, only a few pigsties are built, so each one holds a lot of pigs. We can find the right pigsty very quickly, but picking out the pig inside it is tiring.
So a good hashCode strikes a balance, according to the actual situation and the specific needs, between time cost (more pigsties, faster lookup) and space cost (fewer pigsties, lower space demand).
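The metaphor maps fairly directly onto a hash table with separate chaining. The sketch below is only an illustration under stated assumptions: `Pig`, `PigFarm`, and `sty_of` are hypothetical names, weights are non-negative whole kilograms, the number of pigsties is greater than zero, and the "hash" is simply the weight modulo the number of pigsties.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a pig identified by name, with a weight in kilograms.
struct Pig {
    std::string name;
    int weight_kg;
};

class PigFarm {
public:
    // sty_count is the number of buckets; assumed to be greater than zero.
    explicit PigFarm(std::size_t sty_count) : sties_(sty_count) {}

    void add(const Pig& pig) { sties_[sty_of(pig.weight_kg)].push_back(pig); }

    // Looks only inside the one pigsty selected by the weight.
    // If inspected is non-null, it receives the number of pigs compared.
    const Pig* find(const std::string& name, int weight_kg,
                    std::size_t* inspected = nullptr) const {
        const std::vector<Pig>& sty = sties_[sty_of(weight_kg)];
        std::size_t seen = 0;
        const Pig* found = nullptr;
        for (const Pig& p : sty) {
            ++seen;
            if (p.name == name) { found = &p; break; }
        }
        if (inspected != nullptr) *inspected = seen;
        return found;
    }

private:
    // The "hash": weight modulo the number of pigsties (weight assumed >= 0).
    std::size_t sty_of(int weight_kg) const {
        return static_cast<std::size_t>(weight_kg) % sties_.size();
    }

    std::vector<std::vector<Pig>> sties_;
};
```

Constructing the farm with more pigsties shortens each chain (fewer comparisons per lookup) at the cost of more memory, which is the trade-off described above.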

There are many kinds of hash algorithms; see also the analysis of hash algorithms I wrote earlier. The Java class near the end of this post collects many commonly used hash algorithms and should meet the needs of most people.

Commonly used string hash functions such as ELFHash and APHash are simple and effective. They use bitwise operations so that every character affects the final hash value. There are also hash functions typified by MD5 and SHA1, which were designed so that collisions would be computationally infeasible to find (practical collisions have since been demonstrated for both, so neither should be relied on for security).


Commonly used string hash functions include BKDRHash, APHash, DJBHash, JSHash, RSHash, SDBMHash, PJWHash, ELFHash, and so on. I ran a small evaluation of the hash functions listed above.





Hash function  Data 1  Data 2  Data 3  Data 4  Data 1 score  Data 2 score  Data 3 score  Data 4 score  Average score
BKDRHash       2       0       4774    481     96.55         100           90.95         82.05         92.64
APHash         2       3       4754    493     96.55         88.46         100           51.28         86.28
DJBHash        2       2       4975    474     96.55         92.31         0             100           83.43
JSHash         1       4       4761    506     100           84.62         96.83         17.95         81.94
RSHash         1       0       4861    505     100           100           51.58         20.51         75.96
SDBMHash       3       2       4849    504     93.1          92.31         57.01         23.08         72.41
PJWHash        30      26      4878    513     0             0             43.89         0             21.95
ELFHash        30      26      4878    513     0             0             43.89         0             21.95



Here, Data 1 is the number of hash collisions among 100,000 random strings made up of letters and digits. Data 2 is the number of hash collisions among 100,000 meaningful English sentences. Data 3 is the number of collisions when the hash values from Data 1 are reduced modulo 1000003 (a large prime) and stored in a linear table. Data 4 is the number of collisions when the hash values from Data 1 are reduced modulo 10000019 (a larger prime) and stored in a linear table.
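A measurement of this kind can be sketched as follows. The original does not say exactly how collisions were counted, so this is an assumption: a key counts as a collision when its hash (optionally reduced modulo a prime) was already produced by an earlier key, and all keys are assumed to be distinct. The names `bkdr_hash` and `count_collisions` are hypothetical; the hash is BKDR with seed 131 on 32-bit unsigned integers, without the final `% M` used in the C++ listing.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// BKDR hash with seed 131; unsigned arithmetic wraps modulo 2^32.
std::uint32_t bkdr_hash(const std::string& s) {
    std::uint32_t h = 0;
    for (unsigned char c : s) {
        h = h * 131u + c;
    }
    return h;
}

// Counts keys whose hash value was already produced by an earlier key.
// modulus == 0 means "compare the full 32-bit hash"; otherwise the hash
// is first reduced modulo the given value (for example 1000003).
// Assumes the keys themselves are distinct.
std::size_t count_collisions(const std::vector<std::string>& keys,
                             std::uint32_t modulus = 0) {
    std::unordered_set<std::uint32_t> seen;
    std::size_t collisions = 0;
    for (const std::string& key : keys) {
        std::uint32_t h = bkdr_hash(key);
        if (modulus != 0) {
            h %= modulus;
        }
        if (!seen.insert(h).second) {
            ++collisions;   // this slot was already taken
        }
    }
    return collisions;
}
```

Calling `count_collisions(keys)` corresponds to the Data 1 and Data 2 style of measurement, and `count_collisions(keys, 1000003)` to the Data 3 style.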


Comparing the results gives the average scores above. The average is the quadratic mean (root mean square) of the four scores. BKDRHash stands out both in measured effectiveness and in ease of implementation. APHash is also an excellent algorithm. DJBHash, JSHash, RSHash, and SDBMHash each have their own strengths. PJWHash and ELFHash perform the worst; their scores are identical because the two algorithms are essentially the same.
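The arithmetic behind the table can be reproduced with a short sketch. The quadratic mean is stated in the text. The per-column scoring formula is not stated, but the published scores are consistent with min-max normalization, 100 * (worst - x) / (worst - best), so that formula is used here as an inference from the table. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Inferred scoring (an assumption, not stated in the post): the fewest
// collisions in a column score 100, the most collisions score 0.
std::vector<double> min_max_scores(const std::vector<double>& collisions) {
    std::vector<double> scores;
    if (collisions.empty()) return scores;
    const auto mm = std::minmax_element(collisions.begin(), collisions.end());
    const double best = *mm.first;
    const double worst = *mm.second;
    for (double c : collisions) {
        scores.push_back(worst == best
                             ? 100.0
                             : 100.0 * (worst - c) / (worst - best));
    }
    return scores;
}

// Quadratic mean (root mean square) of the given values.
double quadratic_mean(const std::vector<double>& xs) {
    if (xs.empty()) return 0.0;
    double sum_sq = 0.0;
    for (double x : xs) {
        sum_sq += x * x;
    }
    return std::sqrt(sum_sq / static_cast<double>(xs.size()));
}
```

For example, the Data 1 column {2, 2, 2, 1, 1, 3, 30, 30} gives BKDRHash a score of 100 * 28 / 29, about 96.55, and the quadratic mean of its four scores (96.55, 100, 90.95, 82.05) is about 92.64, matching the table.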


In informatics competitions, where code should be easy to write and debug, I personally think BKDRHash is the easiest to memorize and use.

C++ implementations of the various hash algorithms:

    #define M 249997
    #define M1 1000001
    #define M2 0xF0000000

    // RS Hash Function
    unsigned int RSHash(const char* str)
    {
        unsigned int b = 378551;
        unsigned int a = 63689;
        unsigned int hash = 0;

        while (*str)
        {
            hash = hash * a + (*str++);
            a *= b;
        }
        return (hash % M);
    }

    // JS Hash Function
    unsigned int JSHash(const char* str)
    {
        unsigned int hash = 1315423911;

        while (*str)
        {
            hash ^= ((hash << 5) + (*str++) + (hash >> 2));
        }
        return (hash % M);
    }

    // P. J. Weinberger Hash Function
    unsigned int PJWHash(const char* str)
    {
        unsigned int BitsInUnsignedInt = (unsigned int)(sizeof(unsigned int) * 8);
        unsigned int ThreeQuarters = (unsigned int)((BitsInUnsignedInt * 3) / 4);
        unsigned int OneEighth = (unsigned int)(BitsInUnsignedInt / 8);
        unsigned int HighBits = (unsigned int)(0xFFFFFFFF) << (BitsInUnsignedInt - OneEighth);
        unsigned int hash = 0;
        unsigned int test = 0;

        while (*str)
        {
            hash = (hash << OneEighth) + (*str++);
            if ((test = hash & HighBits) != 0)
            {
                hash = ((hash ^ (test >> ThreeQuarters)) & (~HighBits));
            }
        }
        return (hash % M);
    }

    // ELF Hash Function
    unsigned int ELFHash(const char* str)
    {
        unsigned int hash = 0;
        unsigned int x = 0;

        while (*str)
        {
            hash = (hash << 4) + (*str++);
            if ((x = hash & 0xF0000000u) != 0)
            {
                hash ^= (x >> 24);
                hash &= ~x;
            }
        }
        return (hash % M);
    }

    // BKDR Hash Function
    unsigned int BKDRHash(const char* str)
    {
        unsigned int seed = 131; // 31 131 1313 13131 131313 etc.
        unsigned int hash = 0;

        while (*str)
        {
            hash = hash * seed + (*str++);
        }
        return (hash % M);
    }

    // SDBM Hash Function
    unsigned int SDBMHash(const char* str)
    {
        unsigned int hash = 0;

        while (*str)
        {
            hash = (*str++) + (hash << 6) + (hash << 16) - hash;
        }
        return (hash % M);
    }

    // DJB Hash Function
    unsigned int DJBHash(const char* str)
    {
        unsigned int hash = 5381;

        while (*str)
        {
            hash += (hash << 5) + (*str++);
        }
        return (hash % M);
    }

    // AP Hash Function
    unsigned int APHash(const char* str)
    {
        unsigned int hash = 0;
        int i;

        for (i = 0; *str; i++)
        {
            if ((i & 1) == 0)
            {
                hash ^= ((hash << 7) ^ (*str++) ^ (hash >> 3));
            }
            else
            {
                // The '~' follows the commonly published AP hash.
                hash ^= (~((hash << 11) ^ (*str++) ^ (hash >> 5)));
            }
        }
        return (hash % M);
    }

Java implementations of the various hash algorithms:

    /**
     * Collection of hash algorithms <br>
     * The FNV1 algorithm is recommended
     * @algorithm None
     * @author Goodzzp 2006-11-20
     * @lastEdit Goodzzp 2006-11-20
     * @editDetail Create
     */
    public class HashAlgorithms
    {
        /**
         * Additive hash
         * @param key input string
         * @param prime a prime number
         * @return hash result
         */
        public static int additiveHash(String key, int prime)
        {
            int hash, i;
            for (hash = key.length(), i = 0; i < key.length(); i++)
                hash += key.charAt(i);
            return (hash % prime);
        }

        /**
         * Rotating hash
         * @param key input string
         * @param prime a prime number
         * @return hash value
         */
        public static int rotatingHash(String key, int prime)
        {
            int hash, i;
            for (hash = key.length(), i = 0; i < key.length(); ++i)
                hash = (hash << 4) ^ (hash >> 28) ^ key.charAt(i);
            return (hash % prime);
            // return (hash ^ (hash>>10) ^ (hash>>20));
        }

        // Alternative:
        // use:        hash = (hash ^ (hash>>10) ^ (hash>>20)) & mask;
        // instead of: hash %= prime;

        /**
         * MASK value; any value will do, preferably a prime number
         */
        static int M_MASK = 0x8765fed1;

        /**
         * One-at-a-time hash
         * @param key input string
         * @return output hash value
         */
        public static int oneByOneHash(String key)
        {
            int hash, i;
            for (hash = 0, i = 0; i < key.length(); ++i)
            {
                hash += key.charAt(i);
                hash += (hash << 10);
                hash ^= (hash >> 6);
            }
            hash += (hash << 3);
            hash ^= (hash >> 11);
            hash += (hash << 15);
            // return (hash & M_MASK);
            return hash;
        }

        /**
         * Bernstein's hash
         * @param key input string
         * @return result hash
         */
        public static int bernstein(String key)
        {
            int hash = 0;
            int i;
            for (i = 0; i < key.length(); ++i)
                hash = 33 * hash + key.charAt(i);
            return hash;
        }

        //
        // Pearson's hash
        // char pearson(char[] key, ub4 len, char tab[256])
        // {
        //     char hash;
        //     ub4 i;
        //     for (hash=len, i=0; i<len; ++i)
        //         hash = tab[hash^key[i]];
        //     return (hash);
        // }

        // CRC hashing: computes a CRC; see other sources for the full code
        // ub4 crc(char *key, ub4 len, ub4 mask, ub4 tab[256])
        // {
        //     ub4 hash, i;
        //     for (hash=len, i=0; i<len; ++i)
        //         hash = (hash >> 8) ^ tab[(hash & 0xff) ^ key[i]];
        //     return (hash & mask);
        // }

        /**
         * Universal hashing
         */
        public static int universal(char[] key, int mask, int[] tab)
        {
            int hash = key.length, i, len = key.length;
            for (i = 0; i < (len << 3); i += 8)
            {
                char k = key[i >> 3];
                if ((k & 0x01) == 0) hash ^= tab[i + 0];
                if ((k & 0x02) == 0) hash ^= tab[i + 1];
                if ((k & 0x04) == 0) hash ^= tab[i + 2];
                if ((k & 0x08) == 0) hash ^= tab[i + 3];
                if ((k & 0x10) == 0) hash ^= tab[i + 4];
                if ((k & 0x20) == 0) hash ^= tab[i + 5];
                if ((k & 0x40) == 0) hash ^= tab[i + 6];
                if ((k & 0x80) == 0) hash ^= tab[i + 7];
            }
            return (hash & mask);
        }

        /**
         * Zobrist hashing
         */
        public static int zobrist(char[] key, int mask, int[][] tab)
        {
            int hash, i;
            for (hash = key.length, i = 0; i < key.length; ++i)
                hash ^= tab[i][key[i]];
            return (hash & mask);
        }

        // LOOKUP3
        // See Bob Jenkins' (3).c file

        // Shift used by the 32-bit FNV algorithm
        static int M_SHIFT = 0;

        /**
         * 32-bit FNV algorithm
         * @param data input byte array
         * @return int value
         */
        public static int FNVHash(byte[] data)
        {
            int hash = (int) 2166136261L;
            for (byte b : data)
                hash = (hash * 16777619) ^ b;
            if (M_SHIFT == 0)
                return hash;
            return (hash ^ (hash >> M_SHIFT)) & M_MASK;
        }

        /**
         * Improved 32-bit FNV algorithm 1
         * @param data input byte array
         * @return int value
         */
        public static int FNVHash1(byte[] data)
        {
            final int p = 16777619;
            int hash = (int) 2166136261L;
            for (byte b : data)
                hash = (hash ^ b) * p;
            hash += hash << 13;
            hash ^= hash >> 7;
            hash += hash << 3;
            hash ^= hash >> 17;
            hash += hash << 5;
            return hash;
        }

        /**
         * Improved 32-bit FNV algorithm 1
         * @param data input string
         * @return int value
         */
        public static int FNVHash1(String data)
        {
            final int p = 16777619;
            int hash = (int) 2166136261L;
            for (int i = 0; i < data.length(); i++)
                hash = (hash ^ data.charAt(i)) * p;
            hash += hash << 13;
            hash ^= hash >> 7;
            hash += hash << 3;
            hash ^= hash >> 17;
            hash += hash << 5;
            return hash;
        }

        /**
         * Thomas Wang's algorithm, integer hash
         */
        public static int intHash(int key)
        {
            key += ~(key << 15);
            key ^= (key >>> 10);
            key += (key << 3);
            key ^= (key >>> 6);
            key += ~(key << 11);
            key ^= (key >>> 16);
            return key;
        }

        /**
         * RS algorithm hash
         * @param str input string
         */
        public static int RSHash(String str)
        {
            int b = 378551;
            int a = 63689;
            int hash = 0;

            for (int i = 0; i < str.length(); i++)
            {
                hash = hash * a + str.charAt(i);
                a = a * b;
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of RS Hash Function */

        /**
         * JS algorithm
         */
        public static int JSHash(String str)
        {
            int hash = 1315423911;

            for (int i = 0; i < str.length(); i++)
            {
                hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2));
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of JS Hash Function */

        /**
         * PJW algorithm
         */
        public static int PJWHash(String str)
        {
            int bitsInUnsignedInt = 32;
            int threeQuarters = (bitsInUnsignedInt * 3) / 4;
            int oneEighth = bitsInUnsignedInt / 8;
            int highBits = 0xFFFFFFFF << (bitsInUnsignedInt - oneEighth);
            int hash = 0;
            int test = 0;

            for (int i = 0; i < str.length(); i++)
            {
                hash = (hash << oneEighth) + str.charAt(i);
                if ((test = hash & highBits) != 0)
                {
                    hash = ((hash ^ (test >> threeQuarters)) & (~highBits));
                }
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of P. J. Weinberger Hash Function */

        /**
         * ELF algorithm
         */
        public static int ELFHash(String str)
        {
            int hash = 0;
            int x = 0;

            for (int i = 0; i < str.length(); i++)
            {
                hash = (hash << 4) + str.charAt(i);
                if ((x = (int) (hash & 0xF0000000L)) != 0)
                {
                    hash ^= (x >> 24);
                    hash &= ~x;
                }
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of ELF Hash Function */

        /**
         * BKDR algorithm
         */
        public static int BKDRHash(String str)
        {
            int seed = 131; // 31 131 1313 13131 131313 etc.
            int hash = 0;

            for (int i = 0; i < str.length(); i++)
            {
                hash = (hash * seed) + str.charAt(i);
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of BKDR Hash Function */

        /**
         * SDBM algorithm
         */
        public static int SDBMHash(String str)
        {
            int hash = 0;

            for (int i = 0; i < str.length(); i++)
            {
                hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash;
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of SDBM Hash Function */

        /**
         * DJB algorithm
         */
        public static int DJBHash(String str)
        {
            int hash = 5381;

            for (int i = 0; i < str.length(); i++)
            {
                hash = ((hash << 5) + hash) + str.charAt(i);
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of DJB Hash Function */

        /**
         * DEK algorithm
         */
        public static int DEKHash(String str)
        {
            int hash = str.length();

            for (int i = 0; i < str.length(); i++)
            {
                // Shift counts 5 and 27, as in the commonly published DEK hash.
                hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i);
            }
            return (hash & 0x7FFFFFFF);
        }
        /* End Of DEK Hash Function */

        /**
         * AP algorithm
         */
        public static int APHash(String str)
        {
            int hash = 0;

            for (int i = 0; i < str.length(); i++)
            {
                // The '~' follows the commonly published AP hash.
                hash ^= ((i & 1) == 0) ? ((hash << 7) ^ str.charAt(i) ^ (hash >> 3))
                                       : (~((hash << 11) ^ str.charAt(i) ^ (hash >> 5)));
            }
            // return (hash & 0x7FFFFFFF);
            return hash;
        }
        /* End Of AP Hash Function */

        /**
         * Java's own algorithm (the same recurrence as String.hashCode)
         */
        public static int java(String str)
        {
            int h = 0;
            int off = 0;
            int len = str.length();
            for (int i = 0; i < len; i++)
            {
                h = 31 * h + str.charAt(off++);
            }
            return h;
        }

        /**
         * Mixed hash algorithm, outputs a 64-bit value
         */
        public static long mixHash(String str)
        {
            long hash = str.hashCode();
            hash <<= 32;
            // Mask to 32 bits so a negative int does not sign-extend
            // and overwrite the upper half.
            hash |= FNVHash1(str) & 0xFFFFFFFFL;
            return hash;
        }
    }

  

[Translated from: http://my.oschina.net/u/2007546/blog/425681]
