Two key issues to consider in hashing: the design of the hash function and the handling of collisions
How to design a hash function:
1. Direct addressing method
Basic idea: the hash function is a linear function of the key, for example H(k) = a*k + b (a and b are constants, k is the key)
2. Division (remainder) method
Basic idea: choose an appropriate positive integer p and use the remainder of the key divided by p as the hash address, for example H(k) = k mod p
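As a minimal sketch of H(k) = k mod p, the snippet below hashes a few integer keys. The function name division_hash, the sample keys, and the choice p = 13 are assumptions made for illustration; choosing p as a prime no larger than the table length is common advice, not something the notes above require.

```python
def division_hash(key: int, p: int) -> int:
    """Division method: H(k) = k mod p."""
    if p <= 0:
        raise ValueError("p must be a positive integer")
    return key % p


# Hypothetical sample keys, hashed with p = 13.
keys = [18, 41, 22, 44, 59, 32, 31, 73]
addresses = [division_hash(k, 13) for k in keys]
print(addresses)  # [5, 2, 9, 5, 7, 6, 5, 8]
```

Keys 18, 44 and 31 all land on address 5, which is the kind of collision that has to be resolved separately.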
3. Digit analysis method
Basic idea: examine how the keys are distributed at each digit position, and build the hash address from the positions whose digits are most evenly distributed
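One way this selection could be sketched, assuming all keys are known in advance and are digit strings of equal length: score each position by how often its most common digit occurs (lower means more even) and keep the best r positions. The scoring rule, the tie-breaking by position index, the sample keys, and the names pick_positions and digit_analysis_hash are all assumptions for illustration.

```python
from collections import Counter


def pick_positions(keys: list[str], r: int) -> list[int]:
    """Return the r digit positions whose digits are most evenly spread."""
    length = len(keys[0])
    scores = []
    for pos in range(length):
        counts = Counter(k[pos] for k in keys)
        # Score = occurrences of the most common digit; lower is more even.
        scores.append((max(counts.values()), pos))
    best = sorted(scores)[:r]  # ties are broken by the smaller position
    return sorted(pos for _, pos in best)


def digit_analysis_hash(key: str, positions: list[int]) -> int:
    """Build the hash address from the digits at the chosen positions."""
    return int("".join(key[p] for p in positions))


# Hypothetical key set: the first three digits are identical in every key.
keys = ["81346532", "81372242", "81387422", "81301367",
        "81322817", "81338967", "81354157", "81368537"]
positions = pick_positions(keys, 2)
print(positions)                                         # [3, 4]
print([digit_analysis_hash(k, positions) for k in keys])
```

The constant leading digits are never selected, because they carry no information that distinguishes one key from another.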
4. Mid-square method
Basic idea: square the key, then take several digits from the middle of the square as the hash address; the number of digits taken depends on the hash table size
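A rough sketch of this idea on decimal digits follows. How "the middle" is located when the digit counts do not divide evenly is an assumption here (the window is shifted left by rounding down), as are the function name mid_square_hash and the sample keys.

```python
def mid_square_hash(key: int, digits: int) -> int:
    """Square the key and return `digits` decimal digits from the middle."""
    squared = str(key * key)
    # Pad on the left so there are always at least `digits` characters.
    squared = squared.zfill(digits)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])


print(mid_square_hash(1234, 3))  # 1234^2 = 1522756  -> 227
print(mid_square_hash(4321, 3))  # 4321^2 = 18671041 -> 671
```

Taking three digits suits a table with up to 1000 addresses; the middle digits of the square depend on every digit of the key, which is why they are preferred over the leading or trailing ones.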
5. Folding method
Basic idea: split the key from left to right into parts with an equal number of digits (the last part may be shorter), add the parts together, and keep the last several digits of the sum, as determined by the hash table length, as the hash address
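The folding steps can be sketched as below. The key is passed as a digit string so that leading zeros survive; the function name folding_hash, the part length of 4, and the sample keys are assumptions for illustration.

```python
def folding_hash(key: str, part_len: int) -> int:
    """Split the key into part_len-digit pieces from the left, sum them,
    and keep the last part_len digits of the sum."""
    parts = [int(key[i:i + part_len]) for i in range(0, len(key), part_len)]
    return sum(parts) % (10 ** part_len)


print(folding_hash("0442205864", 4))  # 0442 + 2058 + 64 = 2564  -> 2564
print(folding_hash("9876543210", 4))  # 9876 + 5432 + 10 = 15318 -> 5318
```

In the second call the carry out of the fourth digit is discarded, so the address always fits a table of 10^4 entries.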
How collisions are handled:
1. Open addressing method
Basic idea: when a collision occurs, probe other addresses in the table until a free one is found. A hash table that resolves collisions by open addressing is called a closed hash table. Depending on how the probe sequence is formed, open addressing can be divided into linear probing, quadratic probing, pseudo-random probing, and so on.
Linear probing: the next hash address is found by the formula Hi = (H(k) + di) mod m, where di = 1, 2, 3, ..., m-1 (m is the hash table length)
Quadratic probing: the next hash address is found by the formula Hi = (H(k) + di) mod m, where di = 1^2, -(1^2), 2^2, -(2^2), ..., q^2, -(q^2) and q <= m/2
Pseudo-random probing: the displacement di used to find the next hash address is taken from a pseudo-random sequence.
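The three displacement sequences di can be sketched as generators that plug into one insertion routine. This is only an illustration under stated assumptions: H(k) = k mod m is used as the hash function, the pseudo-random sequence is a seeded shuffle of 1..m-1 (the seed must stay fixed so that a later lookup retraces the same path), and all function names are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Optional


def linear_offsets(m: int) -> Iterator[int]:
    """di = 1, 2, ..., m-1"""
    yield from range(1, m)


def quadratic_offsets(m: int) -> Iterator[int]:
    """di = 1^2, -(1^2), 2^2, -(2^2), ..., q^2, -(q^2) with q <= m/2"""
    for q in range(1, m // 2 + 1):
        yield q * q
        yield -(q * q)


def pseudo_random_offsets(m: int, seed: int = 0) -> Iterator[int]:
    """A reproducible shuffle of 1..m-1 (assumed form of the random sequence)."""
    offsets = list(range(1, m))
    random.Random(seed).shuffle(offsets)
    yield from offsets


def insert(table: List[Optional[int]], key: int, offsets: Iterable[int]) -> int:
    """Place key at H(k), or at the first free Hi = (H(k) + di) mod m."""
    m = len(table)
    home = key % m  # assumed hash function H(k) = k mod m
    for addr in [home] + [(home + d) % m for d in offsets]:
        if table[addr] is None:
            table[addr] = key
            return addr
    raise RuntimeError("no free slot found along the probe sequence")


# 18, 29 and 40 all hash to address 7 when m = 11.
table = [None] * 11
linear_slots = [insert(table, k, linear_offsets(11)) for k in (18, 29, 40)]
table = [None] * 11
quadratic_slots = [insert(table, k, quadratic_offsets(11)) for k in (18, 29, 40)]
print(linear_slots)     # [7, 8, 9]
print(quadratic_slots)  # [7, 8, 6]
```

Linear probing packs the colliding keys into neighbouring addresses, while quadratic probing alternates on both sides of the home address and spreads them out.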
2. Chain address method (separate chaining)
Basic idea: all records with the same hash address, that is, all keys that are synonyms of one another, are stored in one singly linked list called a synonym chain, and the head pointers of all synonym chains are stored in the hash table. A hash table that resolves collisions by chaining is called an open hash table.
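A minimal sketch of such a table is shown below, with an array of head pointers and one singly linked list per address. The class names, the hash function H(k) = k mod m, insertion at the head of the chain, and the sample keys are assumptions made for illustration.

```python
from typing import List, Optional


class Node:
    """One record in a synonym chain."""

    def __init__(self, key: int, next_node: "Optional[Node]" = None) -> None:
        self.key = key
        self.next = next_node


class ChainedHashTable:
    def __init__(self, size: int) -> None:
        # heads[i] is the head pointer of the chain for hash address i.
        self.heads: List[Optional[Node]] = [None] * size

    def _address(self, key: int) -> int:
        return key % len(self.heads)  # assumed hash function H(k) = k mod m

    def contains(self, key: int) -> bool:
        node = self.heads[self._address(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key: int) -> None:
        if self.contains(key):
            return  # ignore duplicates
        addr = self._address(key)
        # Link the new node in at the head of its chain.
        self.heads[addr] = Node(key, self.heads[addr])

    def chain(self, addr: int) -> List[int]:
        """Return the keys stored in the chain at addr, head first."""
        keys = []
        node = self.heads[addr]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedHashTable(7)
for k in (10, 17, 24, 3, 5):
    table.insert(k)
print(table.chain(3))  # [3, 24, 17, 10]: four synonyms share address 3
print(table.chain(5))  # [5]
```

Because synonyms live outside the array itself, a chain can grow without ever displacing keys that belong to other addresses.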