All source code for this blog is hosted on GitHub.
Introduction to chained hash tables
A chained hash table is fundamentally composed of a set of linked lists. Each list can be viewed as a "bucket," and hashing distributes the elements across the buckets. To insert an element, you first pass its key to a hash function (this is called hashing the key), which tells you which bucket the element belongs to, and then insert the element at the head of the corresponding list. To look up or remove an element, you hash its key in the same way to find its bucket, then traverse that list until you find the element you are looking for.
Ways to resolve collisions
1. Open addressing
This method is also called re-hashing. The basic idea is: when the hash address p = H(key) of a key collides, a second hash address p_1 is generated based on p; if p_1 still collides, another hash address p_2 is generated based on p, and so on, until a non-colliding hash address p_i is found, and the element is stored there. The general form of the probing function is:
H_i = (H(key) + d_i) % m,  i = 1, 2, ..., n
where H(key) is the hash function, m is the table length, and d_i is the increment sequence. Different choices of increment sequence give different probing schemes. There are three main kinds:
- Linear probing
d_i = 1, 2, 3, ..., m-1
When a collision occurs, the following cells of the table are examined in order until an empty cell is found or the entire table has been searched.
- Quadratic probing
d_i = 1^2, -(1^2), 2^2, -(2^2), ..., k^2, -(k^2)  (k <= m/2)
When a collision occurs, probing jumps back and forth on both sides of the home address, which is more flexible.
- Pseudo-random probing
d_i = a pseudo-random number sequence.
To implement this, set up a pseudo-random number generator (such as i = (i + p) % m) and give it a random number as its starting point.
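The three increment sequences above can be sketched as Python generators. This is only an illustration under stated assumptions: the function names are hypothetical, and the pseudo-random variant uses Python's seeded `random.Random` in place of the simple generator mentioned in the text. The seed must be reused for lookups so that the same probe sequence is retraced.

```python
import random
from itertools import islice


def linear_increments(m):
    """d_i = 1, 2, ..., m-1"""
    return iter(range(1, m))


def quadratic_increments(m):
    """d_i = 1^2, -(1^2), 2^2, -(2^2), ..., k^2, -(k^2) with k <= m/2"""
    for k in range(1, m // 2 + 1):
        yield k * k
        yield -(k * k)


def pseudo_random_increments(m, seed):
    """Endless d_i drawn from a seeded PRNG (assumption: any repeatable
    generator works, as long as lookups reuse the same seed)."""
    rng = random.Random(seed)
    while True:
        yield rng.randrange(1, m)


def probe_addresses(home, increments, m):
    """H_i = (H(key) + d_i) % m for each increment d_i.

    Python's % always returns a non-negative result for positive m,
    so negative increments wrap around the table correctly."""
    for d in increments:
        yield (home + d) % m


# First three linear probe addresses for home address 3 in a table of 11 cells.
first_three = list(islice(probe_addresses(3, linear_increments(11), 11), 3))
print(first_three)  # [4, 5, 6]
```

Note that in languages where `%` can return a negative value (C, for example), the quadratic case would need an extra `+ m` before taking the remainder.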
For example, suppose the hash table length is m = 11 and the hash function is H(key) = key % 11. Then H(47) = 3, H(26) = 4, and H(60) = 5. Suppose the next key is 69: H(69) = 3, which collides with 47.
If the collision is handled with linear probing, the next hash address is H_1 = (3 + 1) % 11 = 4, which still collides; the next is H_2 = (3 + 2) % 11 = 5, which also collides; the next is H_3 = (3 + 3) % 11 = 6, which is free, so 69 is stored in cell 6. See Figure 8.26(a).
If quadratic probing is used, the next hash address is H_1 = (3 + 1^2) % 11 = 4, which still collides; the next is H_2 = (3 - 1^2) % 11 = 2, which is free, so 69 is stored in cell 2. See Figure 8.26(b).
If pseudo-random probing is used and the pseudo-random number sequence is 2, 5, 9, ..., the next hash address is H_1 = (3 + 2) % 11 = 5, which still collides; the next is H_2 = (3 + 5) % 11 = 8, which is free, so 69 is stored in cell 8.
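The worked example can be checked with a short Python sketch. The function names are hypothetical, and the key 26 used to occupy cell 4 is an assumption (any key with key % 11 == 4 behaves the same way); cells 3 and 5 hold 47 and 60 as in the text.

```python
def insert_open_addressing(table, key, increments):
    """Insert key using H_i = (H(key) + d_i) % m and return the cell used."""
    m = len(table)
    home = key % m  # H(key) = key % m, as in the example
    if table[home] is None:
        table[home] = key
        return home
    for d in increments:
        addr = (home + d) % m
        if table[addr] is None:
            table[addr] = key
            return addr
    raise RuntimeError("no free cell found along this probe sequence")


def example_table():
    """Table of length 11 with cells 3, 4 and 5 already occupied."""
    table = [None] * 11
    for key in (47, 26, 60):  # 26 is an assumed key for cell 4
        table[key % 11] = key
    return table


linear = insert_open_addressing(example_table(), 69, range(1, 11))
quadratic = insert_open_addressing(example_table(), 69, [1, -1, 4, -4, 9, -9])
pseudo = insert_open_addressing(example_table(), 69, [2, 5, 9])
print(linear, quadratic, pseudo)  # 6 2 8
```

Each call starts from a fresh copy of the table, so the three strategies are compared on the same starting state.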
2. Re-hashing method
This approach is to construct several different hash functions at the same time:
H_i = RH_i(key),  i = 1, 2, ..., k
When the hash address H_1 = RH_1(key) collides, H_2 = RH_2(key) is computed, and so on, until no collision occurs. This method is less prone to clustering, but it increases computation time.
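A minimal sketch of this idea in Python follows. The three hash functions `rh1`, `rh2`, and `rh3` are arbitrary illustrative choices, not functions given in the text; a real table would pick functions suited to its keys.

```python
def rh1(key, m):
    return key % m


def rh2(key, m):
    return (key // m) % m


def rh3(key, m):
    return (key * 7 + 1) % m


# RH_1 ... RH_k, tried in this order (k = 3 here, an assumption).
HASH_FUNCTIONS = (rh1, rh2, rh3)


def insert_rehash(table, key):
    """Try each hash function in turn until one yields a free cell.

    Returns (i, address), where i is the index of the function that succeeded."""
    m = len(table)
    for i, rh in enumerate(HASH_FUNCTIONS, start=1):
        addr = rh(key, m)
        if table[addr] is None:
            table[addr] = key
            return i, addr
    raise RuntimeError("all k hash functions collided")


table = [None] * 11
print(insert_rehash(table, 47))   # (1, 3)
print(insert_rehash(table, 69))   # (2, 6): RH_1 gives 3, already taken
print(insert_rehash(table, 190))  # (3, 0): RH_1 gives 3 and RH_2 gives 6, both taken
```

A lookup has to try the same functions in the same order, which is where the extra computation time comes from.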
3. Chain address method
The basic idea of this method is to link all elements whose hash address is i into a singly linked list called a synonym chain, and to store the head pointer of that list in the i-th cell of the hash table. Lookup, insertion, and deletion are therefore carried out mainly within the synonym chain. The chain address method suits situations where insertions and deletions are frequent.
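The chain address method might be sketched in Python as follows. The class and method names are hypothetical, integer keys are assumed, and H(key) = key % m is used only for illustration. New elements are inserted at the head of their chain.

```python
class Node:
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m  # cell i holds the head pointer of synonym chain i

    def _hash(self, key):
        return key % self.m

    def contains(self, key):
        node = self.heads[self._hash(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key):
        """Insert at the head of the chain; return False if key is already present."""
        if self.contains(key):
            return False
        i = self._hash(key)
        self.heads[i] = Node(key, self.heads[i])
        return True

    def remove(self, key):
        """Unlink key from its chain; return False if it was not found."""
        i = self._hash(key)
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False

    def chain(self, i):
        """Return the keys of synonym chain i, head first."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedHashTable(11)
table.insert(47)
table.insert(69)       # 69 % 11 == 3, same chain as 47
print(table.chain(3))  # [69, 47]
```

Because a deletion only relinks one chain, no other cell of the table is disturbed, which is one reason this method handles frequent insertions and deletions well.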
4. Establishing a public overflow area
The basic idea of this method is to divide the hash table into two parts, a basic table and an overflow table; every element that collides with an entry in the basic table is placed in the overflow table.
Methods of constructing the hash table
Direct addressing method
For example: consider a population statistics table covering ages 1 to 100, in which age is the key and the hash function is the key itself.
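The age-table example, where the hash function is the key itself, can be sketched in a few lines of Python. The function name and the population counts below are made up purely for illustration.

```python
def build_age_table(records, max_age=100):
    """Store each count at the cell whose index equals the age: H(key) = key."""
    table = [None] * (max_age + 1)  # cell 0 is unused so that index == age
    for age, count in records:
        if not 1 <= age <= max_age:
            raise ValueError(f"age out of range: {age}")
        table[age] = count
    return table


# Hypothetical (age, population) records, not real data.
table = build_age_table([(1, 3000), (25, 1500), (100, 2)])
print(table[25])  # 1500
```

Since every key maps to its own cell, no collisions can occur, at the cost of reserving one cell for every possible key.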
Digit analysis method
For example, suppose the students' birthday data is as follows: