There are two strategies for resolving hash table collisions. The first is open addressing: if the position in the array is already occupied, a new position is selected for the data. The second is separate chaining: a linked list is placed at each position of the array.
1. Open addressing
If the location that the data is mapped to is occupied, this strategy selects a new location for the data. There are three methods for choosing the new location:
A. Linear probing
This method starts at the current position and searches linearly for a free position. If the hash function selects position n for the data but position n is occupied, the method tries n + 1, n + 2, ... until it finds a free position. If it reaches the end of the array, it continues searching from the beginning of the array. The process of trying different locations is called probing.
It is called linear probing because the process resembles a linear search.
Searching the hash table uses the same probing. If the hash function maps the key to position x and position x contains other data, the search continues with x + 1, x + 2, ... until the target element is found, an empty position x + k is reached, or the search wraps around back to position x.
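The insertion and search procedures described above can be sketched as follows. This is a minimal illustration, not a complete implementation: the class name LinearProbingTable, the use of key % size as the hash function, and the restriction to non-negative integer keys without deletion are all assumptions made for the example.

```python
class LinearProbingTable:
    """Toy open-addressing table for non-negative integer keys (no deletion)."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # None marks a free position

    def _hash(self, key):
        # Assumed hash function, chosen only for illustration.
        return key % self.size

    def insert(self, key):
        start = self._hash(key)
        for step in range(self.size):
            # Try start, start + 1, start + 2, ... wrapping to the array start.
            pos = (start + step) % self.size
            if self.slots[pos] is None or self.slots[pos] == key:
                self.slots[pos] = key
                return pos
        raise RuntimeError("table is full")

    def find(self, key):
        start = self._hash(key)
        for step in range(self.size):
            pos = (start + step) % self.size
            if self.slots[pos] is None:
                return None  # reached a free position: the key is absent
            if self.slots[pos] == key:
                return pos
        return None  # probed every position and wrapped back to the start


table = LinearProbingTable(7)
# 10, 17 and 24 all map to position 3, so they land in 3, 4 and 5.
positions = [table.insert(k) for k in (10, 17, 24, 6)]
print(positions)       # [3, 4, 5, 6]
print(table.find(24))  # 5
print(table.find(31))  # None
```

The search for 31 starts at position 3, steps through the occupied positions 3 to 6, wraps around to position 0, finds it free, and stops there.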
The problem with linear probing is clustering, where data elements accumulate in contiguous blocks. If you insert a value that maps to position x but x is occupied, the insertion method tries x + 1. If the next element maps to x + 1 and that position is now occupied, it must probe again. The data ends up packed into one block. In practice, if the inserted keys are similar, clusters form, which reduces the efficiency of searching and inserting data.
One solution to this problem is to probe farther ahead, not just one element forward. In quadratic probing, the insertion method tries x + 1^2, x + 2^2, x + 3^2, ... until it finds a free position.
B. Quadratic probing
The benefit of quadratic probing is that it reduces clustering: because the probing offset grows as n^2 rather than by 1 at each step, colliding data is not packed into adjacent positions. However, this method does not help when many keys map to the same position. All of those keys try the same sequence x + 1, x + 4, x + 9, ..., so each new key has to step through every position already taken by the earlier ones, which slows down searching.
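A small sketch of the probe sequence may make this concrete. The function names, the key % size hash function, and the table size of 11 are assumptions chosen for the example; the sketch also gives up after a fixed number of steps, since a quadratic sequence is not guaranteed to visit every position.

```python
def quadratic_probe_sequence(start, size, max_steps):
    """Positions tried after `start` turns out to be occupied."""
    return [(start + step * step) % size for step in range(1, max_steps + 1)]


def quadratic_insert(slots, key):
    """Insert a non-negative integer key using quadratic probing."""
    size = len(slots)
    start = key % size  # assumed hash function, for illustration only
    for step in range(size):
        pos = (start + step * step) % size  # offsets 0, 1, 4, 9, ...
        if slots[pos] is None:
            slots[pos] = key
            return pos
    raise RuntimeError("no free position found in this probe sequence")


slots = [None] * 11
# 3, 14 and 25 all map to position 3 and follow the same probe sequence.
positions = [quadratic_insert(slots, k) for k in (3, 14, 25)]
print(positions)                           # [3, 4, 7]
print(quadratic_probe_sequence(3, 11, 3))  # [4, 7, 1]
```

The three colliding keys are spread over positions 3, 4 and 7 instead of one contiguous block, but the third key still had to step over both positions taken by the first two.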
C. Double hashing
In double hashing, the probing offset depends on the key. When the key maps to position x and x is occupied, a second hash function is applied to the key to obtain y, and the method tries x + y, x + 2y, x + 3y, ... until it finds a position where the data can be inserted. The second hash function must meet two requirements: its value is always greater than or equal to 1 (an offset of 0 would never leave position x), and its result for a key differs from that of the first hash function. Generally, the second hash function is written as follows:
probe offset = C - (key % C), where C is a constant smaller than the array size. For non-negative keys the result always lies between 1 and C.
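The formula above can be sketched as follows. The function names, the constant C = 5, and the table size of 11 are assumptions for the example. The table size is deliberately prime so that every offset between 1 and C eventually visits every position; the source does not state this requirement.

```python
def probe_offset(key, c=5):
    """Second hash function: always in the range 1..c for non-negative keys."""
    return c - (key % c)


def double_hash_insert(slots, key, c=5):
    """Insert a non-negative integer key using double hashing."""
    size = len(slots)
    start = key % size           # first hash function (assumed)
    step = probe_offset(key, c)  # key-dependent step size
    for i in range(size):
        pos = (start + i * step) % size  # x, x + y, x + 2y, ...
        if slots[pos] is None:
            slots[pos] = key
            return pos
    raise RuntimeError("no free position found")


slots = [None] * 11
# All four keys map to position 3, but each gets its own step size.
positions = [double_hash_insert(slots, k) for k in (3, 14, 25, 36)]
print([probe_offset(k) for k in (14, 25, 36)])  # [1, 5, 4]
print(positions)                                # [3, 4, 8, 7]
```

Unlike quadratic probing, keys that collide at the same position follow different probe sequences, so each of the later keys finds a free position on its first probe.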
2. Separate chaining
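This method stores a linked list at every position of the array. Data that maps to the same position is added to the list at that position, so no other position has to be probed.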
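A minimal sketch of separate chaining is shown below. The class names Node and ChainedTable, the key % size hash function, and the choice to add new nodes at the head of each list are assumptions made for the example.

```python
class Node:
    """One entry in a bucket's singly linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedTable:
    def __init__(self, size):
        self.heads = [None] * size  # one linked list head per array position

    def _hash(self, key):
        return key % len(self.heads)  # assumed hash function

    def insert(self, key, value):
        index = self._hash(key)
        node = self.heads[index]
        while node is not None:
            if node.key == key:  # key already present: update in place
                node.value = value
                return
            node = node.next
        # Key not found: add a new node at the head of the list.
        self.heads[index] = Node(key, value, self.heads[index])

    def find(self, key):
        node = self.heads[self._hash(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None

    def chain_keys(self, index):
        """Keys stored in the list at `index`, from head to tail."""
        keys = []
        node = self.heads[index]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


table = ChainedTable(7)
for key, value in ((10, "a"), (17, "b"), (24, "c")):
    table.insert(key, value)  # all three keys map to position 3
print(table.chain_keys(3))  # [24, 17, 10]
print(table.find(17))       # b
print(table.find(31))       # None
```

A lookup only walks the list at one array position, so collisions at position 3 do not affect searches that map to any other position.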