Hash table collision handling

Source: Internet
Author: User

There are two strategies for resolving collisions in a hash table. The first is open addressing: if the slot that a key hashes to is already occupied, another slot in the array is chosen for the new item. The second is separate chaining: each slot of the array holds a linked list of the items that hash to it.

1. Open addressing

If the slot that the current item hashes to is occupied, this strategy selects a different slot for it. There are three ways to choose that slot:

A. Linear probing

This method starts at the hashed position and searches linearly for a free slot. If the hash function selects position n for the current item but position n is occupied, the method tries n + 1, n + 2, and so on until it finds a free slot. If it reaches the end of the array, it continues from the beginning of the array. The process of trying successive positions is called probing.

It is called linear probing because the process resembles a linear search.

A search must follow the same probe sequence as insertion. If the hash function maps the key to position x and position x holds a different item, the search continues with x + 1, x + 2, and so on until it finds the target item, reaches an empty position x + k (which means the key is not in the table), or wraps around to position x again.
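The insertion and search procedures for linear probing can be sketched as follows. This is a minimal illustration, not a complete hash table: it assumes non-negative integer keys, uses `key % size` as a stand-in hash function, marks free slots with `None`, and does not support deletion. The names `lp_insert` and `lp_search` are made up for this example.

```python
def lp_insert(table, key):
    """Insert key with linear probing and return the slot it ends up in."""
    size = len(table)
    start = key % size                  # assumed hash function
    for k in range(size):
        slot = (start + k) % size       # wrap around at the end of the array
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def lp_search(table, key):
    """Return the slot that holds key, or -1 if key is not in the table."""
    size = len(table)
    start = key % size
    for k in range(size):
        slot = (start + k) % size
        if table[slot] is None:         # empty slot: key was never inserted
            return -1
        if table[slot] == key:
            return slot
    return -1                           # probed every slot and came back to start


table = [None] * 7
slots = [lp_insert(table, key) for key in (3, 10, 17, 6, 13)]
print(slots)                  # [3, 4, 5, 6, 0]
print(lp_search(table, 17))   # 5
print(lp_search(table, 24))   # -1
```

Keys 3, 10 and 17 all hash to slot 3, so they end up in slots 3, 4 and 5. Key 13 hashes to the occupied slot 6 and wraps around to slot 0.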

The problem with linear probing is clustering: occupied slots build up in contiguous blocks. If you insert a value that maps to position x and x is occupied, the insertion tries x + 1. If the next item maps to x + 1, which is now occupied, it has to probe further as well, and the items pile up into one run. In practice, when the inserted keys are similar and hash to nearby positions, clusters form and both searching and inserting become slower.
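The cost of a cluster can be seen by counting how many slots an insertion has to examine. The sketch below assumes non-negative integer keys, `key % size` as the hash function, and `None` for free slots. The helper `probes_needed` is a hypothetical name used only here.

```python
def probes_needed(table, key):
    """Insert key with linear probing; return the number of slots examined."""
    size = len(table)
    start = key % size                  # assumed hash function
    for k in range(size):
        slot = (start + k) % size
        if table[slot] is None:
            table[slot] = key
            return k + 1
    raise RuntimeError("hash table is full")


table = [None] * 11
# Keys 22, 23, 24, 25 hash to slots 0, 1, 2, 3 and form one contiguous block.
counts = [probes_needed(table, key) for key in (22, 23, 24, 25)]
# Key 33 also hashes to slot 0 and has to walk past the whole block.
late = probes_needed(table, 33)
print(counts, late)   # [1, 1, 1, 1] 5
```

Each of the first four keys lands in its own slot on the first probe, but the fifth key needs five probes and makes the block one slot longer for every key that follows.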

The solution to this problem is to move further ahead on each probe rather than one slot at a time. In quadratic probing, the insertion tries x + 1^2, x + 2^2, x + 3^2, and so on until it finds a free slot.

B. Quadratic probing

The benefit of quadratic probing is that it reduces clustering: because the offset of the kth probe is k^2 rather than k, colliding items are not packed next to each other. However, it does not help when many keys map to the same initial position. All of those keys follow the same sequence x + 1, x + 4, x + 9, and so on, so each new key has to step past all of the earlier ones, which slows down searching.
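A minimal sketch of quadratic probing insertion, under the same assumptions as before: non-negative integer keys, `key % size` as the hash function, and `None` for free slots. The function name `qp_insert` is illustrative. Unlike linear probing, the quadratic sequence does not necessarily visit every slot, so this sketch gives up after `size` probes even if free slots remain. A prime table size that is kept at most half full is the usual way to guarantee that a free slot is found.

```python
def qp_insert(table, key):
    """Insert key with quadratic probing and return the slot it ends up in."""
    size = len(table)
    start = key % size                    # assumed hash function
    for k in range(size):
        slot = (start + k * k) % size     # offsets 0, 1, 4, 9, ...
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found on the probe sequence")


table = [None] * 11
slots = [qp_insert(table, key) for key in (5, 16, 27, 38)]
print(slots)   # [5, 6, 9, 3]
```

All four keys hash to slot 5. They are spread over slots 5, 6, 9 and 3 instead of forming one block, but every one of them walks the same sequence, so the fourth key still needs four probes.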

C. Double hashing

In double hashing, the probe offset depends on the key. When the key maps to position x and x is occupied, a second hash function is applied to the key to obtain y, and the insertion tries x + y, x + 2y, x + 3y, and so on until it finds a free slot. The second hash function must meet two requirements: its value must always be greater than or equal to 1, and it must not be the same function as the first hash function, so that keys that collide at x usually get different offsets. Generally, the second hash function is written as follows:

Probe offset = C - (Key % C), where C is a constant smaller than the array size
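The formula can be plugged into an insertion routine as sketched below. The sketch assumes non-negative integer keys, `key % size` as the first hash function, `None` for free slots, and C = 5 with a table of 11 slots. The names `second_hash` and `dh_insert` are illustrative. Choosing a prime table size is an added assumption: it ensures that any step between 1 and size - 1 eventually reaches every slot.

```python
def second_hash(key, c=5):
    """Probe offset C - (key % C): always between 1 and C, never 0."""
    return c - (key % c)


def dh_insert(table, key, c=5):
    """Insert key with double hashing and return the slot it ends up in."""
    size = len(table)                     # assumed to be prime and larger than c
    start = key % size                    # first hash function (assumed)
    step = second_hash(key, c)
    for k in range(size):
        slot = (start + k * step) % size  # x, x + y, x + 2y, ...
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 11
slots = [dh_insert(table, key) for key in (5, 16, 27, 38)]
print(slots)   # [5, 9, 8, 7]
```

The keys 16, 27 and 38 all collide with 5 at slot 5, but their offsets are 4, 3 and 2, so each one finds a free slot on its second probe.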

2. Separate chaining

This method keeps a linked list at every position of the array. Items that hash to the same position are stored in that position's list.
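A minimal sketch of separate chaining with a hand-written singly linked list. It assumes non-negative integer keys and `key % size` as the hash function. The class and method names (`Node`, `ChainedHashTable`, `insert`, `contains`, `chain`) are made up for this example.

```python
class Node:
    """One element of a singly linked chain."""
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedHashTable:
    def __init__(self, size=7):
        self.buckets = [None] * size      # each slot is the head of a chain

    def _index(self, key):
        return key % len(self.buckets)    # assumed hash function

    def contains(self, key):
        node = self.buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert(self, key):
        if not self.contains(key):
            i = self._index(key)
            # Prepend to the chain; no other slot is ever probed.
            self.buckets[i] = Node(key, self.buckets[i])

    def chain(self, i):
        """Return the keys stored in slot i, head first."""
        keys, node = [], self.buckets[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


t = ChainedHashTable(7)
for key in (3, 10, 17):
    t.insert(key)
print(t.chain(3))        # [17, 10, 3]
print(t.contains(10))    # True
print(t.contains(24))    # False
```

All three keys hash to slot 3 and share one chain. A collision only lengthens that chain and never affects neighbouring slots.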

