How does hash handle conflicts?

Source: Internet
Author: User

Part 1 of this series, "Why is hash search fast? What is the price we need to pay for using it?", was only a brief introduction to the advantages and disadvantages of hashing. This article focuses on how to resolve hash addressing conflicts (collisions).
1) How is a conflict generated?
As mentioned earlier, a hash function is a rule for mapping a keyword to an address. The range of possible keywords is very wide and can be treated as an infinite set, while a table has only a finite number of addresses. How can we ensure that two different keywords from an infinite set are never mapped to the same address? No rule can guarantee this by itself. Consider the class roster again:
Zhang San, Li Si, Wang Wu, Zhao Gang, Wu Lu .....
If our addressing rule is to take the alphabet position of the first letter of the surname as the address, the following hash table is generated:

Location  Letter  Name
0         A
1         B
2         C
...
11        L       Li Si
...
22        W       Wang Wu, Wu Lu
...
25        Z       Zhang San, Zhao Gang

Notice that in the rows for W and Z (marked with a gray background in the original table), the keywords Wang Wu and Wu Lu are mapped to the same location, and so are Zhang San and Zhao Gang. It is as if the teacher called out a seat number and found two students sitting in that seat: "Which of you two is the one I called?"
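
As a minimal sketch of the rule described above, the following Python groups the five names by the alphabet position of the surname's first letter and prints only the locations that received more than one name. The function name first_letter_hash is hypothetical and chosen for illustration; it does not come from the original article.

```python
from collections import defaultdict

def first_letter_hash(name: str) -> int:
    """Map a name to 0..25 using the first letter of the surname."""
    return ord(name[0].upper()) - ord("A")

names = ["Zhang San", "Li Si", "Wang Wu", "Zhao Gang", "Wu Lu"]

# Group every name under the location its hash produces.
slots = defaultdict(list)
for name in names:
    slots[first_letter_hash(name)].append(name)

# A conflict is any location that holds more than one name.
collisions = {slot: group for slot, group in slots.items() if len(group) > 1}
for slot, group in sorted(collisions.items()):
    print(slot, chr(ord("A") + slot), ", ".join(group))
# 22 W Wang Wu, Wu Lu
# 25 Z Zhang San, Zhao Gang
```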
2) How to resolve the conflict
Since conflicts cannot be avoided, extra steps for resolving them must be added. These steps amount to additional rules for managing the keyword set. The common methods are as follows:
A) Open addressing method
Open addressing uses the formula H_i = (H(key) + d_i) mod m, for i = 1, 2, ..., k (k <= m-1),
where H(key) is the hash function, m is the length of the hash table, and d_i is the increment sequence used when a conflict occurs. If d_i takes the values 1, 2, 3, ..., m-1, the method is called linear probing.
With d_i = 1, for instance, a conflicting record moves one position further along the table. If d_i takes the values 1, -1, 4, -4, 9, -9, 16, -16, ..., k * k, -k * k (k <= m/2),
the method is called quadratic probing. If d_i is a pseudo-random sequence, the method is called pseudo-random probing. The class roster is again used as an example.
Suppose two students, Li Si and Wu Yong, have already been placed in the table, and a new student named Wang Wu now needs a location. Wang Wu hashes to location 22, which Wu Yong already occupies.

Location  11     ...  22       ...  25
Name      Li Si  ...  Wu Yong  ...

Before Wang Wu arrives

Location  11     ...  22       23       ...  25
Name      Li Si  ...  Wu Yong  Wang Wu  ...

(A) Linear probing places Wang Wu with d_i = 1

Location  11     ...  20       ...  22       ...  25
Name      Li Si  ...  Wang Wu  ...  Wu Yong  ...

(B) Quadratic probing; the original figure places Wang Wu with d_i = -2, although -2 is not a term of the sequence 1, -1, 4, -4, ..., which would try +1 first

Location  1        ...  11     ...  22       ...  25
Name      Wang Wu  ...  Li Si  ...  Wu Yong  ...

(C) Pseudo-random probing with the pseudo-random sequence 5, 3, 2: the first offset gives (22 + 5) mod 26 = 1
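
The three probing variants can be sketched in Python as follows. This is only an illustration under stated assumptions: the table length is taken to be 26 (one location per letter), the helper names home, linear_offsets, quadratic_offsets, insert and roster are hypothetical, and the pseudo-random sequence 5, 3, 2 is the one given in the example rather than the output of a real generator.

```python
M = 26  # assumed table length m: one location per letter

def home(name):
    """H(key): alphabet position of the surname's first letter (0..25)."""
    return ord(name[0].upper()) - ord("A")

def linear_offsets(m):
    """d_i = 1, 2, ..., m-1"""
    return list(range(1, m))

def quadratic_offsets(m):
    """d_i = 1, -1, 4, -4, ..., k*k, -k*k for k <= m/2"""
    offsets = []
    for k in range(1, m // 2 + 1):
        offsets.extend([k * k, -k * k])
    return offsets

def insert(table, name, offsets):
    """Store name at H(key), or at the first free (H(key) + d_i) mod m."""
    m = len(table)
    h = home(name)
    for d in [0] + list(offsets):
        slot = (h + d) % m
        if table[slot] is None:
            table[slot] = name
            return slot
    raise RuntimeError("no free location along this probe sequence")

def roster():
    """Fresh table with Li Si and Wu Yong already placed."""
    table = [None] * M
    for name in ["Li Si", "Wu Yong"]:
        insert(table, name, linear_offsets(M))
    return table

print(insert(roster(), "Wang Wu", linear_offsets(M)))     # 23
print(insert(roster(), "Wang Wu", quadratic_offsets(M)))  # 23, since +1 is tried first
print(insert(roster(), "Wang Wu", [5, 3, 2]))             # 1, because (22 + 5) mod 26 = 1
```

A lookup has to follow the same offset sequence that was used for insertion, which is why the sequence must be reproducible even in the pseudo-random case.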

B) Rehashing
When a conflict occurs, use a second hash function, then a third, and so on, to calculate a new address until there is no conflict. Disadvantage: the computing time increases.
For example, if the first hash uses the first letter of the surname and a conflict occurs, the second letter of the surname can be used as the hash; if that also conflicts, use the third letter, and so on until no conflict remains.
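
A minimal Python sketch of this idea, assuming the i-th hash function simply takes the i-th letter of the surname. The names letter_hash and insert_rehash are hypothetical, and a dict stands in for the table so that only occupied locations are stored.

```python
def letter_hash(name: str, i: int) -> int:
    """i-th hash function: the i-th letter of the surname, mapped to 0..25."""
    surname = name.split()[0].upper()
    return ord(surname[i]) - ord("A")

def insert_rehash(table: dict, name: str) -> int:
    """Try the first letter, then the second, and so on until a free location is found."""
    surname = name.split()[0]
    for i in range(len(surname)):
        slot = letter_hash(name, i)
        if slot not in table:
            table[slot] = name
            return slot
    raise RuntimeError(f"every hash function collided for {name!r}")

table = {}
for name in ["Wang Wu", "Wu Lu", "Zhang San", "Zhao Gang"]:
    print(name, "->", insert_rehash(table, name))
# Wang Wu -> 22
# Wu Lu -> 20      (W is taken, so U is used)
# Zhang San -> 25
# Zhao Gang -> 7   (Z is taken, so H is used)
```

Each extra hash function costs another computation and another table access, which is the increase in computing time mentioned above.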
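C) Chaining (linked address) method
All records whose keywords are synonyms, that is, hash to the same address, are stored in the same singly linked list.
This method can therefore be thought of as a container within a container: each location in the table holds a list of records.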
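
The following Python is one possible sketch of chaining, again using the first-letter rule as the hash function. The class names Node and ChainedTable are hypothetical, and new records are prepended to the list purely because that is the cheapest operation on a singly linked list.

```python
class Node:
    """One record in a singly linked list."""
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node

class ChainedTable:
    def __init__(self, m=26):
        self.heads = [None] * m  # one list head per location

    def _slot(self, key):
        return (ord(key[0].upper()) - ord("A")) % len(self.heads)

    def insert(self, key):
        slot = self._slot(key)
        # Prepend: the new node becomes the head of the list at this location.
        self.heads[slot] = Node(key, self.heads[slot])

    def chain(self, slot):
        """Return the keys stored at a location, from head to tail."""
        keys = []
        node = self.heads[slot]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys

    def contains(self, key):
        return key in self.chain(self._slot(key))

table = ChainedTable()
for name in ["Zhang San", "Li Si", "Wang Wu", "Zhao Gang", "Wu Lu"]:
    table.insert(name)

print(table.chain(22))  # ['Wu Lu', 'Wang Wu']
print(table.chain(25))  # ['Zhao Gang', 'Zhang San']
```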
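D) Public overflow area
Assume the range of the hash function is [0, m-1]. Set up a vector hashtable[0 .. m-1] as the base table, with each location holding one record, and a separate vector overtable[0 .. v] as the overflow table. Every record that conflicts with one already in the base table is stored in the overflow table.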
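
A rough Python sketch of this layout is shown below. The attribute names hashtable and overtable follow the text; the class name OverflowTable is hypothetical, and a growable Python list is assumed for the overflow area instead of a fixed-length vector [0 .. v].

```python
class OverflowTable:
    def __init__(self, m=26):
        self.hashtable = [None] * m  # base table, one record per location
        self.overtable = []          # shared overflow area for all conflicts

    def _slot(self, key):
        return (ord(key[0].upper()) - ord("A")) % len(self.hashtable)

    def insert(self, key):
        slot = self._slot(key)
        if self.hashtable[slot] is None:
            self.hashtable[slot] = key
        else:
            # The home location is taken, so the record goes to the overflow area.
            self.overtable.append(key)

    def contains(self, key):
        if self.hashtable[self._slot(key)] == key:
            return True
        # Fall back to a linear scan of the overflow area.
        return key in self.overtable

table = OverflowTable()
for name in ["Zhang San", "Li Si", "Wang Wu", "Zhao Gang", "Wu Lu"]:
    table.insert(name)

print(table.hashtable[22], table.hashtable[25])  # Wang Wu Zhang San
print(table.overtable)                           # ['Zhao Gang', 'Wu Lu']
```

Lookups stay fast while the overflow area is small, but every conflicting record adds to the linear scan.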
With the methods above, the hash conflict problem can basically be solved.
Note: Hashing is introduced here in order to better learn the LZW algorithm, and the LZW algorithm is studied in order to better understand the GIF file structure. Finally, I will explain in detail how GIF files are structured and how to work with this type of file efficiently.
