1 How do conflicts arise?
As mentioned above, a hash function is a rule that maps a keyword to an address. The range of possible keywords is very wide and can be regarded as an infinite set, while the table has only a limited number of addresses. How can the rule guarantee that no two keywords from this unlimited set end up at the same address? The rule by itself cannot achieve this. As an example, take the class analogy again, with the following students:
Zhang San, Li Si, Wang Wu, Zhao Gang, Wu Lu ...
If the addressing rule is the position in the alphabet of the first letter of the surname, the following hash table is generated:
Position | Letter | Name
0 | A |
1 | B |
2 | C |
... | ... | ...
11 | L | Li Si
... | ... | ...
22 | W | Wang Wu, Wu Lu
... | ... | ...
25 | Z | Zhang San, Zhao Gang
Look at the rows for W and Z: the keywords Wang Wu and Wu Lu are assigned to the same position, and the keywords Zhang San and Zhao Gang are also assigned to the same position. When the teacher goes to seat 25 to find Zhang San, there are two people in that seat: "Which of you two is Zhang San?"
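The collision described above can be reproduced with a minimal sketch. The function name `first_letter_hash` and the surname-first spelling of the student names are assumptions made for illustration; the addressing rule itself (alphabet position of the first letter, with A = 0) follows the table.

```python
from collections import defaultdict

def first_letter_hash(name: str) -> int:
    """Addressing rule from the text: alphabet position of the first letter (A = 0 ... Z = 25)."""
    return ord(name[0].upper()) - ord("A")

# Illustrative names, written surname first (an assumption of this sketch).
students = ["Zhang San", "Li Si", "Wang Wu", "Zhao Gang", "Wu Lu"]

# Group students by the position the rule assigns to them.
slots = defaultdict(list)
for student in students:
    slots[first_letter_hash(student)].append(student)

# Any position holding more than one student is a conflict.
collisions = {pos: names for pos, names in slots.items() if len(names) > 1}
for pos, names in sorted(collisions.items()):
    print(pos, names)
```

Two different keywords that share a first letter always receive the same address, so no amount of care in choosing the names removes the conflict; only an extra rule can.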
2 How to resolve conflict issues
Since conflicts cannot be avoided, additional steps are clearly needed to resolve them. These steps add further rules for managing the set of keywords, and usually take one of the following forms:
a) Open addressing method
Open addressing uses the formula: Hi = (H(key) + di) MOD m, i = 1, 2, ..., k (k <= m - 1)
where H(key) is the hash function, m is the length of the hash table, and di is the increment sequence used when a conflict arises.
If di takes the values 1, 2, 3, ..., m - 1, the method is called linear probing and rehashing: after each conflict the probe moves one position further along the table.
If di takes the values 1, -1, 4, -4, 9, -9, 16, -16, ..., k*k, -k*k (k <= m/2), it is called quadratic probing and rehashing.
If di is a pseudo-random sequence, it is called pseudo-random probing and rehashing.
Still using the automatic seating of students as the example: two students, Li Si and Wu Yong, have already been seated, and a new classmate named Wang Wu now has to be placed. Wang Wu hashes to position 22, which Wu Yong already occupies.
Position | 10.. | .... | 22 | .. | .. | 25
Name | Li Si.. | .... | Wu Yong | .. | .. |
Before Wang Wu arrives
Position | 10.. | .. | 22 | 23 | 25
Name | Li Si.. | | Wu Yong | Wang Wu |
(a) Linear probing and rehashing for Wang Wu's address, with di = 1
Position | 10.. | 20 | 22 | .. | 25
Name | Li Si.. | Wang Wu | Wu Yong | |
(b) Quadratic probing and rehashing. The table shows an offset of di = -2; with the strict sequence 1, -1, 4, -4, ... the first probe (di = 1) would already find position 23 free.
Position | 1.. | 10.. | 22 | .. | 25
Name | Wang Wu.. | Li Si.. | Wu Yong | |
(c) Pseudo-random probing and rehashing with the pseudo-random sequence 5, 3, 2, so di = 5 and (22 + 5) MOD 26 = 1
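The three probing variants of open addressing can be sketched as follows. This is only an illustration under stated assumptions: the table length `M = 26`, the helper names (`home`, `insert`, `fresh_table`, and the offset generators) and the surname-first student names are not from the original text. The quadratic variant uses the offsets 1, -1, 4, -4, ..., so in this particular setup its first probe lands on the same slot as the linear one.

```python
from typing import Iterable, Iterator, List, Optional

M = 26  # table length m: one slot per letter (an assumption of this sketch)

def home(name: str, m: int = M) -> int:
    """H(key): alphabet position of the first letter of the name."""
    return (ord(name[0].upper()) - ord("A")) % m

def linear_offsets(m: int) -> Iterator[int]:
    """d_i = 1, 2, ..., m - 1."""
    yield from range(1, m)

def quadratic_offsets(m: int) -> Iterator[int]:
    """d_i = 1, -1, 4, -4, ..., k*k, -k*k with k <= m / 2."""
    for k in range(1, m // 2 + 1):
        yield k * k
        yield -(k * k)

def insert(table: List[Optional[str]], name: str, offsets: Iterable[int]) -> int:
    """Place name at H(key), or at the first free (H(key) + d_i) MOD m."""
    m = len(table)
    h = home(name, m)
    for d in [0, *offsets]:
        pos = (h + d) % m  # Python's % keeps the result in 0..m-1 for negative d
        if table[pos] is None:
            table[pos] = name
            return pos
    raise RuntimeError("no free slot found along the probe sequence")

def fresh_table() -> List[Optional[str]]:
    """Table with two students already seated (illustrative names, surname first)."""
    table: List[Optional[str]] = [None] * M
    insert(table, "Li Si", [])    # L -> 11
    insert(table, "Wu Yong", [])  # W -> 22
    return table

# The new student also hashes to 22 and must be moved by the probe sequence.
linear_pos = insert(fresh_table(), "Wang Wu", linear_offsets(M))        # 22 + 1
quadratic_pos = insert(fresh_table(), "Wang Wu", quadratic_offsets(M))  # 22 + 1
random_pos = insert(fresh_table(), "Wang Wu", [5, 3, 2])                # (22 + 5) MOD 26
print(linear_pos, quadratic_pos, random_pos)
```

A lookup has to follow the same offset sequence that was used for insertion, which is why a pseudo-random sequence must be reproducible (for example, generated from a fixed seed).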
b) Rehashing method
When a conflict occurs, a second, a third, and further hash functions are used in turn to compute the address until there is no conflict. Disadvantage: the computation time increases.
For example, the first hash uses the first letter of the surname. If that conflicts, a second hash can use the second letter of the surname; if that conflicts again, the third letter, and so on until there is no conflict.
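A minimal sketch of this rehashing scheme, assuming that the i-th hash function is "alphabet position of the i-th letter of the surname". The names `letter_hash` and `rehash_insert`, the table length of 26, and the student names (including the hypothetical "Wan Er") are assumptions made for illustration.

```python
from typing import List, Optional

M = 26  # one slot per letter (an assumption of this sketch)

def letter_hash(name: str, i: int) -> int:
    """i-th hash function: alphabet position of the i-th letter of the surname."""
    surname = name.split()[0]
    return ord(surname[i].upper()) - ord("A")

def rehash_insert(table: List[Optional[str]], name: str) -> int:
    """Try the hash functions in order until one yields a free slot."""
    surname = name.split()[0]
    for i in range(len(surname)):
        pos = letter_hash(name, i)
        if table[pos] is None:
            table[pos] = name
            return pos
    raise RuntimeError("every hash function produced a conflict")

table: List[Optional[str]] = [None] * M
first = rehash_insert(table, "Wu Yong")   # W is free
second = rehash_insert(table, "Wang Wu")  # W is taken, so fall back to A
third = rehash_insert(table, "Wan Er")    # hypothetical name: W and A taken, so N
print(first, second, third)
```

Each extra conflict costs one more hash computation, which is the time overhead mentioned above; the scheme also fails once a surname runs out of letters, so a real implementation would need a different family of hash functions.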
c) Chain address method
All records whose keywords are synonyms, that is, keywords that hash to the same address, are stored in the same linear linked list.
This method can therefore be roughly thought of as one container nested inside another.
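The chain address method can be sketched as a table of buckets, each holding all synonyms for one address. This is an illustration only: the class name `ChainedTable`, the helper `home`, and the student names are assumptions, and a Python list stands in for the linked list described in the text.

```python
from typing import List

M = 26  # one bucket per letter (an assumption of this sketch)

def home(name: str, m: int = M) -> int:
    """H(key): alphabet position of the first letter of the name."""
    return (ord(name[0].upper()) - ord("A")) % m

class ChainedTable:
    """Hash table in which every address holds a chain of synonyms."""

    def __init__(self, size: int = M) -> None:
        self.buckets: List[List[str]] = [[] for _ in range(size)]

    def insert(self, name: str) -> None:
        bucket = self.buckets[home(name, len(self.buckets))]
        if name not in bucket:  # avoid storing the same record twice
            bucket.append(name)

    def find(self, name: str) -> bool:
        # Only the chain at H(key) has to be searched.
        return name in self.buckets[home(name, len(self.buckets))]

chained = ChainedTable()
for student in ["Wang Wu", "Wu Lu", "Zhang San"]:
    chained.insert(student)
print(chained.buckets[22], chained.find("Wu Lu"), chained.find("Wu Yong"))
```

Conflicting records never displace one another here; the cost is that a lookup must walk the chain at the hashed address.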
d) Public overflow area method
Assuming that the range of the hash function is [0, m-1], set up the vector HashTable[0..m-1] as the base table, and set up a separate vector OverTable[0..v] as the overflow table that stores the conflicting records.
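A minimal sketch of a base table with a shared overflow area. The class name `OverflowTable`, the helper `home`, and the student names are assumptions for illustration; the overflow area is an unbounded Python list here rather than a fixed-size vector.

```python
from typing import List, Optional, Tuple

def home(name: str, m: int) -> int:
    """H(key): alphabet position of the first letter of the name, in [0, m-1]."""
    return (ord(name[0].upper()) - ord("A")) % m

class OverflowTable:
    """Base table plus one overflow area shared by all conflicting records."""

    def __init__(self, m: int = 26) -> None:
        self.base: List[Optional[str]] = [None] * m  # plays the role of the base table
        self.overflow: List[str] = []                # plays the role of the overflow table

    def insert(self, name: str) -> Tuple[str, int]:
        pos = home(name, len(self.base))
        if self.base[pos] is None:
            self.base[pos] = name
            return ("base", pos)
        # Conflict: the record goes to the shared overflow area.
        self.overflow.append(name)
        return ("overflow", len(self.overflow) - 1)

    def find(self, name: str) -> bool:
        pos = home(name, len(self.base))
        if self.base[pos] == name:
            return True
        # Not in the base slot, so scan the overflow area sequentially.
        return name in self.overflow

area = OverflowTable()
placements = [area.insert(s) for s in ["Wu Yong", "Wang Wu", "Zhang San", "Zhao Gang"]]
print(placements)
```

Lookups that miss the base table fall back to a sequential scan of the overflow area, so this approach suits tables where conflicts are rare.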
With the methods above, the conflict problem of hashing can basically be solved.
Note: The reason for this brief introduction to hashing is to better learn the LZW algorithm, and the reason for learning LZW is to better study the GIF file structure. Finally, I will explain in detail how a GIF file is composed and how to manipulate this type of file efficiently.