Algorithm 6-4: Hash representation

Source: Internet
Author: User
Tags sha1

War Stories


Over the years there have been many war stories about attacks on hash functions.

The basic principle behind these attacks is that the attacker carefully constructs input whose keys all collide in the hash table, so that every operation consumes a large amount of CPU time.



Examples of software that has been attacked this way:

    • A vulnerable server: The attacker sends carefully constructed colliding input. Traffic at only 56K modem speed is enough to crash the server, achieving a denial-of-service (DoS) attack.

    • Perl 5.8.0: The attacker inserts carefully constructed colliding keys into an associative array.

    • Linux 2.4.20 kernel: The attacker saves files with carefully constructed names, producing a large number of collisions in the system and a sharp drop in performance.


Attack principle


It is very easy to construct hash collisions for String objects in Java.

[Figure: a sample of hash collisions in Java]
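As a small illustration of how such collisions can be built (a sketch, not the sample from the original page): the strings "Aa" and "BB" have the same hashCode() in Java. Because String.hashCode() is a polynomial in the characters, any concatenation of equal-length blocks with equal hashes collides as well. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: builds 2^blocks distinct strings that share one hashCode().
final class StringCollisions {

    // Returns every string made of `blocks` pieces, each piece being "Aa" or "BB".
    static List<String> collidingStrings(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112: 65*31 + 97 == 66*31 + 66.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
        // All eight strings below print the same hash value.
        for (String s : collidingStrings(3)) {
            System.out.println(s + " -> " + s.hashCode());
        }
    }
}
```

With n blocks this yields 2^n distinct keys that all land in the same bucket of any table that relies on hashCode() alone.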



How to Solve


Use a stronger, one-way hash function for which collisions are hard to construct, for example MD4, MD5, SHA-0, SHA-1, SHA-2, Whirlpool, or RIPEMD-160. However, weaknesses have since been found in MD4, MD5, SHA-0, and SHA-1; for the MD5 collision, see: http://www.links.org/?p=6


MD5 is not suitable for associative arrays, because the overhead is too high.
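To make the idea and its cost concrete, here is a minimal sketch that derives a bucket index from a SHA-256 digest using the standard java.security.MessageDigest API. The class name, the method name, and the choice to use the first four digest bytes are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: picks a bucket from a cryptographic digest instead of hashCode().
final class DigestHash {

    // Maps a key to a bucket index in [0, buckets) using the first 4 bytes of its SHA-256 digest.
    static int bucketOf(String key, int buckets) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] d = md.digest(key.getBytes(StandardCharsets.UTF_8));
            int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                  | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return Math.floorMod(h, buckets);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

Every lookup now pays for a full digest computation, which is the overhead mentioned above. Note also that only a few bits of the digest select the bucket, so this sketch makes collisions harder to derive by hand rather than impossible.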


Comparison of the two approaches


Two approaches to collision resolution have been presented so far: separate chaining (an independent linked list per bucket) and linear probing.


Separate chaining (independent linked lists):

    • Deleting elements is easy to implement

    • Performance degrades gracefully as more data is added

    • Less sensitive to a poorly designed hash function
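A minimal sketch of separate chaining, assuming String keys, int values, and a fixed number of buckets (no resizing). The class and its methods are hypothetical and only meant to show why deletion is easy.

```java
import java.util.LinkedList;

// Hypothetical minimal separate-chaining map from String keys to int values.
final class ChainedMap {
    private static final class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedMap(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    private LinkedList<Entry> bucketFor(String key) {
        return buckets[Math.floorMod(key.hashCode(), buckets.length)];
    }

    void put(String key, int value) {
        for (Entry e : bucketFor(key)) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        bucketFor(key).add(new Entry(key, value));
    }

    // Returns the stored value, or null if the key is absent.
    Integer get(String key) {
        for (Entry e : bucketFor(key)) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    // Deletion is just a list removal; no other entry has to move.
    boolean remove(String key) {
        return bucketFor(key).removeIf(e -> e.key.equals(key));
    }
}
```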


Linear probing:

    • Uses less memory, since there are no linked lists

    • Better cache performance
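For comparison, a minimal sketch of linear probing under the same assumptions (String keys, int values, fixed capacity). Deletion is deliberately left out, and the class is hypothetical. All entries live in two flat arrays, which is where the memory and cache advantages come from.

```java
// Hypothetical minimal linear-probing map; fixed capacity, no resizing or deletion.
final class LinearProbingMap {
    private final String[] keys;
    private final int[] values;
    private int size;

    LinearProbingMap(int capacity) {
        keys = new String[capacity];
        values = new int[capacity];
    }

    // Walks forward from the home slot until the key or an empty slot is found.
    // Terminates because put() always leaves at least one slot empty.
    private int slotFor(String key) {
        int i = Math.floorMod(key.hashCode(), keys.length);
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) % keys.length;
        }
        return i;
    }

    void put(String key, int value) {
        int i = slotFor(key);
        if (keys[i] == null) {
            if (size == keys.length - 1) throw new IllegalStateException("table full");
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    // Returns the stored value, or null if the key is absent.
    Integer get(String key) {
        int i = slotFor(key);
        return keys[i] == null ? null : values[i];
    }
}
```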


Variations on hashing


Many different variations on the basic hashing algorithms have been developed.


Two-probe hashing:

Two hash functions give two candidate positions for each key, and a new element is inserted into the shorter of the two chains.

This reduces the expected length of the longest chain.
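A minimal sketch of this variant of separate chaining, assuming String keys. The class is hypothetical and the second hash function is an arbitrary remix of hashCode() chosen for illustration. Since both hashes derive from hashCode(), keys with identical hashCode() values still share both chains; a real design would use two independent hash functions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal set that inserts each key into the shorter of two candidate chains.
final class TwoChoiceSet {
    private final List<String>[] chains;

    @SuppressWarnings("unchecked")
    TwoChoiceSet(int capacity) {
        chains = new List[capacity];
        for (int i = 0; i < capacity; i++) chains[i] = new ArrayList<>();
    }

    private int h1(String key) {
        return Math.floorMod(key.hashCode(), chains.length);
    }

    // Second hash: an assumed, arbitrary remix of hashCode().
    private int h2(String key) {
        int h = key.hashCode() * 0x9E3779B9;
        return Math.floorMod(h ^ (h >>> 16), chains.length);
    }

    // A key can only be in one of its two candidate chains.
    boolean contains(String key) {
        return chains[h1(key)].contains(key) || chains[h2(key)].contains(key);
    }

    void add(String key) {
        if (contains(key)) return;
        List<String> a = chains[h1(key)];
        List<String> b = chains[h2(key)];
        (a.size() <= b.size() ? a : b).add(key);
    }

    int longestChain() {
        int max = 0;
        for (List<String> c : chains) max = Math.max(max, c.size());
        return max;
    }
}
```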


Double hashing:

Linear probing is used, but after each collision the search skips a variable number of slots, determined by a second hash function, instead of always moving to the next slot.

This effectively eliminates clustering (long runs of occupied slots), so the hash table can become almost full, but deletion is very difficult to implement.
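A minimal sketch of double hashing, assuming String keys and a table capacity that is a prime number of at least 3, so that every step size eventually visits every slot. The class is hypothetical and the step function is an arbitrary remix of hashCode(); deletion is omitted, in line with the difficulty noted above.

```java
// Hypothetical minimal open-addressing set that resolves collisions by double hashing.
final class DoubleHashingSet {
    private final String[] slots;
    private int size;

    // Assumption: primeCapacity is a prime >= 3.
    DoubleHashingSet(int primeCapacity) {
        slots = new String[primeCapacity];
    }

    private int start(String key) {
        return Math.floorMod(key.hashCode(), slots.length);
    }

    // Second hash: a step in [1, capacity - 1], never 0. The mixing constant is arbitrary.
    private int step(String key) {
        int h = key.hashCode() * 0x9E3779B9;
        return 1 + Math.floorMod(h >>> 16, slots.length - 1);
    }

    // Probes start, start + step, start + 2*step, ... until the key or an empty slot is found.
    // Terminates because add() always leaves at least one slot empty.
    private int find(String key) {
        int i = start(key);
        int step = step(key);
        while (slots[i] != null && !slots[i].equals(key)) {
            i = (i + step) % slots.length;
        }
        return i;
    }

    // Returns true if the key was newly added, false if it was already present.
    boolean add(String key) {
        int i = find(key);
        if (slots[i] != null) return false;
        if (size == slots.length - 1) throw new IllegalStateException("table full");
        slots[i] = key;
        size++;
        return true;
    }

    boolean contains(String key) {
        return slots[find(key)] != null;
    }
}
```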


Cuckoo hashing:

Each key is hashed to two candidate positions. On insertion, if the first position is occupied, the new key takes it anyway and the displaced key is reinserted at its own alternative position, and so on until a vacant position is found. A lookup only has to examine the two candidate positions of a key, so its worst-case cost is constant.
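A minimal sketch of the insertion and lookup steps, assuming String keys, two fixed-size tables, and an arbitrary second hash function. The class is hypothetical. Full implementations usually rebuild the table with new hash functions when an insertion keeps displacing keys; as a simplification, this sketch parks the leftover key in a small overflow list instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal cuckoo-hashing set with two tables and an overflow list.
final class CuckooSet {
    private static final int MAX_KICKS = 32;  // assumed bound on displacement rounds
    private final String[] t1;
    private final String[] t2;
    private final List<String> overflow = new ArrayList<>();

    CuckooSet(int capacityPerTable) {
        t1 = new String[capacityPerTable];
        t2 = new String[capacityPerTable];
    }

    private int h1(String key) {
        return Math.floorMod(key.hashCode(), t1.length);
    }

    // Second hash: an assumed, arbitrary remix of hashCode().
    private int h2(String key) {
        int h = key.hashCode() * 0x9E3779B9;
        return Math.floorMod(h ^ (h >>> 15), t2.length);
    }

    // Lookup inspects two slots (plus the overflow list, which is normally empty).
    boolean contains(String key) {
        return key.equals(t1[h1(key)]) || key.equals(t2[h2(key)]) || overflow.contains(key);
    }

    void add(String key) {
        if (contains(key)) return;
        String cur = key;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            // Place cur in table 1, evicting whatever was there.
            int i = h1(cur);
            String evicted = t1[i];
            t1[i] = cur;
            if (evicted == null) return;
            // Move the evicted key to its alternative slot in table 2.
            int j = h2(evicted);
            cur = t2[j];
            t2[j] = evicted;
            if (cur == null) return;
        }
        // Simplification: keep the homeless key in the overflow list instead of rehashing.
        overflow.add(cur);
    }
}
```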


Comparison of hash tables and balanced search trees


Both hash tables and balanced trees can implement associative arrays.


Hash table:

    • Simpler to code

    • Usually gives the best performance in practice

    • Faster for simple keys

    • Better system support in Java, for example String caches its hash code


Balanced search tree:

    • Stronger performance guarantee

    • Supports ordered operations (the rank of an element, the nth-largest value, the number of elements between x and y, and so on)

    • compareTo() is easier to implement correctly than hashCode(), and less error-prone


The Java library includes implementations of both.

java.util.TreeMap and java.util.TreeSet are implemented with red-black trees; java.util.HashMap and java.util.IdentityHashMap are implemented with hash tables.
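A short sketch of the practical difference, using only standard java.util classes: both maps answer point lookups, but only TreeMap offers ordered operations such as rank and range counts. The helper class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper showing ordered operations that a hash table cannot answer directly.
final class OrderedOps {

    // Rank of key: the number of keys strictly smaller than it.
    static int rank(TreeMap<String, Integer> map, String key) {
        return map.headMap(key, false).size();
    }

    // Number of keys in the closed range [lo, hi].
    static int countBetween(TreeMap<String, Integer> map, String lo, String hi) {
        return map.subMap(lo, true, hi, true).size();
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> tree = new TreeMap<>();
        Map<String, Integer> hash = new HashMap<>();
        for (String k : new String[] {"d", "a", "c", "b", "e"}) {
            tree.put(k, k.length());
            hash.put(k, k.length());
        }
        System.out.println(hash.get("c"));                 // point lookup works on both
        System.out.println(tree.firstKey());               // smallest key: a
        System.out.println(rank(tree, "c"));               // 2 keys ("a", "b") are smaller
        System.out.println(countBetween(tree, "b", "d"));  // 3 keys: b, c, d
    }
}
```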

