Data Structures and Algorithms: Hash Table

Source: Internet
Author: User

Objective

A hash table is a data structure that stores key-value pairs: the value holds the data we actually need, and the key is used to look that value up. Ideally, a single hash computation is enough to locate the value. In practice, however, we usually do not want to spend a lot of extra space for a marginal gain in lookup speed (lowering the collision rate means enlarging the table), so we aim for a balance between space and time. That balance is tuned by adjusting the load factor of the hash table.

Load factor = number of records in the table / length of the hash table. The smaller the load factor, the more empty slots the table has and the less likely a collision is; the larger the load factor, the more likely collisions become and the longer a lookup takes. (The default load factor of HashMap in the JDK is 0.75.)
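The ratio above can be sketched in a few lines of Java. This is a minimal illustration, not the JDK's implementation: the class name LoadFactorSketch, the helper names, and the "grow before exceeding the limit" policy are assumptions made for the example. Only the 0.75 value comes from the text.

```java
// Hypothetical helper class; only the 0.75 default is taken from the text.
final class LoadFactorSketch {
    static final double MAX_LOAD_FACTOR = 0.75;

    // load factor = number of records / number of slots in the table
    static double loadFactor(int records, int capacity) {
        return (double) records / capacity;
    }

    // Assumed policy: grow when one more record would exceed the limit.
    static boolean shouldGrow(int records, int capacity) {
        return loadFactor(records + 1, capacity) > MAX_LOAD_FACTOR;
    }

    public static void main(String[] args) {
        System.out.println(loadFactor(12, 16));  // 0.75
        System.out.println(shouldGrow(11, 16));  // false: 12/16 does not exceed 0.75
        System.out.println(shouldGrow(12, 16));  // true: 13/16 = 0.8125
    }
}
```

A lower limit trades memory for fewer collisions; a higher limit does the opposite.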

Hash function

When a string is used as a key, we can treat it as a large integer and apply the division-remainder method (take the remainder after dividing by the table size). Rather than building that large integer first, we can fold in the characters of the string one at a time, for example:

public int getHashCode(String str) {
    char[] s = str.toCharArray();
    int hash = 0;
    // Fold in one character at a time, multiplying the running hash by 31.
    for (int i = 0; i < s.length; i++) {
        hash = s[i] + (31 * hash);
    }
    return hash;
}

The code above uses Horner's method to compute the hash of a string of length L, which evaluates the formula:

h = s[0]·31^(L-1) + ... + s[L-3]·31^2 + s[L-2]·31^1 + s[L-1]·31^0

For example, take the string "call". The Unicode value of 'c' is 99, of 'a' is 97, and of 'l' is 108, so the hash value of "call" is 3045982 = 99·31^3 + 97·31^2 + 108·31^1 + 108·31^0 = 108 + 31·(108 + 31·(97 + 31·99))
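The worked example can be checked with a short, self-contained Java sketch. The class and method names (HornerHashSketch, hornerHash) are hypothetical. The comparison with String.hashCode() relies on the fact that the JDK documents the same 31-based polynomial for strings.

```java
// Hypothetical names; a sketch of the 31-based Horner hash described above.
final class HornerHashSketch {
    static int hornerHash(String str) {
        int hash = 0;
        for (int i = 0; i < str.length(); i++) {
            // Multiply the running value by 31, then add the next character.
            hash = 31 * hash + str.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(hornerHash("call"));                       // 3045982
        System.out.println(hornerHash("call") == "call".hashCode());  // true
    }
}
```

For long strings the int value overflows and wraps around, which is harmless for hashing purposes.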

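Multiplication has traditionally been a relatively expensive operation for computers. 31 * hash is equivalent to (hash << 5) - hash, and a shift plus a subtraction is typically cheaper than a multiplication. The parentheses are required, because subtraction binds more tightly than the shift operator.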
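The identity can be spot-checked with a small Java sketch. The class name ShiftMultiplySketch and the sample values are arbitrary choices for illustration. The last line shows what happens when the parentheses are left out.

```java
// Hypothetical names; checks that (h << 5) - h equals 31 * h for int values.
final class ShiftMultiplySketch {
    static int times31(int hash) {
        // h * 32 - h == h * 31, including under int overflow (wraparound).
        return (hash << 5) - hash;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -7, 3045982, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int h : samples) {
            System.out.println(times31(h) == 31 * h);  // true for every sample
        }
        // Without parentheses the expression parses as 1 << (5 - 1).
        System.out.println(1 << 5 - 1);  // 16, not 31
    }
}
```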

Resolving hash collisions

There are many ways to resolve collisions, but separate chaining (the "zipper method", which chains colliding entries in linked lists) is the one commonly used in engineering.

For example, suppose the keys John Smith and Sandra Dee are both mapped to slot 152 after hashing, which creates a collision. The collision is resolved by chaining the colliding key-value entries together. There are many ways to chain them: the simplest is a singly linked list, and more sophisticated designs use a balanced binary tree. Any data structure that supports efficient search will do.
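Chaining with a singly linked list can be sketched as follows. This is a minimal illustration under stated assumptions, not a production design: the class ChainedHashTableSketch and its methods are hypothetical, the table never resizes, keys and values are plain strings, and the stored values in the usage example are placeholders.

```java
// Hypothetical sketch of separate chaining; fixed capacity, String keys only.
final class ChainedHashTableSketch {
    // One entry in a bucket's singly linked list.
    private static final class Node {
        final String key;
        String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets;
    private int size;

    ChainedHashTableSketch(int capacity) {
        buckets = new Node[capacity];
    }

    // Clear the sign bit, then take the remainder to get a slot index.
    private int indexFor(String key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    void put(String key, String value) {
        int i = indexFor(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {  // key already present: overwrite
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]);  // prepend to the chain
        size++;
    }

    String get(String key) {
        for (Node n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;  // key not found
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        // Capacity 1 forces both keys into the same slot, i.e. a collision.
        ChainedHashTableSketch table = new ChainedHashTableSketch(1);
        table.put("John Smith", "entry A");
        table.put("Sandra Dee", "entry B");
        System.out.println(table.get("John Smith"));  // entry A
        System.out.println(table.get("Sandra Dee"));  // entry B
    }
}
```

Both entries stay retrievable even though they share a slot, because a lookup walks the chain and compares keys with equals.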
