Data structures and algorithms for HashMap

Source: Internet
Author: User

The data structure behind HashMap is a hash table.

HashMap

1) Array: addresses are contiguous and lookup is fast, but it takes up a lot of memory.

2) Linked list: addresses are not contiguous, which saves space; lookup is slower than in an array, but insertion and deletion are faster.

A hash table combines the advantages of both data structures.

The array supplies the addresses: the hash function maps each key to an address, and that address is a position in the array.

The linked list resolves conflicts: the hash function may map different keys to the same address, so all entries that share an address are chained in a list, with the most recently inserted entry placed at the head.

Hash functions:

(1) Direct addressing method

The key itself, or the value of a linear function of the key, is taken as the hash address, that is:

H(key) = key or H(key) = a * key + b

where a and b are constants.
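As a minimal sketch of direct addressing, assume the keys are ages between 1 and 100 and take a = 1, b = -1, so every key gets its own slot. The names direct_hash, MIN_AGE and TABLE_SIZE are illustrative and do not come from the text above.

```c
#include <assert.h>

#define MIN_AGE 1       /* assumed smallest key */
#define TABLE_SIZE 100  /* assumed: one slot per possible key */

/* Direct addressing: H(key) = a * key + b with a = 1 and b = -MIN_AGE. */
static int direct_hash(int key)
{
    assert(key >= MIN_AGE && key < MIN_AGE + TABLE_SIZE);
    return key - MIN_AGE;
}
```

With these assumptions direct_hash(25) is 24. No two keys can collide, but the table must be as large as the key range.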

(2) Digit analysis method

(3) Mid-square method

The middle digits of the squared key are taken as the hash address.
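One possible way to take the middle of the square, sketched under the assumption that keys are 4-digit decimal numbers and the table has 100 slots. The function name mid_square_hash and the choice of which two digits count as "the middle" are assumptions made for illustration.

```c
#include <assert.h>

/* Square the key and keep two digits from the middle of the result
   (the thousands and ten-thousands places). Assumes a 4-digit key,
   so the square has 7 or 8 decimal digits. */
static unsigned mid_square_hash(unsigned key)
{
    assert(key >= 1000 && key <= 9999);
    unsigned long sq = (unsigned long)key * key; /* at most 99,980,001 */
    return (unsigned)((sq / 1000) % 100);
}
```

For example, 1234 squared is 1522756; dropping the last three digits leaves 1522, and the last two digits of that give the address 22.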

(4) Folding method

Divide the key into parts with the same number of digits (the last part may be shorter), then take the sum of those parts, with the carry discarded, as the hash address.
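The folding step can be sketched as follows, assuming decimal keys cut into groups from the right (so the shorter part, if any, is the leading one). The name fold_hash and the group parameter are hypothetical; group = 1000 means 3-digit parts.

```c
#include <assert.h>

/* Folding: cut the key into equal-sized digit groups, add the groups,
   and discard any carry beyond the group width.
   group is a power of ten, e.g. 1000 for 3-digit groups. */
static unsigned fold_hash(unsigned long long key, unsigned group)
{
    assert(group > 1);
    unsigned long long sum = 0;
    while (key > 0) {
        sum += key % group; /* lowest remaining group */
        key /= group;
    }
    return (unsigned)(sum % group); /* drop the carry */
}
```

For the key 123456789 with 3-digit groups, the parts are 123, 456 and 789; their sum is 1368, and discarding the carry gives the address 368.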

(5) Division-remainder method

The remainder of the key divided by a number p that is not larger than the hash table length m is taken as the hash address, i.e.:

H(key) = key MOD p, p ≤ m
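A small sketch of H(key) = key MOD p in C. One detail is specific to C rather than to the method: the % operator can return a negative value for a negative key, so the result is shifted back into the range [0, p-1]. The name mod_hash is illustrative.

```c
#include <assert.h>

/* H(key) = key MOD p, with the result forced into [0, p-1]
   even when key is negative. */
static int mod_hash(int key, int p)
{
    assert(p > 0);
    int r = key % p;
    return r < 0 ? r + p : r;
}
```

With p = 13 (an assumed value for a table of length m = 16), mod_hash(27, 13) is 1 and mod_hash(-1, 13) is 12. A prime p is a common choice because it tends to spread regularly spaced keys more evenly.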

(6) Random number method

Choose a random function and take its value for the key as the hash address, i.e.:

H(key) = random(key)

where random is a pseudo-random function.
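The function random must return the same value every time it is given the same key, otherwise a stored key could never be found again. One way to get that, sketched here as an assumption rather than as the method the text has in mind, is to use the key as the seed of a linear congruential generator and take a single step. The constants are those of the sample rand implementation in the C standard; random_hash and m are illustrative names.

```c
#include <assert.h>

/* H(key) = random(key): one step of a linear congruential generator
   seeded with the key, reduced to a table of length m. */
static unsigned random_hash(unsigned key, unsigned m)
{
    assert(m > 0);
    unsigned long long state = key;
    state = state * 1103515245ULL + 12345ULL;
    return (unsigned)((state / 65536) % 32768) % m;
}
```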

Handling conflicts

Different keys may yield the same hash address, that is, key1 ≠ key2 but H(key1) = H(key2). This is called a conflict (collision). Keys that have the same function value are called synonyms under that hash function.

In general, a hash function is a compressing mapping from a large key space to a small address space, so conflicts are unavoidable. When you build a hash table, you therefore need not only a good hash function but also a method for handling conflicts.

Common methods of dealing with conflicts are:

(1) Open addressing method

Hi = (H(key) + di) MOD m, i = 1, 2, ..., k (k ≤ m - 1)

where H(key) is the hash function, m is the hash table length, and di is an increment sequence. Three choices of di are common:

1) di = 1, 2, 3, ..., m - 1, called linear probing;

2) di = 1², -1², 2², -2², 3², ..., ±k² (k ≤ m/2), called quadratic probing;

3) di = a pseudo-random number sequence, called pseudo-random probing.
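The open addressing formula with the linear increment sequence can be sketched as below. The table length TABLE_LEN, the EMPTY marker, and the function names are assumptions for illustration; keys are assumed to be non-negative so that EMPTY can never be a real key.

```c
#include <assert.h>

#define TABLE_LEN 8 /* assumed table length m */
#define EMPTY (-1)  /* assumed marker for a free slot */

/* Insert key with linear probing: try H(key), then H(key) + 1, ...
   Returns the slot used, or -1 if the table is full. */
static int probe_insert(int table[TABLE_LEN], int key)
{
    assert(key >= 0);
    int h = key % TABLE_LEN;
    for (int d = 0; d < TABLE_LEN; d++) { /* d = 0 is the home slot */
        int i = (h + d) % TABLE_LEN;
        if (table[i] == EMPTY || table[i] == key) {
            table[i] = key;
            return i;
        }
    }
    return -1;
}

/* Follow the same probe sequence; returns the slot of key or -1. */
static int probe_find(const int table[TABLE_LEN], int key)
{
    assert(key >= 0);
    int h = key % TABLE_LEN;
    for (int d = 0; d < TABLE_LEN; d++) {
        int i = (h + d) % TABLE_LEN;
        if (table[i] == EMPTY)
            return -1; /* an empty slot ends the search */
        if (table[i] == key)
            return i;
    }
    return -1;
}
```

Starting from a table filled with EMPTY, inserting 3, 11 and 19 (all with home slot 3) places them in slots 3, 4 and 5. Quadratic or pseudo-random probing would change only how the offset d is turned into the next slot.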

(2) Rehashing method

Hi = RHi(key), i = 1, 2, ..., k

where the RHi are different hash functions. When one function produces a conflict, the next one is tried.
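A minimal sketch of trying several hash functions in turn. The three functions rh1, rh2 and rh3 are arbitrary examples chosen for illustration, not functions given by the text; keys are assumed to be small non-negative integers and FREE marks an unused slot.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 11  /* assumed table length */
#define FREE (-1) /* assumed marker for a free slot */

/* Three example hash functions, tried in this order. */
static int rh1(int key) { return key % SLOTS; }
static int rh2(int key) { return (key / SLOTS) % SLOTS; }
static int rh3(int key) { return (key * 7 + 3) % SLOTS; }

static int (*const RH[])(int) = { rh1, rh2, rh3 };

/* Returns the slot used, or -1 if every function led to a conflict. */
static int rehash_insert(int table[SLOTS], int key)
{
    assert(key >= 0);
    for (size_t i = 0; i < sizeof RH / sizeof RH[0]; i++) {
        int slot = RH[i](key);
        if (table[slot] == FREE || table[slot] == key) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}
```

In an empty table, key 5 goes to slot 5. Key 16 also hashes to 5 under rh1, so rh2 is tried and it lands in slot 1. With only a fixed number of functions an insert can still fail, which is why the function reports -1.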

(3) Chain address method

All data elements whose keys are synonyms are stored in the same linear linked list. Assuming the hash function produces addresses in the interval [0, m-1], a vector of pointers void *vec[m] is created, with every component initialized to a null pointer. All data elements whose hash address is i are inserted into the linked list whose head pointer is vec[i]. An element can be inserted at the head or the tail of the list, or in the middle so that synonyms in the same list stay sorted by key.
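Inserting in the middle so that a chain stays sorted by key can be sketched as follows. The node type chain_node and the function name are assumptions made for this example; the caller passes the address of the head pointer for the element's hash address, for example &vec[i].

```c
#include <assert.h>
#include <stddef.h>

struct chain_node {
    int key;
    struct chain_node *next;
};

/* Insert node into the chain starting at *head, keeping the keys
   in ascending order. */
static void chain_insert_sorted(struct chain_node **head,
                                struct chain_node *node)
{
    assert(head != NULL && node != NULL);
    struct chain_node **link = head;
    /* Advance to the first link whose node has a key >= node->key. */
    while (*link != NULL && (*link)->key < node->key)
        link = &(*link)->next;
    node->next = *link;
    *link = node;
}
```

Walking a pointer to the link, rather than a pointer to the node, lets the same code handle insertion at the head, in the middle, and at the tail.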

(4) Establishing a public overflow area

In general, the division-remainder method is used as the hash function and the chain address method is used to handle conflicts:

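    #include <stddef.h>  /* NULL */

    /* Table length; the value is an assumption, the original does not define LEN. */
    #define LEN 16

    struct Hash_node {
        int count;
        struct Hash_node *next;
    };

    /* Division-remainder hash; assumes num >= 0 so the result is a valid index. */
    static int hash(int num)
    {
        return num % LEN;
    }

    /* Chain address method: put the new node at the head of list vec[elem]. */
    static void collision(struct Hash_node *vec[], int elem, struct Hash_node *new)
    {
        if (vec[elem] == NULL) {
            new->next = NULL;      /* first node in this list */
            vec[elem] = new;
        } else {
            new->next = vec[elem]; /* link in front of the current head */
            vec[elem] = new;
        }
    }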

Here hash is the hash function, and the collision function handles conflicts by inserting the new node at the head of the list.
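To show how the two functions fit together, here is a hedged sketch that stores and looks up numbers. The helpers table_put and table_has are hypothetical additions, LEN = 7 is an assumed table length, and using the count field to hold the stored number is also an assumption, since the text does not say what count is for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LEN 7 /* assumed table length */

struct Hash_node {
    int count; /* assumed here to hold the stored number */
    struct Hash_node *next;
};

static int hash(int num) { return num % LEN; }

/* Head insertion, as in the chain address method. */
static void collision(struct Hash_node *vec[], int elem, struct Hash_node *new)
{
    new->next = vec[elem];
    vec[elem] = new;
}

/* Hypothetical helper: store num; returns 0 on success, -1 if malloc fails. */
static int table_put(struct Hash_node *vec[], int num)
{
    assert(num >= 0);
    struct Hash_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->count = num;
    collision(vec, hash(num), node);
    return 0;
}

/* Hypothetical helper: returns 1 if num is stored, 0 otherwise. */
static int table_has(struct Hash_node *const vec[], int num)
{
    assert(num >= 0);
    for (const struct Hash_node *p = vec[hash(num)]; p != NULL; p = p->next)
        if (p->count == num)
            return 1;
    return 0;
}
```

After storing 3 and then 10, both of which hash to address 3, the list at vec[3] holds 10 first and 3 second, because the latest insert goes to the head. A complete program would also free the nodes when the table is no longer needed.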
