C implementation of the hash table data structure

Source: Internet
Author: User

All source code for this blog: GitHub hosted project address

Introduction to the chained hash table

A chained hash table is fundamentally a set of linked lists. Each list can be viewed as a "bucket," and hashing distributes the elements among the buckets. To insert an element, you first pass its key to a hash function (this is called hashing the key), which tells you which bucket the element belongs to, and then you insert the element at the head of the corresponding list. To look up or remove an element, you hash its key the same way to locate the bucket, then traverse that list until you find the element you want.

Ways to resolve conflicts

1. Open addressing method

This method is also called the re-hashing method. The basic idea is: when the hash address p = H(key) of a keyword conflicts, another hash address p1 is generated based on p; if p1 still conflicts, another hash address p2 is generated based on p, and so on, until a non-conflicting hash address pi is found, and the element is stored there. This method has a general form of hash function:

Hi = (H(key) + di) % m,  i = 1, 2, ..., n

where H(key) is the hash function, m is the table length, and di is called the increment sequence. Different choices of increment sequence give different re-hashing schemes. There are three main kinds:

• Linear probing

di = 1, 2, 3, ..., m-1

With this approach, when a conflict occurs, the following cells of the table are examined one after another until an empty cell is found or the whole table has been searched.
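Linear probing as described above can be sketched in C as follows. This is a minimal illustration, not the article's own code: the names `lp_init`, `lp_insert`, `lp_search` and the `EMPTY` marker are hypothetical, keys are assumed to be non-negative integers, the hash function is assumed to be H(key) = key % m, and deletion is not handled.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY (-1)   /* assumed marker for an unused cell; keys must be >= 0 */

/* Mark every cell of the table as unused. */
void lp_init(int *table, size_t m)
{
    for (size_t i = 0; i < m; i++)
        table[i] = EMPTY;
}

/* Insert key using linear probing.
 * Returns the index of the cell used, or -1 if the whole table was searched. */
int lp_insert(int *table, size_t m, int key)
{
    assert(m > 0 && key >= 0);
    size_t home = (size_t)key % m;            /* H(key) = key % m */
    for (size_t d = 0; d < m; d++) {          /* d = 0 is the home cell, then di = 1..m-1 */
        size_t pos = (home + d) % m;
        if (table[pos] == EMPTY || table[pos] == key) {
            table[pos] = key;
            return (int)pos;
        }
    }
    return -1;
}

/* Look up key. Returns its cell index, or -1 if it is not in the table. */
int lp_search(const int *table, size_t m, int key)
{
    assert(m > 0 && key >= 0);
    size_t home = (size_t)key % m;
    for (size_t d = 0; d < m; d++) {
        size_t pos = (home + d) % m;
        if (table[pos] == EMPTY)
            return -1;                        /* an empty cell ends the probe chain */
        if (table[pos] == key)
            return (int)pos;
    }
    return -1;
}
```

With an 11-cell table, inserting 3, 14 and 25 (all of which hash to cell 3) places them in cells 3, 4 and 5 in that order, and a later search for any of them walks the same cells.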

• Quadratic probing

di = 1², -1², 2², -2², ..., k², -k² (k <= m/2)

With this approach, when a conflict occurs, probing jumps back and forth on both sides of the original address, which is more flexible.

• Pseudo-random probing

di = a pseudo-random number sequence.

To implement it, set up a pseudo-random number generator (such as i = (i + p) % m) and give it a random number as the starting point.
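The generator i = (i + p) % m mentioned above might be wrapped as in the sketch below. The type `prng_t` and the functions `prng_init`, `prng_next` and `probe_sequence` are hypothetical names introduced for illustration. The sketch assumes that p and m share no common factor, so that the sequence visits every value 0..m-1 once before repeating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generator state for the recurrence i = (i + p) % m. */
typedef struct {
    size_t i;   /* current value, initialised from the starting point (seed) */
    size_t p;   /* step; assumed to share no common factor with m */
    size_t m;   /* table length */
} prng_t;

/* Set the starting point and the step of the generator. */
void prng_init(prng_t *g, size_t seed, size_t p, size_t m)
{
    assert(m > 0);
    g->i = seed % m;
    g->p = p % m;
    g->m = m;
}

/* Advance the generator and return the next increment di. */
size_t prng_next(prng_t *g)
{
    g->i = (g->i + g->p) % g->m;
    return g->i;
}

/* Fill out[0..n-1] with the probe addresses (home + di) % m. */
void probe_sequence(prng_t *g, size_t home, size_t *out, size_t n)
{
    for (size_t k = 0; k < n; k++)
        out[k] = (home + prng_next(g)) % g->m;
}
```

Insertion and lookup must start the generator from the same starting point, otherwise a lookup would probe different cells from the ones the insertion tried.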

For example, suppose the hash table length is m = 11 and the hash function is H(key) = key % 11. Then H(47) = 3, H(26) = 4, and H(60) = 5. Suppose the next keyword is 69: H(69) = 3, which conflicts with 47.

If the conflict is handled with linear probing, the next hash address is H1 = (3 + 1) % 11 = 4, which still conflicts; the next is H2 = (3 + 2) % 11 = 5, which also conflicts; the next is H3 = (3 + 3) % 11 = 6, which no longer conflicts, so 69 is placed in cell 6 (see Figure 8.26 (a)).

If the conflict is handled with quadratic probing, the next hash address is H1 = (3 + 1²) % 11 = 4, which still conflicts; the next is H2 = (3 - 1²) % 11 = 2, which no longer conflicts, so 69 is placed in cell 2 (see Figure 8.26 (b)).

If the conflict is handled with pseudo-random probing and the pseudo-random number sequence is 2, 5, 9, ..., the next hash address is H1 = (3 + 2) % 11 = 5, which still conflicts; the next is H2 = (3 + 5) % 11 = 8, which no longer conflicts, so 69 is placed in cell 8.
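The worked example can be checked with a small C sketch of the general form Hi = (H(key) + di) % m, in which the increment sequence is passed in as an array. The function name `probe_with_increments` and the `EMPTY` marker are assumptions made for this illustration, and keys are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY (-1)   /* assumed marker for an unused cell */

/* Return the first free address for key, trying the home address H(key)
 * first and then Hi = (H(key) + d[i]) % m for each increment in turn.
 * Returns -1 if every address tried is occupied. */
int probe_with_increments(const int *table, int m, int key,
                          const int *d, size_t n)
{
    assert(m > 0 && key >= 0);
    int home = key % m;
    if (table[home] == EMPTY)
        return home;
    for (size_t i = 0; i < n; i++) {
        /* Adding m before the second % keeps the result in 0..m-1
         * even when the increment is negative. */
        int pos = ((home + d[i]) % m + m) % m;
        if (table[pos] == EMPTY)
            return pos;
    }
    return -1;
}
```

With m = 11 and cells 3, 4 and 5 occupied, key 69 resolves to cell 6 for the increments 1, 2, 3, ..., to cell 2 for 1, -1, 4, -4, ..., and to cell 8 for 2, 5, 9, matching the three results above.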

2. Re-hashing method

This approach is to construct several different hash functions at the same time:

Hi = RHi(key),  i = 1, 2, ..., k

When the hash address H1 = RH1(key) conflicts, H2 = RH2(key) is computed, and so on, until no conflict occurs. This method is less prone to clustering, but it increases computation time.
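A rough C sketch of this idea follows. The three hash functions `rh1`, `rh2` and `rh3` are arbitrary examples chosen for illustration, not functions given in the article, and `rehash_insert`, `hash_fn` and `EMPTY` are likewise hypothetical names. Keys are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY (-1)   /* assumed marker for an unused cell */

typedef size_t (*hash_fn)(int key, size_t m);

/* Illustrative hash functions RH1..RH3 (assumptions, not from the article). */
size_t rh1(int key, size_t m) { return (size_t)key % m; }
size_t rh2(int key, size_t m) { return ((size_t)key / m) % m; }
size_t rh3(int key, size_t m) { return ((size_t)key * 7u + 1u) % m; }

/* Try RH1, RH2, ..., RHk in order and store key in the first free cell.
 * Returns the cell index used, or -1 if all k addresses conflict. */
int rehash_insert(int *table, size_t m, int key, const hash_fn *rh, size_t k)
{
    assert(m > 0 && key >= 0);
    for (size_t i = 0; i < k; i++) {
        size_t pos = rh[i](key, m);
        if (table[pos] == EMPTY) {
            table[pos] = key;
            return (int)pos;
        }
    }
    return -1;
}
```

A lookup would have to evaluate the same functions in the same order and compare the stored key at each address. The extra hash evaluations are the computation cost mentioned above.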

3. Chain address method

The basic idea of this method is that all elements whose hash address is i are linked into a singly linked list called a synonym chain, and the head pointer of that list is stored in cell i of the hash table, so lookup, insertion, and deletion are mainly carried out within the synonym chain. The chain address method suits situations where insertions and deletions are frequent.
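A minimal C sketch of the chain address method is shown below. It is not the code from the project mentioned at the top of this post: the type and function names (`chain_table_t`, `ct_insert`, `ct_find`, `ct_remove`), the fixed table length of 11, and the hash function key % 11 are all assumptions made for illustration, and keys are assumed to be non-negative integers.

```c
#include <assert.h>
#include <stdlib.h>

#define NBUCKETS 11   /* assumed table length */

typedef struct node {
    int key;
    struct node *next;
} node_t;

typedef struct {
    node_t *heads[NBUCKETS];   /* heads[i] is the head pointer of synonym chain i */
} chain_table_t;

static size_t bucket_of(int key) { return (size_t)key % NBUCKETS; }

/* Start with every synonym chain empty. */
void ct_init(chain_table_t *t)
{
    for (size_t i = 0; i < NBUCKETS; i++)
        t->heads[i] = NULL;
}

/* Walk the synonym chain for key; returns its node, or NULL if absent. */
node_t *ct_find(const chain_table_t *t, int key)
{
    assert(key >= 0);
    for (node_t *n = t->heads[bucket_of(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* Insert key at the head of its chain.
 * Returns 0 on success, 1 if the key is already present, -1 if malloc fails. */
int ct_insert(chain_table_t *t, int key)
{
    if (ct_find(t, key) != NULL)
        return 1;
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    size_t b = bucket_of(key);
    n->key = key;
    n->next = t->heads[b];
    t->heads[b] = n;
    return 0;
}

/* Unlink and free the node holding key. Returns 0 if removed, -1 if absent. */
int ct_remove(chain_table_t *t, int key)
{
    assert(key >= 0);
    node_t **link = &t->heads[bucket_of(key)];
    while (*link != NULL) {
        if ((*link)->key == key) {
            node_t *dead = *link;
            *link = dead->next;
            free(dead);
            return 0;
        }
        link = &(*link)->next;
    }
    return -1;
}
```

Because a conflict only lengthens one chain, insertion and deletion never move other elements, which is one reason this method copes well with frequent insertions and deletions.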

4. Public overflow area method

The basic idea of this method is: the hash table is divided into two parts, a basic table and an overflow table, and every element that conflicts with an entry in the basic table is placed in the overflow table.

Methods of constructing a hash table

Direct addressing method

For example: there is a population table for ages 1 to 100 in which the age is the keyword, and the hash function takes the keyword itself.

Digit analysis method

There is student birthday data as follows:
