Simple implementation of HashTable in C Language

Source: Internet
Author: User

A hash table is a very important data structure in practical applications. The following describes a simple implementation. Although simple, it covers all the basic operations and is fully usable.


1. Access interface

Create a hashtable.

hashtable hashtable_new(int size); // size is the number of buckets (head nodes) in the table.

Store a key-value pair in the hashtable.

void hashtable_put(hashtable h, const char *key, void *val);

Get a value from the hashtable by key.

void *hashtable_get(hashtable h, const char *key);

Release the hashtable.

void hashtable_free(hashtable h);

Release a single hash node.

void hashtable_delete_node(hashtable h, const char *key);

2. Data structure

Structure of a hash node:

[Cpp]
typedef struct hashnode_struct {
    struct hashnode_struct *next; // next node in the same bucket (conflict chain)
    const char *key;
    void *val;
} *hashnode, _hashnode;
This structure is easy to understand. In addition to the required key and value, it contains a next pointer that links the nodes whose keys hash to the same bucket, which is how conflicts are handled.
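
To illustrate how the next pointer resolves a conflict, here is a minimal sketch that builds a two-node chain by hand and searches it. The helper names chain_find and chain_demo and the sample keys are assumptions made for this illustration; they are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same node layout as in the article. */
typedef struct hashnode_struct {
    struct hashnode_struct *next;
    const char *key;
    void *val;
} *hashnode, _hashnode;

/* Hypothetical helper: walk one bucket's chain and return the node
 * whose key equals `key`, or NULL if the chain does not contain it. */
static hashnode chain_find(hashnode head, const char *key)
{
    hashnode node;

    assert(key != NULL);
    for (node = head; node != NULL; node = node->next)
        if (node->key != NULL && strcmp(node->key, key) == 0)
            return node;
    return NULL;
}

/* Builds a chain as if "apple" and "melon" had landed in the same
 * bucket, then looks up `key` in it.
 * Returns the stored number, or -1 when the key is absent. */
static int chain_demo(const char *key)
{
    static int one = 1, two = 2;
    _hashnode second = { NULL, "melon", &two };
    _hashnode first = { &second, "apple", &one };
    hashnode hit = chain_find(&first, key);

    return hit != NULL ? *(int *)hit->val : -1;
}
```

Both keys stay reachable even though they share one bucket; the cost is a linear walk along the chain.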

Hashtable data structure:

[Cpp]
typedef struct hashtable_struct {
    pool_t p;
    int size;
    int count;
    struct hashnode_struct *z;
} *hashtable, _hashtable;
The fields are described as follows:
p: the memory pool (pool_t) that manages all memory used by the hashtable. For details about this structure, refer to "C language memory pool usage model".

size: the number of buckets in the hash node array.

count: the number of nodes currently in use in the table.

z: the array of bucket head nodes in which the nodes are stored.
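
The memory pool itself is not shown in this article. As a rough idea of what such a pool might look like, here is a minimal sketch of a block-based pool. Only the names pool_t, _pool_new_heap, pool_malloc and pool_free come from the article; their signatures, the pool_block type, and the POOL_ALIGN constant are assumptions, and the real pool in the referenced article may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_ALIGN 16 /* assumed alignment, enough for the structs used here */

/* One heap block owned by the pool (hypothetical layout). */
typedef struct pool_block {
    struct pool_block *next;
    char *mem;
    size_t used, cap;
} pool_block;

typedef struct pool_struct {
    pool_block *blocks;   /* newest block first */
    size_t block_size;    /* default capacity of a new block */
} *pool_t;

static pool_block *pool_block_new(size_t cap)
{
    pool_block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->mem = malloc(cap);
    if (b->mem == NULL) {
        free(b);
        return NULL;
    }
    b->next = NULL;
    b->used = 0;
    b->cap = cap;
    return b;
}

/* Create a pool whose first block holds about `size` bytes. */
pool_t _pool_new_heap(size_t size)
{
    pool_t p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->block_size = size > 0 ? size : POOL_ALIGN;
    p->blocks = pool_block_new(p->block_size);
    if (p->blocks == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/* Hand out `size` bytes from the newest block, adding a block when full. */
void *pool_malloc(pool_t p, size_t size)
{
    size_t need;
    pool_block *b;

    assert(p != NULL);
    need = (size + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (need == 0)
        need = POOL_ALIGN;
    b = p->blocks;
    if (b->cap - b->used < need) {
        b = pool_block_new(need > p->block_size ? need : p->block_size);
        if (b == NULL)
            return NULL;
        b->next = p->blocks;
        p->blocks = b;
    }
    b->used += need;
    return b->mem + b->used - need;
}

/* Release every block, and with it everything allocated from the pool. */
void pool_free(pool_t p)
{
    pool_block *b, *next;

    if (p == NULL)
        return;
    for (b = p->blocks; b != NULL; b = next) {
        next = b->next;
        free(b->mem);
        free(b);
    }
    free(p);
}
```

The point of the design is that individual allocations are never freed one by one; a single pool_free call releases the table, the bucket array, and every chained node together.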

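3. Create hashtable

The code is as follows:

[Cpp]
hashtable hashtable_new(int size)
{
    hashtable ht;
    pool_t p;

    p = _pool_new_heap(sizeof(_hashnode) * size + sizeof(_hashtable));
    ht = pool_malloc(p, sizeof(_hashtable));
    ht->size = size;
    ht->count = 0;
    ht->p = p;
    ht->z = pool_malloc(p, sizeof(_hashnode) * size);
    // Clear the bucket array so that every key and next pointer starts as NULL
    // (needed unless pool_malloc already returns zeroed memory; requires <string.h>).
    memset(ht->z, 0, sizeof(_hashnode) * size);
    return ht;
}
This function is relatively simple. It first creates a memory pool whose initial size is derived from size, then allocates the table and its bucket array from that pool. In actual use, it is better to choose a relatively large size, because more buckets mean fewer conflicts.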

4. Store the key-value

Before this operation, define a function that calculates the hash code of a key.

[Cpp]
static int hashcode(const char *s, int len)
{
    const unsigned char *name = (const unsigned char *)s;
    unsigned long h = 0, g;
    int i;

    for (i = 0; i < len; i++)
    {
        h = (h << 4) + (unsigned long)(name[i]); // shift the hash four bits to the left and add the current character.
        if ((g = (h & 0xF0000000UL)) != 0)
            h ^= (g >> 24);
        h &= 0x0FFFFFFFUL; // clear bits 28 and above, so the result always fits in a non-negative int.
    }

    return (int)h;
}
This function uses the ELF hash algorithm.
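
As a small worked example of how a key ends up in a bucket, the sketch below repeats the same hashing steps and reduces the result modulo the bucket count. The names elf_hash and bucket_index and the bucket count used in the usage note are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* ELF-style string hash, following the same steps as hashcode. */
static unsigned long elf_hash(const char *s, size_t len)
{
    const unsigned char *name = (const unsigned char *)s;
    unsigned long h = 0, g;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        h = (h << 4) + name[i];   /* make room for 4 bits, add the byte */
        g = h & 0xF0000000UL;     /* bits 28-31 that spilled over */
        if (g != 0)
            h ^= g >> 24;         /* fold them back into the low bits */
        h &= 0x0FFFFFFFUL;        /* keep only the low 28 bits */
    }
    return h;
}

/* Hypothetical helper: map a key to a bucket index in [0, nbuckets). */
static size_t bucket_index(const char *s, size_t len, size_t nbuckets)
{
    assert(nbuckets > 0);
    return (size_t)(elf_hash(s, len) % nbuckets);
}
```

For the key "ab" the hash is (97 << 4) + 98 = 1650, so with 16 buckets the key lands in bucket 1650 % 16 = 2.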
hashtable_put is implemented as follows:

[Cpp]
void hashtable_put(hashtable h, const char *key, void *val)
{
    if (h == NULL || key == NULL)
        return;

    int len = strlen(key);
    int index = hashcode(key, len);
    hashnode node;

    if ((node = hashtable_node_get(h, key, len, index)) != NULL) // if the key already exists, replace its value with the new one.
    {
        node->key = key;
        node->val = val;
        return;
    }

    node = hashnode_node_new(h, index); // create a new hash node.
    node->key = key;
    node->val = val;
}
hashtable_node_get checks whether the key already exists in the table. The implementation is simple:
[Cpp]
static hashnode hashtable_node_get(hashtable h, const char *key, int len, int index)
{
    hashnode node;
    int i = index % h->size;
    for (node = &h->z[i]; node != NULL; node = node->next) // search the bucket that corresponds to the hash value
        if (node->key != NULL && ((int)strlen(node->key) == len) && (strncmp(key, node->key, len) == 0))
            return node;
    return NULL;
}
A new hash node is created as follows:
[Cpp]
static hashnode hashnode_node_new(hashtable h, int index)
{
    hashnode node;
    int i = index % h->size;

    h->count++;

    for (node = &h->z[i]; node != NULL; node = node->next)
        if (node->key == NULL) // a node in the bucket with a NULL key is unused, so reuse it for the new entry.
            return node;

    node = pool_malloc(h->p, sizeof(_hashnode)); // allocate a new node
    node->next = h->z[i].next; // insert it into the bucket right after the head node.
    h->z[i].next = node;
    return node;
}
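
To see the put and get steps working together, here is a compact, self-contained sketch of the same scheme. It is not the article's code: it uses malloc and calloc from the standard library in place of the memory pool, and the names table, node, and the mini_ functions are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct node { struct node *next; const char *key; void *val; } node;
typedef struct table { int size; int count; node *z; } table;

/* ELF-style hash, reduced to a bucket index. */
static int mini_index(const table *t, const char *key)
{
    unsigned long h = 0, g;
    for (; *key != '\0'; key++) {
        h = (h << 4) + (unsigned char)*key;
        if ((g = h & 0xF0000000UL) != 0)
            h ^= g >> 24;
        h &= 0x0FFFFFFFUL;
    }
    return (int)(h % (unsigned long)t->size);
}

table *mini_new(int size)
{
    table *t;
    assert(size > 0);
    t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->z = calloc((size_t)size, sizeof(node)); /* zeroed head nodes */
    if (t->z == NULL) {
        free(t);
        return NULL;
    }
    t->size = size;
    t->count = 0;
    return t;
}

/* Return the node that holds `key`, or NULL. */
static node *mini_find(table *t, const char *key)
{
    node *n;
    for (n = &t->z[mini_index(t, key)]; n != NULL; n = n->next)
        if (n->key != NULL && strcmp(n->key, key) == 0)
            return n;
    return NULL;
}

/* Insert or replace; returns 0 on success, -1 if out of memory. */
int mini_put(table *t, const char *key, void *val)
{
    node *head, *n = mini_find(t, key);
    if (n == NULL) {
        head = &t->z[mini_index(t, key)];
        for (n = head; n != NULL && n->key != NULL; n = n->next)
            ;                          /* look for an unused node */
        if (n == NULL) {               /* none free: chain a new one */
            n = calloc(1, sizeof *n);
            if (n == NULL)
                return -1;
            n->next = head->next;
            head->next = n;
        }
        t->count++;
    }
    n->key = key;
    n->val = val;
    return 0;
}

void *mini_get(table *t, const char *key)
{
    node *n = mini_find(t, key);
    return n != NULL ? n->val : NULL;
}

void mini_free(table *t)
{
    int i;
    if (t == NULL)
        return;
    for (i = 0; i < t->size; i++) {
        node *n = t->z[i].next;        /* head nodes live in the array */
        while (n != NULL) {
            node *next = n->next;
            free(n);
            n = next;
        }
    }
    free(t->z);
    free(t);
}
```

Like the article's code, this sketch stores the key pointer without copying the string, so the key must stay valid for as long as the entry is in the table.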

5. Get a node from the hashtable
To get a value from the hashtable by key, calculate the hash value of the key and then search the corresponding bucket for the matching node. As follows:

[Cpp]
void *hashtable_get(hashtable h, const char *key)
{
    if (h == NULL || key == NULL)
        return NULL;
    hashnode node;
    int len = strlen(key);
    if (len <= 0 || (node = hashtable_node_get(h, key, len, hashcode(key, len))) == NULL)
    {
        return NULL;
    }
    return node->val;
}
The function returns NULL when the key is not found; otherwise it returns the stored value.

6. Release the hashtable
Releasing the hashtable is relatively simple. Because all memory is allocated from the memory pool, we only need to release the pool, as shown below:
[Cpp]
void hashtable_free(hashtable h)
{
    if (h != NULL)
        pool_free(h->p);
}

7. Release a single hash node
The code is as follows:

[Cpp]
void hashtable_delete_node(hashtable h, const char *key)
{
    if (h == NULL || key == NULL)
        return;
    hashnode node;
    int len = strlen(key);
    if ((node = hashtable_node_get(h, key, len, hashcode(key, len))) == NULL) // no such node
        return;

    node->key = NULL;
    node->val = NULL;

    h->count--;
}
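
Note that deletion does not unlink the node; it only marks it as unused by clearing the key, and a later insert into the same bucket can reuse it. The following minimal sketch shows that behavior on a single hand-built chain. The names chain_delete, chain_reuse, and reuse_demo are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct node { struct node *next; const char *key; void *val; } node;

/* Mark the node holding `key` as unused. Returns 1 if found, else 0. */
static int chain_delete(node *head, const char *key)
{
    node *n;

    assert(head != NULL && key != NULL);
    for (n = head; n != NULL; n = n->next)
        if (n->key != NULL && strcmp(n->key, key) == 0) {
            n->key = NULL;   /* the node stays in the chain */
            n->val = NULL;
            return 1;
        }
    return 0;
}

/* Return the first unused node in the chain, or NULL if all are in use. */
static node *chain_reuse(node *head)
{
    node *n;
    for (n = head; n != NULL; n = n->next)
        if (n->key == NULL)
            return n;
    return NULL;
}

/* Deletes "b" from the chain a -> b -> c and checks that the node
 * offered for reuse is exactly the one that was freed. */
static int reuse_demo(void)
{
    node c = { NULL, "c", NULL };
    node b = { &c, "b", NULL };
    node a = { &b, "a", NULL };

    if (!chain_delete(&a, "b"))
        return 0;
    return chain_reuse(&a) == &b;
}
```

A consequence of this approach is that chains never shrink; memory is only given back when the whole pool is released.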

This completes a simple hashtable. Of course, there are still some shortcomings, such as traversal: walking the bucket array directly to visit the entries is inefficient. The next section discusses a scheme for traversing the hashtable.


8. Discussion on hashtable traversal

The table can be traversed directly through the array, that is, the struct hashnode_struct array in the hashtable. However, even if the table contains only one node, the entire array must be walked, as follows:

[Cpp]
void hashtable_traverse(hashtable h)
{
    int i;
    hashnode node;
    if (h == NULL)
        return;
    for (i = 0; i < h->size; i++)
        for (node = &h->z[i]; node != NULL; node = node->next)
            if (node->key != NULL && node->val != NULL)
            {
                // process node->key and node->val here.
            }
}
This is very inefficient. In fact, each node already contains a next field, and this can be used to implement the traversal.

You need to make a simple change to the previous hashtable data structure and add two fields:

[Cpp]
typedef struct hashtable_struct {
    pool_t p;
    int size;
    int count;
    struct hashnode_struct *z;
    int bucket;
    hashnode node;
} *hashtable, _hashtable;
The two new fields are bucket and node. They are used as follows:
node is the cursor of the current traversal. During the traversal, the cursor keeps moving to the next node.
bucket is associated with node and records the index of the bucket that the current node belongs to.
First, set up the traversal. By convention, a function named XXX_iter_first is used to initialize the cursor, as shown below:

[Cpp]
int hashtable_iter_first(hashtable h)
{
    if (h == NULL)
        return 0;
    h->bucket = -1;
    h->node = NULL;
    return hashtable_iter_next(h);
}
hashtable_iter_next moves the cursor to the next node. Once the cursor is set, the next node can be found quickly. It is defined as follows:
[Cpp]
int hashtable_iter_next(hashtable h)
{
    if (h == NULL) return 0;
    while (h->node != NULL) {
        h->node = h->node->next; // move to the next node in the bucket; if it is in use, return success.
        if (h->node != NULL && h->node->key != NULL && h->node->val != NULL)
            return 1;
    }
    for (h->bucket++; h->bucket < h->size; h->bucket++) {
        h->node = &h->z[h->bucket];

        while (h->node != NULL) {
            if (h->node->key != NULL && h->node->val != NULL)
                return 1;
            h->node = h->node->next;
        }
    }
    h->bucket = -1; // there is no next node.
    h->node = NULL;
    return 0;
}
With the above two functions, the traversal is done as follows:
[Cpp]
hashtable ht;
if (hashtable_iter_first(ht)) // get the first node.
    do {
        // at this point ht->node is the current node and can be processed.
    } while (hashtable_iter_next(ht)); // get the next node
Done this way, traversal is much more efficient. Of course, the first call still has to walk the array and the nodes chained under each bucket. However, an extra step is required when deleting a node: check whether the cursor h->node currently points to the node being deleted. If it does, move h->node to the next node first, and then complete the deletion.
In hashtable_delete_node, before the node is cleared, handle this case as follows:

[Cpp]
if (h->node == node)
    hashtable_iter_next(h);
This moves h->node to the next node.
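
The cursor scheme, including the deletion case, can be tried out with the following self-contained sketch. It works on a small table built by hand rather than on the article's pool-backed structures; the type and function names (table, node, in_use, iter_first, iter_next, delete_node, and the two demo functions) are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node { struct node *next; const char *key; void *val; } node;
typedef struct table { int size; node *z; int bucket; node *cur; } table;

static int in_use(const node *n)
{
    return n != NULL && n->key != NULL && n->val != NULL;
}

/* Advance the cursor to the next node in use; return 0 at the end. */
static int iter_next(table *t)
{
    assert(t != NULL);
    while (t->cur != NULL) {              /* rest of the current chain */
        t->cur = t->cur->next;
        if (in_use(t->cur))
            return 1;
    }
    for (t->bucket++; t->bucket < t->size; t->bucket++)
        for (t->cur = &t->z[t->bucket]; t->cur != NULL; t->cur = t->cur->next)
            if (in_use(t->cur))
                return 1;
    t->bucket = -1;                       /* traversal finished */
    t->cur = NULL;
    return 0;
}

static int iter_first(table *t)
{
    t->bucket = -1;
    t->cur = NULL;
    return iter_next(t);
}

/* Mark a node unused, stepping the cursor off it first if needed. */
static void delete_node(table *t, node *n)
{
    if (t->cur == n)
        iter_next(t);
    n->key = NULL;
    n->val = NULL;
}

/* Counts the nodes in use in a 3-bucket table holding "a", "b", "c". */
static int iter_demo(void)
{
    static int v = 1;
    node extra = { NULL, "b", &v };
    node heads[3] = { { &extra, "a", &v }, { NULL, NULL, NULL }, { NULL, "c", &v } };
    table t = { 3, heads, -1, NULL };
    int seen = 0;

    if (iter_first(&t))
        do { seen++; } while (iter_next(&t));
    return seen;
}

/* Deletes the node under the cursor; returns how many nodes remain,
 * or -1 if the cursor did not move on to the next node. */
static int delete_demo(void)
{
    static int v = 1;
    node extra = { NULL, "b", &v };
    node heads[2] = { { &extra, "a", &v }, { NULL, "c", &v } };
    table t = { 2, heads, -1, NULL };
    int left = 0;

    iter_first(&t);                       /* cursor rests on "a" */
    delete_node(&t, &heads[0]);           /* cursor should step to "b" */
    if (t.cur != &extra)
        return -1;
    if (iter_first(&t))
        do { left++; } while (iter_next(&t));
    return left;
}
```

One thing to keep in mind with this design: the cursor lives inside the table, so only one traversal can be in progress at a time.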
