Dictionary: dict.c/dict.h



Redis Source Analysis (1): Dictionaries and Hash Tables (dict.c and dict.h)
http://huangz.iteye.com/blog/1455808
This article covers two points:
How the dictionary structure operates
The progressive (incremental) rehash of a hash table
The hash table is one of the core data structures of Redis. In the Redis source code it is defined in dict.c and dict.h.


The three core data structures: dict, dictht, and dictEntry

/* Dictionary structure */
typedef struct dict {
    dictType *type;   // Family of type-specific functions for the keys and values stored in the hash table
    void *privdata;   // Optional private data passed to the type-specific functions
    dictht ht[2];     // Each dictionary uses two hash tables
    int rehashidx;    // Rehash progress indicator; -1 if no rehash is in progress
    int iterators;    // Number of iterators currently in use
} dict;

The code comments explain the role of most attributes. A few points need elaboration:

Each dictionary uses two hash tables because, to implement incremental rehash, Redis moves the elements of hash table 0 (ht[0]) to hash table 1 (ht[1]) a little at a time, until hash table 0 is empty.


In addition, rehashidx actually records the index that the rehash has reached. For example, if the rehash has reached the 10th element, the value of rehashidx is 9, and so on. If no rehash is in progress, the value of rehashidx is -1.
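
The incremental rehash described above can be sketched as a function that migrates one bucket per call. This is a minimal illustration under stated assumptions, not the code from dict.c: the names rhEntry, rhTable, rhDict, and rh_step are hypothetical, each entry carries a precomputed hash instead of a key, and the final step (promoting table 1 to table 0 once table 0 is empty) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for dictEntry, dictht and dict. */
typedef struct rhEntry {
    unsigned long hash;          /* precomputed hash, stands in for the key */
    struct rhEntry *next;
} rhEntry;

typedef struct rhTable {
    rhEntry **table;
    unsigned long size, sizemask, used;
} rhTable;

typedef struct rhDict {
    rhTable ht[2];
    int rehashidx;               /* -1 when no rehash is in progress */
} rhDict;

/* Moves one non-empty bucket from ht[0] to ht[1].
 * Returns 1 if more buckets remain, 0 if the rehash is finished
 * (or was not in progress). Assumes that while a rehash is running,
 * new entries are only ever added to ht[1]. */
static int rh_step(rhDict *d)
{
    if (d->rehashidx == -1) return 0;

    if (d->ht[0].used > 0) {
        /* Skip buckets that are already empty. */
        while (d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            assert((unsigned long)d->rehashidx < d->ht[0].size);
        }

        /* Relink every entry of this bucket into ht[1]. */
        rhEntry *he = d->ht[0].table[d->rehashidx];
        while (he != NULL) {
            rhEntry *next = he->next;
            unsigned long idx = he->hash & d->ht[1].sizemask;
            he->next = d->ht[1].table[idx];
            d->ht[1].table[idx] = he;
            d->ht[0].used--;
            d->ht[1].used++;
            he = next;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    if (d->ht[0].used == 0) {
        /* Table 0 is empty: table 1 takes its place, rehash is over. */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1].table = NULL;
        d->ht[1].size = d->ht[1].sizemask = d->ht[1].used = 0;
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

Calling rh_step repeatedly until it returns 0 completes the migration. Spreading the calls across ordinary dictionary operations is what makes the rehash incremental.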


The hash table structure is dictht. It is a separate-chaining hash table: collisions are resolved by placing elements that map to the same bucket into a linked list:
typedef struct dictht {
    dictEntry **table;       // Array of node pointers (the buckets)
    unsigned long size;      // Number of buckets
    unsigned long sizemask;  // Mask used to compute the bucket index
    unsigned long used;      // Number of nodes currently stored
} dictht;

The table attribute is an array of node pointers; each element is the head of a linked list of dictEntry nodes.

The three attributes size, sizemask, and used can be confusing at first. They mean the following:
size: The number of buckets, that is, the length of the table array.
sizemask: Computed as size - 1. After the hash value of a key is computed, it is ANDed (&) with sizemask to determine which slot of the table array the element goes into.
used: The number of elements currently in the hash table, that is, how many dictEntry structures the table stores in total.
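
The role of sizemask can be shown with a small helper. This is a sketch, not code from dict.c: demo_bucket_index is a hypothetical name, and it assumes that size is a nonzero power of two, which is what makes the mask equivalent to a modulo operation.

```c
#include <assert.h>

/* Hypothetical helper, not part of dict.c: maps a hash value to a bucket
 * index. Requires size to be a nonzero power of two, so that size - 1 has
 * all of its low bits set and the AND gives the same result as hash % size. */
static unsigned long demo_bucket_index(unsigned long hash, unsigned long size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return hash & sizemask;
}
```

For example, with size = 4 the mask is 3 (binary 11), so a hash value of 13 (binary 1101) lands in bucket 1.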


Linked list node structure:
typedef struct dictEntry {
    void *key;               // Key
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;                     // Value (can be one of several types)
    struct dictEntry *next;  // Pointer to the next hash node (forms the linked list)
} dictEntry;
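
To make the separate-chaining layout concrete, the following sketch adds nodes to a bucket and walks the chain to find a key. The names chainEntry, chain_push, and chain_find are hypothetical, the node is a simplified stand-in for dictEntry (string key, no value union), and inserting at the head of the chain is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for dictEntry. */
typedef struct chainEntry {
    const char *key;
    struct chainEntry *next;
} chainEntry;

/* Pushes e onto the head of the chain stored in bucket idx. */
static void chain_push(chainEntry **table, unsigned long idx, chainEntry *e)
{
    assert(table != NULL && e != NULL);
    e->next = table[idx];
    table[idx] = e;
}

/* Walks the chain in bucket idx; returns the matching node or NULL. */
static chainEntry *chain_find(chainEntry **table, unsigned long idx,
                              const char *key)
{
    for (chainEntry *he = table[idx]; he != NULL; he = he->next) {
        if (strcmp(he->key, key) == 0) return he;
    }
    return NULL;
}
```

Two keys that map to the same bucket simply end up as neighbors in that bucket's list, so a lookup costs one index computation plus a walk along a (usually short) chain.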


Dictionary creation process

With a first understanding of the core data structures, it is time to look at how the related functions use them. Let's start with the creation of a dictionary and step through how dictionaries (and hash tables) work.

Because the call flow gives a high-level view of how the data structures work without getting bogged down in details, this article shows only the core code along each call path. If you are interested in the remaining details, go to my github to find the annotated version of the code. It contains the complete code, with comments added to most functions.

Back to the dictionary. Creating a new dictionary executes this call chain: dictCreate -> _dictInit -> _dictReset

dictCreate allocates space for the dict structure and passes the new dict to _dictInit, which initializes the attributes of the dict structure. _dictInit in turn calls _dictReset to set the constant attributes of the dictionary's ht attribute (that is, of the two hash tables).

Note that _dictReset only sets the constant attributes (size, sizemask, and used) of the dictionary's two hash tables. It does not allocate memory for the bucket array (table) of either hash table:

static void _dictReset(dictht *ht)
{
    ht->table = NULL;
    ht->size = 0;
    ht->sizemask = 0;
    ht->used = 0;
}
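
The creation chain can be condensed into a small standalone sketch. The names sketchTable, sketchDict, sketch_reset, sketch_init, and sketch_create are hypothetical, malloc stands in for the Redis allocator, and setting rehashidx to -1 and iterators to 0 during initialization is an assumption consistent with the field descriptions earlier in this article.

```c
#include <assert.h>
#include <stdlib.h>

struct sketchEntry;                       /* node type is irrelevant here */

typedef struct sketchTable {
    struct sketchEntry **table;
    unsigned long size, sizemask, used;
} sketchTable;

typedef struct sketchDict {
    sketchTable ht[2];
    int rehashidx;
    int iterators;
} sketchDict;

/* Mirrors _dictReset: clears the fields, allocates nothing. */
static void sketch_reset(sketchTable *ht)
{
    ht->table = NULL;
    ht->size = 0;
    ht->sizemask = 0;
    ht->used = 0;
}

/* Plays the role of _dictInit: resets both tables and the bookkeeping. */
static void sketch_init(sketchDict *d)
{
    assert(d != NULL);
    sketch_reset(&d->ht[0]);
    sketch_reset(&d->ht[1]);
    d->rehashidx = -1;                    /* assumption: no rehash at start */
    d->iterators = 0;
}

/* Plays the role of dictCreate: allocates the dict itself, then inits it. */
static sketchDict *sketch_create(void)
{
    sketchDict *d = malloc(sizeof(*d));
    if (d == NULL) return NULL;
    sketch_init(d);
    return d;
}
```

After sketch_create returns, both bucket arrays are still NULL: only the dictionary header has been allocated.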

Creating hash table 0

A dict structure uses two hash tables, d->ht[0] and d->ht[1]. For convenience, we call them hash table 0 and hash table 1.

As the previous section showed, dictCreate does not allocate memory for the bucket array of either hash table (both d->ht[0].table and d->ht[1].table are set to NULL). So when is the bucket array initialized?

The answer: the bucket array of hash table 0 is initialized when the first element is added to the dictionary through dictAdd.

Adding an element to the dictionary for the first time executes the following call sequence: dictAdd -> dictAddRaw -> _dictKeyIndex -> _dictExpandIfNeeded -> dictExpand

dictAdd is the caller of dictAddRaw, and dictAddRaw is the underlying implementation that does the work of adding an element. dictAddRaw calls _dictKeyIndex to compute the bucket index of the new element's key:

dictEntry *dictAddRaw(dict *d, void *key)
{
    // omitted code ...

    // Compute the index of the key.
    // If the key already exists, _dictKeyIndex returns -1.
    if ((index = _dictKeyIndex(d, key)) == -1)
        return NULL;

    // omitted code ...
}

Before computing the index, _dictKeyIndex calls _dictExpandIfNeeded to check whether the hash tables have room for the new element:
static int _dictKeyIndex(dict *d, const void *key)
{
    // omitted code ...

    /* Expand the hashtable if needed */
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return -1;

    // omitted code ...
}

At _dictExpandIfNeeded, things get interesting. It detects that hash table 0 has not been allocated any space yet, so it calls dictExpand with the DICT_HT_INITIAL_SIZE constant as the initial size of hash table 0 (DICT_HT_INITIAL_SIZE = 4 in the current version) to allocate space for it:
static int _dictExpandIfNeeded(dict *d)
{
    // omitted code ...

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    // omitted code ...
}

dictExpand creates a new hash table together with its bucket array, then decides whether to assign the new hash table to hash table 0 or to hash table 1.
Here, because hash table 0 is still empty (its size is 0 and its table is NULL), the first branch of the if statement runs and the new hash table is assigned to hash table 0:
int dictExpand(dict *d, unsigned long size)
{
    // omitted code ...

    // Compute the (real) size of the hash table
    unsigned long realsize = _dictNextPower(size);

    // Create a new hash table
    dictht n;
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));  // Allocate the bucket array
    n.used = 0;

    // Has hash table 0 of the dictionary been initialized?
    // If not, the new hash table becomes hash table 0 of the dictionary
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
    } else {
        // Otherwise, the new hash table becomes hash table 1 of the dictionary
        // and is used for rehash
        d->ht[1] = n;
        d->rehashidx = 0;
    }

    // omitted code ...
}
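
dictExpand relies on _dictNextPower, whose body is not shown in this article. The sketch below shows one plausible way to compute such a "real" size, assuming it means the smallest power of two that is at least the requested size and not below the initial size of 4. The names sketch_next_power and SKETCH_INITIAL_SIZE are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_INITIAL_SIZE 4   /* stands in for DICT_HT_INITIAL_SIZE */

/* Returns the smallest power of two that is >= size and
 * >= SKETCH_INITIAL_SIZE. The precondition keeps the doubling
 * below from overflowing unsigned long. */
static unsigned long sketch_next_power(unsigned long size)
{
    assert(size <= (ULONG_MAX >> 1) + 1);
    unsigned long i = SKETCH_INITIAL_SIZE;
    while (i < size) i *= 2;
    return i;
}
```

Keeping the size a power of two is what allows sizemask = realsize-1 to work as a bit mask: a requested size of 5 becomes 8, and the mask becomes 7 (binary 111).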

Dictionary expansion and creation of hash table 1

Once hash table 0 has been created, we have a dictionary instance that supports the usual operations (add, delete, find, and so on).

But this raises another question: the initially created hash table 0 is very small (DICT_HT_INITIAL_SIZE = 4 in the current version), so it soon fills up with elements, at which point the rehash operation is triggered.
