Nginx source code analysis (1): Use of hash

Last Update:2018-12-03 Source: Internet

Author: User


The nginx source code provides an important hash structure that offers efficient key-value lookup. Its implementation is relatively simple yet very efficient. The hash structure is read-only: once created, it can only be queried.

The hash structure is hard to understand at first and awkward to use: initialization and lookup involve several structs and several functions. This article does not cover wildcard matching.

Let's first look at how to use it.

The process of creating a hash struct is as follows:

1. Build an array of ngx_hash_key_t elements, initializing each element with the key, the value, and the precomputed hash value of the key.

2. Set up a variable of type ngx_hash_init_t. Its hash member points to the ngx_hash_t structure, and the remaining members hold the initial settings, such as the hash function, the bucket size, and the memory pool. The hash structure itself is created and filled in by ngx_hash_init.

3. Call ngx_hash_init, passing the ngx_hash_init_t structure, the ngx_hash_key_t array, and the length of the array. Afterwards, the hash member of the ngx_hash_init_t structure points to the finished hash table.

Let's look at the search:

1. Calculate the hash value of the key.

2. Call ngx_hash_find, passing both the hash value and the key. It returns a pointer to the stored value, or NULL if the key is not found.

It seems relatively simple. The sample code is as follows:

    #include <stdio.h>
    #include "ngx_config.h"
    #include "ngx_conf_file.h"
    #include "nginx.h"
    #include "ngx_core.h"
    #include "ngx_string.h"
    #include "ngx_palloc.h"
    #include "ngx_array.h"
    #include "ngx_hash.h"

    /* Stubs so that the sample links without the rest of nginx. */
    volatile ngx_cycle_t  *ngx_cycle;

    void
    ngx_log_error_core(ngx_uint_t level, ngx_log_t *log, ngx_err_t err,
        const char *fmt, ...)
    {
    }

    static ngx_str_t names[] = {
        ngx_string("rainx"),
        ngx_string("xiaozhe"),
        ngx_string("zhoujian")
    };

    static char *descs[] = {
        "rainx's ID is 1",
        "xiaozhe's ID is 2",
        "zhoujian's ID is 3"
    };

    /* Basic hash table operations */
    int
    main(void)
    {
        ngx_uint_t        k;
        ngx_pool_t       *pool;
        ngx_hash_init_t   hash_init;
        ngx_hash_t       *hash;
        ngx_array_t      *elements;
        ngx_hash_key_t   *arr_node;
        char             *find;
        int               i;

        ngx_cacheline_size = 32;

        /* hash key calculation: start */
        ngx_str_t str = ngx_string("Hello, world");
        k = ngx_hash_key_lc(str.data, str.len);
        pool = ngx_create_pool(1024 * 10, NULL);
        printf("calculated key is %lu\n", (unsigned long) k);
        /* hash key calculation: end */

        /* allocate the table itself: sizeof(ngx_hash_t), not sizeof(hash) */
        hash = (ngx_hash_t *) ngx_pcalloc(pool, sizeof(ngx_hash_t));

        hash_init.hash        = hash;              /* hash structure */
        hash_init.key         = &ngx_hash_key_lc;  /* hash function */
        hash_init.max_size    = 1024 * 10;         /* max_size */
        hash_init.bucket_size = 64;                /* ngx_align(64, ngx_cacheline_size) */
        hash_init.name        = "yahoo_guy_hash";  /* name used in log messages */
        hash_init.pool        = pool;              /* memory pool */
        hash_init.temp_pool   = NULL;

        /* create the array of keys */
        elements = ngx_array_create(pool, 32, sizeof(ngx_hash_key_t));

        for (i = 0; i < 3; i++) {
            arr_node = (ngx_hash_key_t *) ngx_array_push(elements);
            arr_node->key = names[i];
            arr_node->key_hash = ngx_hash_key_lc(arr_node->key.data,
                                                 arr_node->key.len);
            arr_node->value = (void *) descs[i];
        }

        if (ngx_hash_init(&hash_init, (ngx_hash_key_t *) elements->elts,
                          elements->nelts)
            != NGX_OK)
        {
            return 1;
        }

        /* search */
        k = ngx_hash_key_lc(names[0].data, names[0].len);
        printf("%s key is %lu\n", (char *) names[0].data, (unsigned long) k);

        find = (char *) ngx_hash_find(hash, k, (u_char *) names[0].data,
                                      names[0].len);
        if (find) {
            printf("get desc of rainx: %s\n", find);
        }

        ngx_array_destroy(elements);
        ngx_destroy_pool(pool);

        return 0;
    }
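The sample relies on ngx_hash_key_lc to turn each key into a hash value. As a rough illustration of what that step computes, here is a standalone sketch that does not depend on the nginx headers. The names sketch_tolower and sketch_hash_key_lc are hypothetical, and the multiply-by-31 recurrence over lower-cased bytes is my understanding of the nginx function rather than a copy of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nginx's lower-casing helper: ASCII only. */
static unsigned char
sketch_tolower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char) (c | 0x20) : c;
}

/*
 * Sketch of a case-insensitive string hash: fold every lower-cased byte
 * into the running key with key = key * 31 + byte.
 */
static uintptr_t
sketch_hash_key_lc(const unsigned char *data, size_t len)
{
    uintptr_t  key = 0;
    size_t     i;

    for (i = 0; i < len; i++) {
        key = key * 31 + sketch_tolower(data[i]);
    }

    return key;
}
```

Because the bytes are lower-cased before hashing, "AB" and "ab" produce the same value, which matches the lower-cased key that is stored in the bucket.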

Next we will analyze the source code and introduce several structures:

    /* One element stored in a bucket */
    typedef struct {
        void             *value;    /* the stored value */
        u_char            len;      /* length of the key */
        u_char            name[1];  /* the key, in lower case */
    } ngx_hash_elt_t;

    /* The hash table */
    typedef struct {
        ngx_hash_elt_t  **buckets;  /* points to the actual bucket space */
        ngx_uint_t        size;     /* number of buckets */
    } ngx_hash_t;

    /* Wildcard hash */
    typedef struct {
        ngx_hash_t        hash;     /* embeds the plain hash table */
        void             *value;
    } ngx_hash_wildcard_t;

    /* Key-value pair passed to ngx_hash_init, including the hash value */
    typedef struct {
        ngx_str_t         key;
        ngx_uint_t        key_hash;
        void             *value;
    } ngx_hash_key_t;

    /* Hash function pointer */
    typedef ngx_uint_t (*ngx_hash_key_pt) (u_char *data, size_t len);

    /* Initialization parameters for a hash table */
    typedef struct {
        ngx_hash_t       *hash;         /* points to the actual hash table */
        ngx_hash_key_pt   key;          /* hash function */

        ngx_uint_t        max_size;     /* upper bound on the number of buckets */
        ngx_uint_t        bucket_size;  /* size of each bucket, in bytes */

        char             *name;         /* name used in log messages */
        ngx_pool_t       *pool;         /* memory pool */
        ngx_pool_t       *temp_pool;
    } ngx_hash_init_t;
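Each bucket is a packed run of ngx_hash_elt_t records followed by an end marker, so the space a key needs depends on its length. The sketch below works through that arithmetic under the layout shown above (a value pointer, a one-byte length, then the key bytes, padded to pointer size). The helper names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Round n up to a multiple of a; a must be a power of two. */
static size_t
sketch_align(size_t n, size_t a)
{
    return (n + (a - 1)) & ~(a - 1);
}

/*
 * Bytes one element occupies in a bucket: the value pointer, plus the
 * length byte and the key bytes padded to a multiple of the pointer size.
 */
static size_t
sketch_elt_size(size_t key_len)
{
    return sizeof(void *) + sketch_align(key_len + 1, sizeof(void *));
}

/*
 * Bytes a bucket holding `count` keys needs, including one extra pointer
 * slot for the NULL value that marks the end of the bucket.
 */
static size_t
sketch_bucket_bytes(const size_t *key_lens, size_t count)
{
    size_t  total = sizeof(void *);
    size_t  i;

    for (i = 0; i < count; i++) {
        total += sketch_elt_size(key_lens[i]);
    }

    return total;
}
```

On a machine with 8-byte pointers, a 5-byte key such as "rainx" would take 8 + 8 = 16 bytes, so a 64-byte bucket has room for at most three such keys plus the end marker.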

With these structs in mind, the storage layout of the hash in memory is as shown in the figure:

Next, let's look at the complex ngx_hash_init function:

    /*
     * Initialize a hash table. The first parameter holds the initialization
     * settings, the second is the array of key-value pairs to hash, and the
     * third is the number of elements in that array.
     */

    /* Size of one element in a bucket, aligned to the pointer size */
    #define NGX_HASH_ELT_SIZE(name)                                           \
        (sizeof(void *) + ngx_align((name)->key.len + 1, sizeof(void *)))

    ngx_int_t
    ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names,
        ngx_uint_t nelts)
    {
        u_char          *elts;
        size_t           len;
        u_short         *test;
        ngx_uint_t       i, n, key, size, start, bucket_size;
        ngx_hash_elt_t  *elt, **buckets;

        for (n = 0; n < nelts; n++) {

            /* the length is stored in one byte, so keys must stay below 255 */
            if (names[n].key.len >= 255) {
                ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                              "the \"%V\" value to hash is to long: %uz bytes, "
                              "the maximum length can be 255 bytes only",
                              &names[n].key, names[n].key.len);
                return NGX_ERROR;
            }

            /* every element, plus the end marker, must fit into one bucket */
            if (hinit->bucket_size
                < NGX_HASH_ELT_SIZE(&names[n]) + sizeof(void *))
            {
                ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                              "could not build the %s, you should "
                              "increase %s_bucket_size: %i",
                              hinit->name, hinit->name, hinit->bucket_size);
                return NGX_ERROR;
            }
        }

        /* scratch array that records the bytes used in each bucket */
        test = ngx_alloc(hinit->max_size * sizeof(u_short), hinit->pool->log);
        if (test == NULL) {
            return NGX_ERROR;
        }

        /*
         * Usable size of a bucket. One pointer is reserved for the NULL value
         * that terminates each bucket.
         */
        bucket_size = hinit->bucket_size - sizeof(void *);

        /*
         * Lower bound for the number of buckets: an element takes at least
         * 2 * sizeof(void *) bytes, so one bucket holds at most
         * bucket_size / (2 * sizeof(void *)) elements.
         */
        start = nelts / (bucket_size / (2 * sizeof(void *)));
        start = start ? start : 1;

        /* for a large, densely filled table, start close to max_size */
        if (hinit->max_size > 10000 && hinit->max_size / nelts < 100) {
            start = hinit->max_size - 1000;
        }

        /* try each candidate number of buckets */
        for (size = start; size < hinit->max_size; size++) {

            ngx_memzero(test, size * sizeof(u_short));

            /* process each key */
            for (n = 0; n < nelts; n++) {

                /* skip empty keys */
                if (names[n].key.data == NULL) {
                    continue;
                }

                /* the bucket this key falls into */
                key = names[n].key_hash % size;

                /* add the element size to the bytes used in that bucket */
                test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));

    #if 0
                ngx_log_error(NGX_LOG_ALERT, hinit->pool->log, 0,
                              "%ui: %ui %ui \"%V\"",
                              size, key, test[key], &names[n].key);
    #endif

                /* the bucket overflows: try the next number of buckets */
                if (test[key] > (u_short) bucket_size) {
                    goto next;
                }
            }

            /* every key fits with the current number of buckets */
            goto found;

        next:

            continue;
        }

        /* even the maximum number of buckets cannot hold all the keys */
        ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                      "could not build the %s, you should increase "
                      "either %s_max_size: %i or %s_bucket_size: %i",
                      hinit->name, hinit->name, hinit->max_size,
                      hinit->name, hinit->bucket_size);

        ngx_free(test);

        return NGX_ERROR;

    found:

        /* every bucket starts with room for its end marker */
        for (i = 0; i < size; i++) {
            test[i] = sizeof(void *);
        }

        /* compute the bytes actually used in each bucket */
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            key = names[n].key_hash % size;
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        len = 0;

        /* round each non-empty bucket up to the cache line and sum the sizes */
        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {
                continue;
            }

            test[i] = (u_short) (ngx_align(test[i], ngx_cacheline_size));

            len += test[i];
        }

        if (hinit->hash == NULL) {

            /*
             * The caller supplied no hash table, so allocate one. The block is
             * sized for ngx_hash_wildcard_t, which embeds ngx_hash_t, and the
             * array of bucket pointers is placed right behind it in the same
             * allocation. This differs from the usual object-oriented habit
             * of allocating each object separately, but it is efficient.
             */
            hinit->hash = ngx_pcalloc(hinit->pool, sizeof(ngx_hash_wildcard_t)
                                                 + size * sizeof(ngx_hash_elt_t *));
            if (hinit->hash == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }

            buckets = (ngx_hash_elt_t **)
                          ((u_char *) hinit->hash + sizeof(ngx_hash_wildcard_t));

        } else {
            buckets = ngx_pcalloc(hinit->pool, size * sizeof(ngx_hash_elt_t *));
            if (buckets == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }
        }

        /* allocate the element space, aligned to the cache line */
        elts = ngx_palloc(hinit->pool, len + ngx_cacheline_size);
        if (elts == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }

        elts = ngx_align_ptr(elts, ngx_cacheline_size);

        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {
                continue;
            }

            /* point each non-empty bucket at its space */
            buckets[i] = (ngx_hash_elt_t *) elts;
            elts += test[i];
        }

        /* reset the counters; they now track the write offset per bucket */
        for (i = 0; i < size; i++) {
            test[i] = 0;
        }

        /* copy each key-value pair to its place in the bucket */
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            /* the bucket for this key and the address of its next free slot */
            key = names[n].key_hash % size;
            elt = (ngx_hash_elt_t *) ((u_char *) buckets[key] + test[key]);

            /* set the value and the key length of the element */
            elt->value = names[n].value;
            elt->len = (u_char) names[n].key.len;

            /* store the key in lower case */
            ngx_strlow(elt->name, names[n].key.data, names[n].key.len);

            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        /* terminate each bucket with an element whose value is NULL */
        for (i = 0; i < size; i++) {
            if (buckets[i] == NULL) {
                continue;
            }

            elt = (ngx_hash_elt_t *) ((u_char *) buckets[i] + test[i]);

            elt->value = NULL;
        }

        ngx_free(test);

        hinit->hash->buckets = buckets;
        hinit->hash->size = size;

        return NGX_OK;
    }
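The heart of ngx_hash_init is the search for the smallest number of buckets at which no bucket overflows. The following standalone sketch isolates that loop so it can be tried without nginx. It is a simplified model under stated assumptions, not the nginx code: the function name sketch_find_size and its parameters are hypothetical, the counters are size_t rather than u_short, and the caller supplies the starting size instead of the heuristics used above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to a multiple of a; a must be a power of two. */
static size_t
sketch_align(size_t n, size_t a)
{
    return (n + (a - 1)) & ~(a - 1);
}

/* Bytes one element occupies: value pointer + length byte + key, padded. */
static size_t
sketch_elt_size(size_t key_len)
{
    return sizeof(void *) + sketch_align(key_len + 1, sizeof(void *));
}

/*
 * Return the smallest bucket count in [start, max_size) at which every
 * bucket stays within bucket_size, or 0 if no such count exists.
 */
static size_t
sketch_find_size(const size_t *hashes, const size_t *key_lens, size_t nelts,
    size_t bucket_size, size_t start, size_t max_size)
{
    size_t   usable, size, n, found = 0;
    size_t  *fill;

    if (bucket_size <= sizeof(void *) || max_size == 0) {
        return 0;
    }

    if (start == 0) {
        start = 1;
    }

    /* one pointer per bucket is reserved for the end marker */
    usable = bucket_size - sizeof(void *);

    fill = malloc(max_size * sizeof(size_t));
    if (fill == NULL) {
        return 0;
    }

    for (size = start; size < max_size && found == 0; size++) {
        int  fits = 1;

        memset(fill, 0, size * sizeof(size_t));

        for (n = 0; n < nelts; n++) {
            size_t  b = hashes[n] % size;

            fill[b] += sketch_elt_size(key_lens[n]);

            if (fill[b] > usable) {
                fits = 0;       /* overflow: try a larger bucket count */
                break;
            }
        }

        if (fits) {
            found = size;
        }
    }

    free(fill);

    return found;
}
```

For example, four empty-length keys with hash values 0, 1, 2, and 3 need four buckets when a bucket has room for only one element, but only two buckets when each bucket can hold two. This is why raising bucket_size lets the table use fewer buckets.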

ngx_hash_find performs the lookup:

    /*
     * Search. The first parameter is the hash table, the second is the hash
     * value produced by the hash function, the third is the key to look up,
     * and the fourth is the length of the key.
     */
    void *
    ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len)
    {
        ngx_uint_t       i;
        ngx_hash_elt_t  *elt;

    #if 0
        ngx_log_error(NGX_LOG_ALERT, ngx_cycle->log, 0, "hf:\"%*s\"", len, name);
    #endif

        /* the bucket in which this key may be stored */
        elt = hash->buckets[key % hash->size];

        /* an empty bucket means the key is not present */
        if (elt == NULL) {
            return NULL;
        }

        /* walk through the elements of the bucket */
        while (elt->value) {

            /* a different length cannot match: go to the next element */
            if (len != (size_t) elt->len) {
                goto next;
            }

            /* then compare the key byte by byte */
            for (i = 0; i < len; i++) {
                if (name[i] != elt->name[i]) {
                    goto next;
                }
            }

            return elt->value;

        next:

            elt = (ngx_hash_elt_t *) ngx_align_ptr(&elt->name[0] + elt->len,
                                                   sizeof(void *));
            continue;
        }

        return NULL;
    }
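To make the bucket walk concrete, the sketch below builds a single bucket by hand and searches it in the same way: compare the length, compare the bytes, then step to the next pointer-aligned record until a NULL value is reached. It is only an illustration under assumptions. All names (sk_put, sk_end, sk_find, and so on) are hypothetical, and the records are handled as raw bytes with memcpy instead of through ngx_hash_elt_t, which sidesteps alignment concerns in a standalone program.

```c
#include <stddef.h>
#include <string.h>

#define SK_PTR  (sizeof(void *))

/* Assumed record layout: [value pointer][1 length byte][key bytes][padding] */

/* Read the value pointer stored at the start of a record. */
static void *
sk_value(const unsigned char *elt)
{
    void  *v;

    memcpy(&v, elt, SK_PTR);
    return v;
}

/* Step to the next record: skip this one, rounded up to pointer size. */
static unsigned char *
sk_next(unsigned char *elt)
{
    size_t  used = SK_PTR + 1 + elt[SK_PTR];

    return elt + ((used + (SK_PTR - 1)) & ~(SK_PTR - 1));
}

/* Write one record at elt (key shorter than 255 bytes); return the next slot. */
static unsigned char *
sk_put(unsigned char *elt, const char *name, void *value)
{
    size_t  len = strlen(name);

    memcpy(elt, &value, SK_PTR);
    elt[SK_PTR] = (unsigned char) len;
    memcpy(elt + SK_PTR + 1, name, len);

    return sk_next(elt);
}

/* Terminate the bucket with a NULL value pointer. */
static void
sk_end(unsigned char *elt)
{
    void  *null_value = NULL;

    memcpy(elt, &null_value, SK_PTR);
}

/* Walk the bucket until the NULL terminator, comparing length then bytes. */
static void *
sk_find(unsigned char *elt, const char *name, size_t len)
{
    void  *value;

    while ((value = sk_value(elt)) != NULL) {
        if (len == elt[SK_PTR] && memcmp(name, elt + SK_PTR + 1, len) == 0) {
            return value;
        }

        elt = sk_next(elt);
    }

    return NULL;
}
```

Checking the length first keeps most mismatches to a single byte comparison, and because the records in a bucket are contiguous, a lookup usually touches only one or two cache lines.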

That is all for this introduction. Finally, many thanks to http://code.google.com/p/nginxsrp/wiki/nginxcodereview for its help and text!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
