Nginx source code analysis (1): Use of hash

Source: Internet
Author: User

The nginx source code provides an important hash structure that gives us efficient key-value lookup. Its implementation is relatively simple, yet very efficient. The hash is read-only: once created, it can only be queried.

The hash struct is hard to understand at first and awkward to use: it takes several structs and several functions to initialize and query it. This article does not cover wildcard matching.

 

Let's first look at how to use it.

The process of creating a hash struct is as follows:

1. Build an array of ngx_hash_key_t elements, initializing each element with the key, the value, and the precomputed hash of the key.

2. Declare a variable of type ngx_hash_init_t. Its hash member points to an ngx_hash_t, which is the actual hash struct; the other members hold initialization settings such as the bucket size and the memory pool. The hash itself is built and filled in by ngx_hash_init.

3. Call ngx_hash_init, passing the ngx_hash_init_t variable, the ngx_hash_key_t array, and the length of the array. Afterwards, the hash member of the ngx_hash_init_t variable points to the hash structure we want.

Lookup works as follows:

1. Calculate the hash value of the key.

2. Call ngx_hash_find, passing both the hash value and the key. It returns a pointer to the stored value, or NULL if the key is not found.
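To make step 1 concrete, here is a minimal standalone sketch of a case-insensitive hash in the style of ngx_hash_key_lc, which, as far as I know, lowercases each byte and folds it in with a multiply-by-31 step. The name sketch_hash_key_lc and the use of uintptr_t in place of ngx_uint_t are assumptions made only so the snippet compiles without the nginx headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: hash = hash * 31 + lowercase(byte), for every byte of the key.
 * sketch_hash_key_lc is a hypothetical name, not part of nginx. */
static uintptr_t
sketch_hash_key_lc(const unsigned char *data, size_t len)
{
    uintptr_t  key = 0;
    size_t     i;

    for (i = 0; i < len; i++) {
        unsigned char c = data[i];

        if (c >= 'A' && c <= 'Z') {
            c |= 0x20;              /* ASCII upper case -> lower case */
        }

        key = key * 31 + c;
    }

    return key;
}
```

Because the bytes are lowercased before hashing, "RainX" and "rainx" produce the same hash value, which matches the fact that the table stores its keys in lower case.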

 

It seems relatively simple. The sample code is as follows:

    #include <stdio.h>
    #include "ngx_config.h"
    #include "ngx_conf_file.h"
    #include "nginx.h"
    #include "ngx_core.h"
    #include "ngx_string.h"
    #include "ngx_palloc.h"
    #include "ngx_array.h"
    #include "ngx_hash.h"

    volatile ngx_cycle_t *ngx_cycle;

    // empty stub so the example links without the rest of nginx
    void ngx_log_error_core(ngx_uint_t level, ngx_log_t *log, ngx_err_t err,
        const char *fmt, ...)
    {
    }

    static ngx_str_t names[] = { ngx_string("rainx"),
                                 ngx_string("xiaozhe"),
                                 ngx_string("zhoujian") };

    static char *descs[] = { "rainx's ID is 1", "xiaozhe's ID is 2",
                             "zhoujian's ID is 3" };

    // basic hash table operations
    int main()
    {
        ngx_uint_t        k; //, p, h;
        ngx_pool_t       *pool;
        ngx_hash_init_t   hash_init;
        ngx_hash_t       *hash;
        ngx_array_t      *elements;
        ngx_hash_key_t   *arr_node;
        char             *find;
        int               i;

        ngx_cacheline_size = 32;

        // hash key calculation start
        ngx_str_t str = ngx_string("Hello, world");
        k = ngx_hash_key_lc(str.data, str.len);
        pool = ngx_create_pool(1024 * 10, NULL);
        printf("calculated key is %lu\n", (unsigned long) k);
        // hash key calculation end

        // allocate the hash struct itself (not just a pointer to it)
        hash = (ngx_hash_t *) ngx_pcalloc(pool, sizeof(ngx_hash_t));
        hash_init.hash = hash;                 // hash structure
        hash_init.key = &ngx_hash_key_lc;      // hash function
        hash_init.max_size = 1024 * 10;        // max_size
        hash_init.bucket_size = 64;            // ngx_align(64, ngx_cacheline_size);
        hash_init.name = "yahoo_guy_hash";     // name used in log messages
        hash_init.pool = pool;                 // memory pool
        hash_init.temp_pool = NULL;

        // create the array of keys
        elements = ngx_array_create(pool, 32, sizeof(ngx_hash_key_t));
        for (i = 0; i < 3; i++) {
            arr_node = (ngx_hash_key_t *) ngx_array_push(elements);
            arr_node->key = names[i];
            arr_node->key_hash = ngx_hash_key_lc(arr_node->key.data,
                                                 arr_node->key.len);
            arr_node->value = (void *) descs[i];

            printf("key: %s, key_hash: %lu\n", arr_node->key.data,
                   (unsigned long) arr_node->key_hash);
        }

        if (ngx_hash_init(&hash_init, (ngx_hash_key_t *) elements->elts,
                          elements->nelts) != NGX_OK)
        {
            return 1;
        }

        // search
        k = ngx_hash_key_lc(names[0].data, names[0].len);
        printf("%s key is %lu\n", names[0].data, (unsigned long) k);
        find = (char *)
            ngx_hash_find(hash, k, (u_char *) names[0].data, names[0].len);

        if (find) {
            printf("get desc of rainx: %s\n", (char *) find);
        }

        ngx_array_destroy(elements);
        ngx_destroy_pool(pool);

        return 0;
    }

Next, we analyze the source code, starting with the structs involved:

 

 

    // Each element in a bucket
    typedef struct {
        void             *value;    // the stored value, i.e. the "value" of the key-value pair
        u_char            len;      // length of the name
        u_char            name[1];  // the key, stored in lower case
    } ngx_hash_elt_t;

    // hash struct
    typedef struct {
        ngx_hash_elt_t  **buckets;  // points to the actual bucket space
        ngx_uint_t        size;     // number of buckets
    } ngx_hash_t;

    /* wildcard */
    typedef struct {
        ngx_hash_t        hash;     // the plain hash struct is embedded here
        void             *value;
    } ngx_hash_wildcard_t;

    /* A key-value pair to be placed in the hash, including its hash value */
    typedef struct {
        ngx_str_t         key;
        ngx_uint_t        key_hash;
        void             *value;
    } ngx_hash_key_t;

    /* hash function pointer */
    typedef ngx_uint_t (*ngx_hash_key_pt) (u_char *data, size_t len);

    // Holds the hash initialization information
    typedef struct {
        ngx_hash_t       *hash;         // points to the actual hash struct
        ngx_hash_key_pt   key;          // hash function

        ngx_uint_t        max_size;     // upper bound on the number of buckets
        ngx_uint_t        bucket_size;  // size of each bucket in bytes

        char             *name;         // name used in log messages
        ngx_pool_t       *pool;         // memory pool
        ngx_pool_t       *temp_pool;
    } ngx_hash_init_t;

Now that the structs are familiar, the figure shows how the hash is laid out in memory:
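As a rough companion to the layout, the sketch below computes how many bytes one element takes inside a bucket. It follows the element-size macro that ngx_hash_init uses: a value pointer, then the one-byte length plus the key bytes, rounded up to pointer alignment. The names SKETCH_ALIGN, sketch_elt_size, and sketch_bucket_bytes are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Round d up to a multiple of a (a must be a power of two).
 * Hypothetical stand-in for the alignment macro used by nginx. */
#define SKETCH_ALIGN(d, a)  (((d) + ((a) - 1)) & ~((size_t) (a) - 1))

/* Bytes one element occupies in a bucket: the value pointer, then the
 * length byte and the key bytes, padded to pointer alignment. */
static size_t
sketch_elt_size(size_t key_len)
{
    return sizeof(void *) + SKETCH_ALIGN(key_len + 1, sizeof(void *));
}

/* Bytes a bucket holding n keys of the same length occupies, including
 * the trailing NULL value pointer that terminates the bucket. */
static size_t
sketch_bucket_bytes(size_t n, size_t key_len)
{
    return n * sketch_elt_size(key_len) + sizeof(void *);
}
```

Assuming 8-byte pointers, a 5-byte key such as "rainx" takes 8 + 8 = 16 bytes, so a 64-byte bucket has room for three such elements plus the 8-byte terminator.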

 

Next, let's look at the complex ngx_hash_init function:

    // Initialize a hash struct. The first parameter holds the settings for the hash,
    // the second is the array of key-value pairs to hash,
    // the third is the number of elements in that array.

    // size of one element after alignment
    #define NGX_HASH_ELT_SIZE(name)                                               \
        (sizeof(void *) + ngx_align((name)->key.len + 1, sizeof(void *)))

    // initialize a hash
    ngx_int_t
    ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names, ngx_uint_t nelts)
    {
        u_char          *elts;
        size_t           len;
        u_short         *test;
        ngx_uint_t       i, n, key, size, start, bucket_size;
        ngx_hash_elt_t  *elt, **buckets;

        for (n = 0; n < nelts; n++) {
            // the key length must be less than 255
            if (names[n].key.len >= 255) {
                ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                              "the \"%V\" value to hash is to long: %uz bytes, "
                              "the maximum length can be 255 bytes only",
                              &names[n].key, names[n].key.len);
                return NGX_ERROR;
            }

            // every single element, plus the bucket terminator, must fit in one bucket
            if (hinit->bucket_size < NGX_HASH_ELT_SIZE(&names[n]) + sizeof(void *))
            {
                ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                              "could not build the %s, you should "
                              "increase %s_bucket_size: %i",
                              hinit->name, hinit->name, hinit->bucket_size);
                return NGX_ERROR;
            }
        }

        // scratch array that records the size of each bucket
        test = ngx_alloc(hinit->max_size * sizeof(u_short), hinit->pool->log);
        if (test == NULL) {
            return NGX_ERROR;
        }

        // usable size of each bucket: one pointer is reserved for the
        // NULL value that terminates the bucket
        bucket_size = hinit->bucket_size - sizeof(void *);

        // I have not understood this starting-point calculation yet
        start = nelts / (bucket_size / (2 * sizeof(void *)));
        start = start ? start : 1;

        if (hinit->max_size > 10000 && hinit->max_size / nelts < 100) {
            start = hinit->max_size - 1000;
        }

        // search for the actual number of buckets
        for (size = start; size < hinit->max_size; size++) {

            ngx_memzero(test, size * sizeof(u_short));

            // process each key
            for (n = 0; n < nelts; n++) {
                // skip empty keys
                if (names[n].key.data == NULL) {
                    continue;
                }

                // the bucket this key falls into
                key = names[n].key_hash % size;
                // grow the recorded size of that bucket
                test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));

    #if 0
                ngx_log_error(NGX_LOG_ALERT, hinit->pool->log, 0,
                              "%ui: %ui %ui \"%V\"",
                              size, key, test[key], &names[n].key);
    #endif

                // this bucket overflows, so the current bucket count is too
                // small; jump out and retry with one more bucket
                if (test[key] > (u_short) bucket_size) {
                    goto next;
                }
            }

            // the current number of buckets is enough
            goto found;

        next:

            // try the next bucket count
            continue;
        }

        // not found: even the maximum number of buckets is not enough
        ngx_log_error(NGX_LOG_EMERG, hinit->pool->log, 0,
                      "could not build the %s, you should increase "
                      "either %s_max_size: %i or %s_bucket_size: %i",
                      hinit->name, hinit->name, hinit->max_size,
                      hinit->name, hinit->bucket_size);

        ngx_free(test);

        return NGX_ERROR;

    found:

        // every bucket starts with room for its terminating pointer
        for (i = 0; i < size; i++) {
            test[i] = sizeof(void *);
        }

        // compute the size actually used by each bucket
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            key = names[n].key_hash % size;
            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        len = 0;

        // round each non-empty bucket up to a multiple of the cache line
        // and add up the total size of all buckets
        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {
                continue;
            }

            test[i] = (u_short) (ngx_align(test[i], ngx_cacheline_size));

            len += test[i];
        }

        if (hinit->hash == NULL) {
            // This looks odd at first: the allocation is sized by
            // ngx_hash_wildcard_t, not ngx_hash_t. Since ngx_hash_wildcard_t
            // embeds the hash struct, the struct and the array of bucket
            // pointers are allocated together in one block. This is quite
            // different from object-oriented thinking, but it is efficient.
            hinit->hash = ngx_pcalloc(hinit->pool, sizeof(ngx_hash_wildcard_t)
                                                 + size * sizeof(ngx_hash_elt_t *));
            if (hinit->hash == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }

            buckets = (ngx_hash_elt_t **)
                          ((u_char *) hinit->hash + sizeof(ngx_hash_wildcard_t));

        } else {
            buckets = ngx_pcalloc(hinit->pool, size * sizeof(ngx_hash_elt_t *));
            if (buckets == NULL) {
                ngx_free(test);
                return NGX_ERROR;
            }
        }

        // allocate the element space, aligned to the cache line
        elts = ngx_palloc(hinit->pool, len + ngx_cacheline_size);
        if (elts == NULL) {
            ngx_free(test);
            return NGX_ERROR;
        }

        elts = ngx_align_ptr(elts, ngx_cacheline_size);

        for (i = 0; i < size; i++) {
            if (test[i] == sizeof(void *)) {
                continue;
            }

            // point each non-empty bucket at its space
            buckets[i] = (ngx_hash_elt_t *) elts;
            elts += test[i];
        }

        // clear and reuse test as the write offset within each bucket
        for (i = 0; i < size; i++) {
            test[i] = 0;
        }

        // copy each key-value pair to its position in its bucket
        for (n = 0; n < nelts; n++) {
            if (names[n].key.data == NULL) {
                continue;
            }

            // the bucket for this key and the address within it to write to
            key = names[n].key_hash % size;
            elt = (ngx_hash_elt_t *) ((u_char *) buckets[key] + test[key]);

            // set the value and key length of the element
            elt->value = names[n].value;
            elt->len = (u_char) names[n].key.len;

            // store the key in lower case
            ngx_strlow(elt->name, names[n].key.data, names[n].key.len);

            test[key] = (u_short) (test[key] + NGX_HASH_ELT_SIZE(&names[n]));
        }

        // terminate each bucket with an element whose value is NULL
        for (i = 0; i < size; i++) {
            if (buckets[i] == NULL) {
                continue;
            }

            elt = (ngx_hash_elt_t *) ((u_char *) buckets[i] + test[i]);

            elt->value = NULL;
        }

        ngx_free(test);

        hinit->hash->buckets = buckets;
        hinit->hash->size = size;

        return NGX_OK;
    }
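The search for the bucket count is the core of ngx_hash_init, so here is a simplified standalone sketch of that loop. It tries bucket counts from start up to max_size and accepts the first count at which no bucket overflows. All names (sketch_find_bucket_count, SKETCH_MAX_BUCKETS) are hypothetical. Unlike the real function, the sketch takes the starting count and the usable bucket size as plain parameters, uses a fixed scratch array, and does not skip empty keys.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_BUCKETS 64   /* hypothetical cap, only for this sketch */

/* Return the smallest bucket count in [start, max_size) for which no
 * bucket's accumulated element bytes exceed bucket_bytes, or 0 if no
 * count fits. hashes[i] and elt_sizes[i] describe the i-th key. */
static size_t
sketch_find_bucket_count(const size_t *hashes, const size_t *elt_sizes,
                         size_t nelts, size_t start, size_t max_size,
                         size_t bucket_bytes)
{
    size_t  test[SKETCH_MAX_BUCKETS];
    size_t  size, n, key;
    int     fits;

    if (start == 0 || max_size > SKETCH_MAX_BUCKETS) {
        return 0;
    }

    for (size = start; size < max_size; size++) {
        memset(test, 0, size * sizeof(size_t));
        fits = 1;

        for (n = 0; n < nelts; n++) {
            key = hashes[n] % size;
            test[key] += elt_sizes[n];

            if (test[key] > bucket_bytes) {
                fits = 0;       /* overflow: retry with one more bucket */
                break;
            }
        }

        if (fits) {
            return size;
        }
    }

    return 0;
}
```

For example, with hash values 0, 1, 2, 4, elements of 16 bytes each, and room for only one element per bucket, counts 1 through 4 all produce a collision and the first count that works is 5. This brute-force search is affordable because the table is built once and never modified afterwards.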

ngx_hash_find performs the lookup:

    // Search. The first parameter is the hash structure, the second is the hash value
    // produced by the hash function, the third is the key to look up,
    // and the fourth is the key length.
    void *
    ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len)
    {
        ngx_uint_t       i;
        ngx_hash_elt_t  *elt;

    #if 0
        ngx_log_error(NGX_LOG_ALERT, ngx_cycle->log, 0, "hf:\"%*s\"", len, name);
    #endif

        // the bucket in which this element may be stored
        elt = hash->buckets[key % hash->size];

        // the bucket is empty, so return NULL
        if (elt == NULL) {
            return NULL;
        }

        // walk through each element of the bucket
        while (elt->value) {
            // if the length differs, move on to the next element
            if (len != (size_t) elt->len) {
                goto next;
            }

            // then compare the key byte by byte
            for (i = 0; i < len; i++) {
                if (name[i] != elt->name[i]) {
                    goto next;
                }
            }

            return elt->value;

        next:

            elt = (ngx_hash_elt_t *) ngx_align_ptr(&elt->name[0] + elt->len,
                                                   sizeof(void *));
            continue;
        }

        return NULL;
    }
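The way a bucket is walked may be easier to see in isolation. The sketch below packs two variable-length elements into a small buffer, terminates them with a NULL value, and then searches the buffer with the same "skip the key, align to a pointer" step as the code above. Everything here (sketch_elt_t, sketch_put, sketch_bucket_find, sketch_demo, and the sample values "id 1" and "id 2") is hypothetical and independent of the nginx headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    void          *value;    /* NULL marks the end of the bucket */
    unsigned char  len;      /* key length */
    unsigned char  name[1];  /* key bytes continue past the struct */
} sketch_elt_t;

/* Address of the first key byte of an element. */
static unsigned char *
sketch_name(sketch_elt_t *elt)
{
    return (unsigned char *) elt + offsetof(sketch_elt_t, name);
}

/* Step to the next element: skip the key bytes, then round the address
 * up to pointer alignment. */
static sketch_elt_t *
sketch_next(sketch_elt_t *elt)
{
    uintptr_t  p = (uintptr_t) (sketch_name(elt) + elt->len);
    uintptr_t  a = sizeof(void *);

    return (sketch_elt_t *) ((p + (a - 1)) & ~(a - 1));
}

/* Write one key/value pair at elt and return the slot that follows it. */
static sketch_elt_t *
sketch_put(sketch_elt_t *elt, const char *key, void *value)
{
    elt->value = value;
    elt->len = (unsigned char) strlen(key);
    memcpy(sketch_name(elt), key, elt->len);

    return sketch_next(elt);
}

/* Walk one bucket until the NULL value sentinel is reached. */
static void *
sketch_bucket_find(sketch_elt_t *elt, const char *name, size_t len)
{
    while (elt->value != NULL) {
        if (len == elt->len && memcmp(name, sketch_name(elt), len) == 0) {
            return elt->value;
        }

        elt = sketch_next(elt);
    }

    return NULL;
}

/* Build a two-element bucket in pointer-aligned storage and look a key up. */
static void *
sketch_demo(const char *name)
{
    static void  *storage[16];
    static char   one[] = "id 1", two[] = "id 2";
    sketch_elt_t *elt = (sketch_elt_t *) storage;

    elt = sketch_put(elt, "rainx", one);
    elt = sketch_put(elt, "xiaozhe", two);
    elt->value = NULL;      /* bucket terminator */

    return sketch_bucket_find((sketch_elt_t *) storage, name, strlen(name));
}
```

Calling sketch_demo("rainx") returns the pointer stored for that key, while a key that is absent from the bucket returns NULL. Because the elements sit back to back in one contiguous, cache-line-aligned region, a lookup in the real table touches very little memory.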

That is all for this introduction. Finally, thanks to http://code.google.com/p/nginxsrp/wiki/nginxcodereview for its help and text!

 
