PostgreSQL source code analysis: Dynamic Hash

Source: Internet
Author: User

1. Why dynamic hash?

A conventional hash table usually looks like this:

Figure 1 static hash structure

This hash table maintains a set of buckets, shown on the left of the figure. Each bucket holds the data items that hash to it.

These items form a linked list. The main disadvantage of this design is that the number of buckets is fixed and cannot easily be expanded. As more data is inserted, the chains grow longer and lookups degrade toward a linear scan.
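The static layout described above can be sketched in a few lines of C. This is only an illustration and not PostgreSQL code: the names `StaticHash`, `Node`, `NBUCKETS`, `static_hash_insert` and `static_hash_chain_length` are all hypothetical, and the key is used directly as its own hash value to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 8              /* fixed at compile time: the table cannot grow */

typedef struct Node
{
    struct Node *next;          /* next entry in the same bucket */
    uint32_t     key;
} Node;

typedef struct StaticHash
{
    Node *buckets[NBUCKETS];    /* each bucket is the head of a linked list */
} StaticHash;

/* Insert a key at the head of its bucket's chain; returns 0, or -1 if out of memory. */
int
static_hash_insert(StaticHash *h, uint32_t key)
{
    Node *n;

    assert(h != NULL);
    n = malloc(sizeof(Node));
    if (n == NULL)
        return -1;
    n->key = key;
    n->next = h->buckets[key % NBUCKETS];
    h->buckets[key % NBUCKETS] = n;
    return 0;
}

/* Number of entries a lookup for `key` would have to walk in the worst case. */
int
static_hash_chain_length(const StaticHash *h, uint32_t key)
{
    int len = 0;

    assert(h != NULL);
    for (const Node *n = h->buckets[key % NBUCKETS]; n != NULL; n = n->next)
        len++;
    return len;
}
```

Inserting the keys 0 through 79 into a zero-initialized `StaticHash` leaves every chain with 10 entries. Because `NBUCKETS` never changes, the chains keep growing with the data.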

Dynamic hashing solves this problem. PostgreSQL's implementation grows the hash table dynamically so that the fill factor does not exceed a predetermined value. Each expansion changes only a small part of the table, and space utilization stays relatively high.
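As a rough sketch of the idea, the growth decision can be expressed as a comparison between the average chain length and a target fill factor. The struct `TableStats` and both functions below are hypothetical and are not taken from dynahash.c. The sketch only assumes that the table grows one bucket at a time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a growable table; not PostgreSQL's HASHHDR. */
typedef struct TableStats
{
    long nentries;      /* number of stored elements */
    long nbuckets;      /* number of buckets currently in use */
    long ffactor;       /* target fill factor: average elements per bucket */
} TableStats;

/* True when the average chain length has reached the target fill factor. */
bool
needs_expansion(const TableStats *t)
{
    assert(t->nbuckets > 0);
    return t->nentries / t->nbuckets >= t->ffactor;
}

/* Add buckets one at a time until the fill factor is respected again.
 * Returns how many buckets were added. */
long
grow_until_balanced(TableStats *t)
{
    long added = 0;

    while (needs_expansion(t))
    {
        t->nbuckets++;
        added++;
    }
    return added;
}
```

With 16 entries, 16 buckets and a fill factor of 1, a single extra bucket is enough to bring the table back under the threshold, which shows how small each expansion step is.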

2. Dynamic hash structure

The PostgreSQL code related to dynamic hash lives in two files, dynahash.c and hashfn.c. hashfn.c mainly provides hash functions, while dynahash.c contains the main implementation of dynamic hash.

Compared with an ordinary hash table, dynamic hash adds one more management unit: the directory (dir). The structure looks like this:

Figure 2 postgresql dynamic hash structure

dir is a variable-size array. Its initial length can be specified when the table is created, and it doubles each time it is extended. Each entry in dir points to a segment. All segments have the same fixed length, which must be a power of 2. Each element of a segment array is a bucket, and each bucket stores a linked list. Dynamic hash stores all elements whose hash values map to the same bucket in that bucket's list.
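Because every segment has the same power-of-2 length, a bucket number can be split into a directory index and an offset inside the segment with a shift and a mask. The sketch below illustrates that arithmetic. `BucketPos` and `locate_bucket` are hypothetical names, and the segment size of 4 used in the note afterwards is an assumption chosen for the example, not a value read from Figure 2.

```c
#include <assert.h>
#include <stdint.h>

/* Position of a bucket inside the two-level directory structure. */
typedef struct BucketPos
{
    uint32_t segment_num;   /* index into dir */
    uint32_t segment_ndx;   /* index into the segment that dir entry points to */
} BucketPos;

/*
 * Split a bucket number into (directory index, offset in segment).
 * ssize is the number of buckets per segment and must equal 1 << sshift.
 */
BucketPos
locate_bucket(uint32_t bucket, uint32_t ssize, uint32_t sshift)
{
    BucketPos pos;

    assert(ssize == (UINT32_C(1) << sshift));
    pos.segment_num = bucket >> sshift;         /* bucket / ssize */
    pos.segment_ndx = bucket & (ssize - 1);     /* bucket % ssize */
    return pos;
}
```

For example, with 4 buckets per segment (`ssize = 4`, `sshift = 2`), bucket 13 resolves to directory entry 3, slot 1.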

Now let's look at how PostgreSQL defines these basic concepts:

    typedef struct HASHELEMENT
    {
        struct HASHELEMENT *link;   /* link to next entry in same bucket */
        uint32      hashvalue;      /* hash function result for this entry */
    } HASHELEMENT;

 

    /* A hash bucket is a linked list of HASHELEMENTs */
    typedef HASHELEMENT *HASHBUCKET;

    /* A hash segment is an array of bucket headers */
    typedef HASHBUCKET *HASHSEGMENT;

Each of these definitions maps directly onto one part of the structure in Figure 2.
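To make the relationship between the three types concrete, here is a small sketch that pushes elements onto a bucket and walks its chain. The typedefs repeat the ones above, with `uint32_t` standing in for PostgreSQL's `uint32`. The helpers `bucket_push` and `bucket_find` are hypothetical and exist only for this illustration. A real lookup would also compare the stored keys, since different keys can share a hash value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct HASHELEMENT
{
    struct HASHELEMENT *link;   /* link to next entry in same bucket */
    uint32_t    hashvalue;      /* hash function result for this entry */
} HASHELEMENT;

typedef HASHELEMENT *HASHBUCKET;    /* head of a linked list */
typedef HASHBUCKET *HASHSEGMENT;    /* array of bucket heads */

/* Push elem onto the front of the bucket stored at segment[ndx]. */
void
bucket_push(HASHSEGMENT segment, size_t ndx, HASHELEMENT *elem)
{
    assert(segment != NULL && elem != NULL);
    elem->link = segment[ndx];
    segment[ndx] = elem;
}

/* Return the first element in segment[ndx] with this hash value, or NULL. */
HASHELEMENT *
bucket_find(HASHSEGMENT segment, size_t ndx, uint32_t hashvalue)
{
    assert(segment != NULL);
    for (HASHELEMENT *e = segment[ndx]; e != NULL; e = e->link)
    {
        if (e->hashvalue == hashvalue)
            return e;
    }
    return NULL;
}
```

A segment here is just an array of `HASHBUCKET` values, so a local array such as `HASHBUCKET seg[4]` can be passed wherever a `HASHSEGMENT` is expected.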

3. How to Find the Bucket corresponding to the given hash value

Let's take a look at the implementation:

    /* Convert a hash value to a bucket number */
    static inline uint32
    calc_bucket(HASHHDR *hctl, uint32 hash_val)
    {
        uint32      bucket;

        /* First try the wider mask */
        bucket = hash_val & hctl->high_mask;

        /* That bucket does not exist yet, so fall back to the narrower mask */
        if (bucket > hctl->max_bucket)
            bucket = bucket & hctl->low_mask;

        return bucket;
    }

hctl->max_bucket is the total number of buckets minus 1. For Figure 2, this value is 15.

hctl->low_mask is 2^K - 1, where 2^K is the largest power of 2 that is <= (hctl->max_bucket + 1) when the table is created. For Figure 2, this value is 16 - 1 = 15 (binary 0000 1111).

hctl->high_mask is 2^(K + 1) - 1. For Figure 2, this value is 32 - 1 = 31 (binary 0001 1111).
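A short worked example may help here. The sketch below copies the logic of calc_bucket onto a hypothetical struct, `MiniHdr`, that holds only the three fields discussed above, and then checks a few hash values against the Figure 2 settings (max_bucket = 15, low_mask = 15, high_mask = 31). The names `MiniHdr`, `mini_calc_bucket` and `figure2_walkthrough` are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the fields calc_bucket reads. */
typedef struct MiniHdr
{
    uint32_t max_bucket;    /* highest bucket number in use */
    uint32_t low_mask;      /* mask for the lower half of the table */
    uint32_t high_mask;     /* mask for the whole (doubled) table */
} MiniHdr;

/* Same logic as calc_bucket, applied to the stand-in struct. */
uint32_t
mini_calc_bucket(const MiniHdr *hctl, uint32_t hash_val)
{
    uint32_t bucket = hash_val & hctl->high_mask;

    if (bucket > hctl->max_bucket)
        bucket &= hctl->low_mask;
    return bucket;
}

/* Check a few hash values against the Figure 2 settings. */
void
figure2_walkthrough(void)
{
    MiniHdr hctl = {15, 15, 31};

    /* 9 & 31 = 9, which is an existing bucket */
    assert(mini_calc_bucket(&hctl, 9) == 9);
    /* 21 & 31 = 21 > 15, so fall back: 21 & 15 = 5 */
    assert(mini_calc_bucket(&hctl, 21) == 5);
    /* 37 & 31 = 5, which is an existing bucket */
    assert(mini_calc_bucket(&hctl, 37) == 5);
}
```

Hash values 21 and 37 both land in bucket 5 for now. If the table later grows to include bucket 21, the value 21 is routed there instead while 37 stays in bucket 5, which is why only one bucket has to be split per expansion.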

Note that hctl->max_bucket changes after the hash table is created: it normally grows by one each time the table is expanded. When the new bucket number exceeds hctl->high_mask, that is, when the bucket count crosses a power of 2, hctl->low_mask and hctl->high_mask must be updated. The update code is as follows:

    /*
     * If we crossed a power of 2, readjust masks.
     */
    if ((uint32) new_bucket > hctl->high_mask)
    {
        hctl->low_mask = hctl->high_mask;
        hctl->high_mask = (uint32) new_bucket | hctl->low_mask;
    }
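To see how the masks evolve, the following sketch grows a table one bucket at a time, starting from the 16 buckets of Figure 2. `MaskState`, `mask_init` and `mask_add_bucket` are hypothetical names. The update step mirrors the snippet above, and the sketch assumes the initial bucket count is a power of 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three values that change during growth. */
typedef struct MaskState
{
    uint32_t max_bucket;
    uint32_t low_mask;
    uint32_t high_mask;
} MaskState;

/* Set up the masks for a table that starts with nbuckets buckets. */
void
mask_init(MaskState *s, uint32_t nbuckets)
{
    /* nbuckets must be a nonzero power of 2 */
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    s->max_bucket = nbuckets - 1;
    s->low_mask = nbuckets - 1;
    s->high_mask = (nbuckets << 1) - 1;
}

/* Add one bucket, readjusting the masks if needed; returns the new bucket number. */
uint32_t
mask_add_bucket(MaskState *s)
{
    uint32_t new_bucket = s->max_bucket + 1;

    s->max_bucket = new_bucket;
    if (new_bucket > s->high_mask)
    {
        /* Crossed a power of 2: the old high mask becomes the low mask. */
        s->low_mask = s->high_mask;
        s->high_mask = new_bucket | s->low_mask;
    }
    return new_bucket;
}
```

Starting from 16 buckets, the masks stay at 15 and 31 while buckets 16 through 31 are added. Adding bucket 32 is the first step that exceeds high_mask, and it moves the masks to 31 and 63.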