[Series] How does a relational database work? (6)

Source: Internet
Author: User
Tags: database, join

Finally, we introduce the hash table, an important data structure. It is useful when you need fast lookups, and understanding it will help us understand the hash join, one of the common database join methods. Databases also often use hash tables for their own internal structures, such as the lock table or the buffer pool (both described later).

A hash table quickly finds an element through its key. To build a hash table, you need to define:

  • A key for each element;
  • A hash function for the keys. The hash value of a key gives the location of the element (usually called a bucket);
  • A comparison function for the keys. Once you have found the right bucket, you use the comparison function to find the right element inside it.
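
The three ingredients above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular database implements its hash tables; the class name `SimpleHashTable` and the parameters `key_of`, `hash_fn`, and `equals` are hypothetical names chosen for this example.

```python
from typing import Any, Callable, List, Optional


class SimpleHashTable:
    """Toy hash table built from a key, a hash function, and a comparison."""

    def __init__(
        self,
        bucket_count: int,
        key_of: Callable[[Any], Any],        # extracts the key of an element
        hash_fn: Callable[[Any], int],       # maps a key to an integer
        equals: Callable[[Any, Any], bool],  # compares two keys
    ) -> None:
        self.buckets: List[List[Any]] = [[] for _ in range(bucket_count)]
        self.key_of = key_of
        self.hash_fn = hash_fn
        self.equals = equals

    def _bucket_for(self, key: Any) -> List[Any]:
        # The hash value of the key decides which bucket holds the element.
        return self.buckets[self.hash_fn(key) % len(self.buckets)]

    def insert(self, element: Any) -> None:
        self._bucket_for(self.key_of(element)).append(element)

    def find(self, key: Any) -> Optional[Any]:
        # Inside the bucket, the comparison function picks out the element.
        for element in self._bucket_for(key):
            if self.equals(self.key_of(element), key):
                return element
        return None


# Hypothetical rows keyed by an integer id.
table = SimpleHashTable(
    bucket_count=10,
    key_of=lambda row: row["id"],
    hash_fn=lambda key: key,
    equals=lambda a, b: a == b,
)
table.insert({"id": 78, "name": "Alice"})
table.insert({"id": 18, "name": "Bob"})
```

Here ids 78 and 18 land in the same bucket, so `find` relies on the comparison function to tell them apart.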

A simple example

Let's look at an example:

This hash table has 10 buckets. The hash function is the key modulo 10, that is, it keeps only the last digit of each key:

  • If the last digit is 0, the element goes in bucket 0;
  • If the last digit is 1, the element goes in bucket 1;
  • If the last digit is 2, the element goes in bucket 2;
  • ...

The comparison function is simply equality between two integers. Suppose we want to find element 78:

  • The hash table computes the hash value of 78, which is 8;
  • It looks in bucket 8, and the first element it finds is 78;
  • It returns element 78;
  • The whole search takes only two operations: one to compute the hash value and one to find the element in the bucket.

Now suppose we want to find element 59:

  • The hash table computes the hash value of 59, which is 9;
  • It looks in bucket 9. The first element is not 59, so it is not the element we are looking for;
  • Using the same logic, it checks the following elements (9, 79, ...) up to the last one, 29;
  • Element 59 does not exist;
  • This search takes 7 operations.
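
The two searches can be replayed with a short script that counts operations the way the text does: one for computing the hash value, plus one per element examined in the bucket. The bucket contents below are hypothetical; they were chosen only so that 78 is the first element of bucket 8 and bucket 9 holds six elements ending with 29, none of them 59.

```python
from typing import Dict, List, Tuple

BUCKET_COUNT = 10

# Hypothetical contents, chosen only to match the counts described in the text.
buckets: Dict[int, List[int]] = {i: [] for i in range(BUCKET_COUNT)}
buckets[8] = [78, 18]
buckets[9] = [99, 9, 79, 19, 49, 29]


def search(key: int) -> Tuple[bool, int]:
    """Return (found, operations) for a lookup of `key`."""
    operations = 1  # computing the hash value
    bucket = buckets[key % BUCKET_COUNT]
    for element in bucket:
        operations += 1  # one comparison per element examined
        if element == key:
            return True, operations
    return False, operations


print(search(78))  # (True, 2)
print(search(59))  # (False, 7)
```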

A good hash function

As the example shows, the cost of a search depends on the value you are looking for.
If we replace the hash function in the previous example with the key modulo 1,000,000 (that is, the last six digits), the second search takes only 1 operation, because bucket 000059 contains no elements. The real difficulty is finding a hash function that keeps the number of elements in each bucket very small. (Note: this is usually referred to as reducing hash collisions.)
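
To see the effect of the hash function, the sketch below distributes the same keys with modulo 10 and with modulo 1,000,000, then counts the operations needed to look up 59. The key list is a hypothetical sample, and the cost model (one operation for the hash, one per element examined) is an assumption chosen to match the counts in the text.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical sample of keys; 59 is deliberately absent.
KEYS = [78, 18, 99, 9, 79, 19, 49, 29, 120, 341]


def build(keys: Iterable[int], hash_fn: Callable[[int], int]) -> Dict[int, List[int]]:
    """Group keys into buckets according to `hash_fn`."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for key in keys:
        buckets[hash_fn(key)].append(key)
    return buckets


def lookup_cost(
    buckets: Dict[int, List[int]], hash_fn: Callable[[int], int], key: int
) -> int:
    """One operation for the hash, plus one per element examined in the bucket."""
    cost = 1
    for element in buckets.get(hash_fn(key), []):
        cost += 1
        if element == key:
            break
    return cost


def mod_10(key: int) -> int:
    return key % 10


def mod_1m(key: int) -> int:
    return key % 1_000_000


cost_mod_10 = lookup_cost(build(KEYS, mod_10), mod_10, 59)
cost_mod_1m = lookup_cost(build(KEYS, mod_1m), mod_1m, 59)
largest_bucket_mod_10 = max(len(b) for b in build(KEYS, mod_10).values())
largest_bucket_mod_1m = max(len(b) for b in build(KEYS, mod_1m).values())

print(cost_mod_10, cost_mod_1m)                      # 7 1
print(largest_bucket_mod_10, largest_bucket_mod_1m)  # 6 1
```

The price of the second function is a much larger range of possible buckets, most of which stay empty for a small data set.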

In the example above, finding a good hash function is easy. It becomes much harder when the key is:

  • A string, such as a person's last name;
  • Two strings, such as a person's last name and first name;
  • Two strings and a date, such as a person's last name, first name, and birth date.
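
For composite keys like these, one common approach is to hash each field and mix the results. The sketch below uses a simple polynomial combination with the multiplier 31, in the same style as Java's `String.hashCode`. The function names `hash_string` and `hash_person` and the bucket count are hypothetical, and this is only an illustration, not the function any particular database uses.

```python
from datetime import date

BUCKET_COUNT = 1024  # hypothetical table size


def hash_string(text: str) -> int:
    """Polynomial hash: h = 31 * h + code point, kept within 32 bits."""
    h = 0
    for char in text:
        h = (31 * h + ord(char)) & 0xFFFFFFFF
    return h


def hash_person(surname: str, name: str, birth_date: date) -> int:
    """Mix the hashes of all three fields into one bucket number."""
    h = hash_string(surname)
    h = (31 * h + hash_string(name)) & 0xFFFFFFFF
    h = (31 * h + birth_date.toordinal()) & 0xFFFFFFFF
    return h % BUCKET_COUNT


bucket = hash_person("Doe", "Jane", date(1990, 5, 17))
```

A hand-written function is used here instead of Python's built-in `hash()` because the built-in string hash is randomized per process by default, which would make bucket numbers differ from one run to the next.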

With a good hash function, a search in a hash table takes O(1) time.

Arrays versus hash tables

Why not use an array instead? That is a good question!

  • A hash table can be partially loaded in memory, with the remaining buckets left on disk;
  • An array must occupy contiguous memory. If you are loading a large table, it is hard to find enough contiguous space in memory;
  • With a hash table, you can choose any key you want, for example the country plus the person's name.
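
The last point can be illustrated with Python's built-in `dict`, which is itself a hash table. The rows and field names below are hypothetical and serve only as a small sketch of the difference in how elements are addressed.

```python
# Hypothetical rows of a small table.
rows = [
    {"country": "FR", "name": "Martin", "age": 41},
    {"country": "US", "name": "Smith", "age": 35},
    {"country": "FR", "name": "Durand", "age": 28},
]

# Array-style access: an element is reachable only by its integer position.
second_row = rows[1]

# Hash-table access: any key can be chosen, here the pair (country, name).
by_country_and_name = {(row["country"], row["name"]): row for row in rows}
durand = by_country_and_name[("FR", "Durand")]
```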

For more information, see another article I wrote on the Java HashMap. You do not need to know Java to understand the concepts.

The next chapter presents an overall view of the database.
