"Python Learning Notes: Data Structures and Algorithms" Implementing a Hash Table

Last Update:2018-01-20 Source: Internet

Author: User

Python's built-in dictionaries are implemented using a hash table. Here we deepen our understanding of hash tables and hash functions by implementing our own hash table.

"Concept 1: mapping (map)"

A dictionary is indexed by key, and each key corresponds to a stored value. Any hashable object can be used as a key; in practice this means immutable types such as numbers, strings, and tuples of immutable items.

"Concept 2: hash table"

A hash table is a data structure that computes a storage location in memory directly from the key, which speeds up lookup to O(1) on average.

For example, picture an empty hash table of size 11 in which every slot is initialized to None.

"Concept 3: hash function"

The value corresponding to a key is stored at location f(key); the mapping f is called the hash function.

There are many ways to construct a hash function:

For example, (1) Remainder method (divide and keep the remainder):

Suppose we have an empty hash table of size (table length) 11, and we want to store a series of integer keys in it: 54, 26, 93, 17, 77, 31. The remainder hash function is then h(key) = key % 11.
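The remainder method described above can be sketched in a few lines. The names `remainder_hash`, `TABLE_SIZE`, and `slots` are illustrative choices, not part of the original article, and the sketch assumes no two keys land in the same slot.

```python
def remainder_hash(key, table_size):
    """Remainder method: the slot index is the key modulo the table size."""
    return key % table_size

TABLE_SIZE = 11
slots = [None] * TABLE_SIZE   # empty table: every slot starts as None

for key in [54, 26, 93, 17, 77, 31]:
    # No collision handling here; these six keys happen to map to distinct slots.
    slots[remainder_hash(key, TABLE_SIZE)] = key

print(slots)
# [77, None, None, None, 26, 93, 17, None, None, 31, 54]
```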

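The results are as follows: 54 % 11 = 10, 26 % 11 = 4, 93 % 11 = 5, 17 % 11 = 6, 77 % 11 = 0, and 31 % 11 = 9.

Now our hash table becomes: slot 0 holds 77, slot 4 holds 26, slot 5 holds 93, slot 6 holds 17, slot 9 holds 31, and slot 10 holds 54; the remaining slots are still None.

"Concept 4: load factor"

λ = number_of_keys / table_size. In the example above we stored 6 items in a hash table of length 11, so the load factor is 6/11.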

When we want to find a key, we just use the hash function to compute its storage location (slot) and check whether that slot holds the key. As long as there are no collisions, this search takes O(1) time.

"Concept 5: collision (conflict)"

Two items may map to the same hash address; this is called a collision. For example, with the remainder hash function, 44 % 11 and 77 % 11 both equal 0.

In this case, you need to either choose another hash function or handle the collision.
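To make the load factor, the single-slot lookup, and the notion of a collision concrete, here is a minimal sketch. The helper names `load_factor`, `contains`, and `find_collisions` are hypothetical, and `contains` assumes the table was filled without collisions.

```python
from collections import defaultdict

# Table produced by h(key) = key % 11 for the keys 54, 26, 93, 17, 77, 31.
slots = [77, None, None, None, 26, 93, 17, None, None, 31, 54]

def load_factor(slots):
    """Number of stored keys divided by the table size."""
    stored = sum(1 for item in slots if item is not None)
    return stored / len(slots)

def contains(slots, key):
    """Hash the key and inspect only that one slot (assumes no collisions)."""
    return slots[key % len(slots)] == key

def find_collisions(keys, table_size):
    """Group keys by slot; keep only slots claimed by more than one key."""
    by_slot = defaultdict(list)
    for key in keys:
        by_slot[key % table_size].append(key)
    return {slot: ks for slot, ks in by_slot.items() if len(ks) > 1}

print(load_factor(slots) == 6 / 11)        # True
print(contains(slots, 93))                 # True
print(contains(slots, 20))                 # False
print(find_collisions([44, 77, 26], 11))   # {0: [44, 77]}
```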

Let's start by introducing several other hash functions. We want a hash function that minimizes the number of collisions, is easy to compute, and distributes keys evenly across the hash table:

(2) Folding method:

Divide the key into parts with the same number of digits (the last part may have fewer), sum the parts, and then apply the remainder method.

For example: our key is the phone number 4365554601. Split it into groups of two digits (43, 65, 55, 46, 01), sum the parts, 43 + 65 + 55 + 46 + 01 = 210, and finally take the remainder, 210 % 11 = 1. So this phone number is stored in slot 1 of the hash table.
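The folding steps can be sketched as follows. The function name `folding_hash` and the `group_size` parameter are assumptions made for illustration; the key is converted to a string so it can be cut into digit groups.

```python
def folding_hash(key, table_size, group_size=2):
    """Folding method: split the digits into groups, sum them, take the remainder."""
    digits = str(key)
    # Slice the digit string into chunks of group_size; the last chunk may be shorter.
    parts = [int(digits[i:i + group_size])
             for i in range(0, len(digits), group_size)]
    return sum(parts) % table_size

print(folding_hash(4365554601, 11))   # 1
```

A key with leading zeros would need to be passed as a string, since an integer cannot preserve them.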

(3) Mid-square method:

Square the key, then use the middle digits of the result as the hash address.

For example: our key is 44. First square it, 44^2 = 1936, then take the middle two digits, 93, then take the remainder, 93 % 11 = 5, which is its hash address.
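A minimal sketch of the mid-square method follows. The name `mid_square_hash` and the `num_digits` parameter are illustrative, and how to pick the "middle" when the squared value has an odd number of digits is a choice this sketch makes on its own (it leans left).

```python
def mid_square_hash(key, table_size, num_digits=2):
    """Mid-square method: square the key, keep the middle digits, take the remainder."""
    squared = str(key * key)
    # Start index of the middle window; clamp at 0 for very short squares.
    start = max((len(squared) - num_digits) // 2, 0)
    middle = squared[start:start + num_digits]
    return int(middle) % table_size

print(mid_square_hash(44, 11))   # 44*44 = 1936 -> "93" -> 93 % 11 = 5
```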

For a non-integer key, such as a string, we can use the ordinal values of its characters:

For example, for the string 'cat', ord('c') = 99, ord('a') = 97, and ord('t') = 116, so the corresponding hash address can be (99 + 97 + 116) % 11 = 4.
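The string example can be reproduced with Python's built-in `ord`. The function name `string_hash` is a hypothetical helper for this sketch.

```python
def string_hash(text, table_size):
    """Sum the ordinal values of the characters, then take the remainder."""
    return sum(ord(ch) for ch in text) % table_size

print(string_hash("cat", 11))   # (99 + 97 + 116) % 11 = 4
```

One consequence of summing ordinals is that anagrams always collide: `string_hash("act", 11)` is also 4, because the same characters are added in a different order.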

Next, we discuss methods of "conflict handling":

(1) Open addressing: When a collision occurs, we look for the next empty position in the hash table to hold the colliding key. There are two ways to probe:

(i) Linear probing: probe the hash table one slot at a time until an empty slot is found, and store the key there. For example, if the colliding hash value is h, we try h+1, h+2, h+3, ... in order, wrapping around to the start of the table when we reach the end.

To reduce clustering (keys piling up in adjacent slots), the probe can skip slots, for example checking every third slot instead of every slot.
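Linear probing, with an optional skip, might look like the sketch below. The function name `put_linear` and the `skip` parameter are assumptions for illustration; the sketch stores bare keys and relies on the table size (11) being prime so that any skip smaller than the table size eventually visits every slot.

```python
def put_linear(slots, key, skip=1):
    """Insert key using open addressing with linear probing; return the slot used."""
    size = len(slots)
    start = key % size
    for step in range(size):
        slot = (start + step * skip) % size   # wrap around the end of the table
        if slots[slot] is None or slots[slot] == key:
            slots[slot] = key
            return slot
    raise RuntimeError("no empty slot found")

slots = [None] * 11
for key in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    put_linear(slots, key)

print(slots)
# [77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
```

Here 44 and 55 both hash to slot 0, which 77 already occupies, so they slide into slots 1 and 2; 20 hashes to slot 9 and has to wrap around to slot 3. The run of filled slots 0 to 6 is the clustering that a larger `skip` tries to break up.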

(ii) Quadratic probing: if the colliding hash value is h, the next slot probed is h+1, followed by h+4, h+9, h+16, ...
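The quadratic probe sequence h, h+1, h+4, h+9, ... can be sketched as follows. The name `put_quadratic` is hypothetical, and the sketch gives up after as many probes as there are slots, since a quadratic sequence is not guaranteed to reach every slot.

```python
def put_quadratic(slots, key):
    """Insert key using open addressing with quadratic probing; return the slot used."""
    size = len(slots)
    start = key % size
    for i in range(size):
        slot = (start + i * i) % size   # offsets 0, 1, 4, 9, 16, ...
        if slots[slot] is None or slots[slot] == key:
            slots[slot] = key
            return slot
    # Quadratic probing can miss free slots, so this is not the same as "table full".
    raise RuntimeError("no empty slot found along the probe sequence")

slots = [None] * 11
for key in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    put_quadratic(slots, key)

print(slots)
# [77, 44, 20, 55, 26, 93, 17, None, None, 31, 54]
```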

(2) Chaining (linked list method): All items with the same hash value are stored in a linked list at that slot. However, the more items that land in the same slot's list, the longer it takes to search. For example, with the remainder hash function above, 77 and 44 would both be stored in the list at slot 0.
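A minimal chaining sketch follows. As a simplification, it uses a Python list per slot in place of a hand-written linked list; the names `put_chained`, `get_chained`, and `buckets` are illustrative.

```python
def put_chained(buckets, key, value):
    """Store (key, value) in the chain for key's slot, updating an existing key."""
    chain = buckets[key % len(buckets)]
    for i, (existing_key, _) in enumerate(chain):
        if existing_key == key:
            chain[i] = (key, value)
            return
    chain.append((key, value))

def get_chained(buckets, key):
    """Walk the chain for key's slot; return the value, or None if absent."""
    for existing_key, value in buckets[key % len(buckets)]:
        if existing_key == key:
            return value
    return None

buckets = [[] for _ in range(11)]   # one independent, empty chain per slot
put_chained(buckets, 77, "first")
put_chained(buckets, 44, "second")  # collides with 77 in slot 0

print(buckets[0])                   # [(77, 'first'), (44, 'second')]
print(get_chained(buckets, 44))     # second
```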

Finally, we will implement our own hash table with the following methods:

HashTable(size): Creates a new, empty map whose table length is initialized to size.

put(key, val): Adds a new key-value pair to the table. If the key already exists in the table, the value for that key is updated.

get(key): Given a key, returns its corresponding value. Returns None if the key does not exist in the table.

del: Deletes the corresponding key-value pair, using del map[key].

len(): Returns the number of key-value pairs stored in the table.

in: Use key in map to test whether key is in the map; returns True if it is, False if not.
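One possible shape for such a class is sketched below. It is not the original article's implementation: it assumes chaining for collision handling (which keeps deletion simple), uses Python's built-in `hash()` with the remainder method to pick a bucket, and the attribute names `buckets` and `count` are illustrative.

```python
class HashTable:
    """A small map backed by a list of buckets (chaining)."""

    def __init__(self, size=11):
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _bucket(self, key):
        # Built-in hash() plus the remainder method selects the bucket.
        return self.buckets[hash(key) % self.size]

    def put(self, key, val):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, val)   # key already present: update its value
                return
        bucket.append((key, val))
        self.count += 1

    def get(self, key):
        for existing_key, val in self._bucket(key):
            if existing_key == key:
                return val
        return None

    def __delitem__(self, key):          # supports: del table[key]
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self.count -= 1
                return
        raise KeyError(key)

    def __len__(self):                   # supports: len(table)
        return self.count

    def __contains__(self, key):         # supports: key in table
        return any(existing_key == key for existing_key, _ in self._bucket(key))


table = HashTable(11)
table.put(54, "cat")
table.put(26, "dog")
table.put(54, "lion")      # same key: the value is replaced
print(table.get(54))       # lion
print(len(table))          # 2
del table[26]
print(26 in table)         # False
```

With open addressing instead of chaining, deletion would need extra care (for example, marking deleted slots) so that later probes for other keys are not cut short.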


This article is an English translation of an article originally published in Chinese on aliyun.com.
