"Python Learning Notes: Data Structures and Algorithms" Implementing a hash table

Source: Internet
Author: User

Python's built-in dictionaries are implemented using a hash table. Here we deepen our understanding of hash tables and hash functions by implementing our own hash table.

"Concept 1: Mapping (map)"

A dictionary is indexed by key, and each key corresponds to one stored value. Any immutable (more precisely, hashable) data type can be used as a key.

"Concept 2: Hash table"

A hash table is a data structure that gives direct access to a storage location computed from the key, which speeds up lookups (O(1) on average).

Below is an empty hash table of size 11, with each slot initialized to None:

  Slot:   0     1     2     3     4     5     6     7     8     9     10
  Value:  None  None  None  None  None  None  None  None  None  None  None

"Concept 3: Hash function"

The value corresponding to a key is stored at the location f(key), and the mapping f is called the hash function.

There are many ways to construct a hash function:

(1) Remainder method (division with remainder):

Suppose we have an empty hash table of size (table length) 11, and we want to store the integer keys 54, 26, 93, 17, 77, 31 in it. The remainder hash function is then h(key) = key % 11.

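The results are as follows:

  Key:     54  26  93  17  77  31
  h(key):  10   4   5   6   0   9

Now our hash table becomes:

  Slot:   0   1     2     3     4   5   6   7     8     9   10
  Value:  77  None  None  None  26  93  17  None  None  31  54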
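
The remainder method and the resulting placement can be sketched in a few lines of Python. The function name `remainder_hash` is an illustrative assumption, not a name from the original notes.

```python
def remainder_hash(key, table_size):
    """Remainder method: the slot is the key modulo the table size."""
    return key % table_size


table_size = 11
table = [None] * table_size  # empty table, every slot starts as None

for key in [54, 26, 93, 17, 77, 31]:
    slot = remainder_hash(key, table_size)
    table[slot] = key  # these six keys happen not to collide

print(table)
# [77, None, None, None, 26, 93, 17, None, None, 31, 54]
```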

      

"Concept 4: Load factor" λ = number_of_keys / table_size. In the example above, we stored 6 items in a hash table of length 11, so its load factor is 6/11.

To look up a key, we use the hash function to compute its storage location (slot) and check whether that slot holds the key. As long as every key has a slot to itself, this search takes O(1) time.

"Concept 5: Collision (conflict)": Two items may map to the same hash address; this is called a collision. For example, with the remainder hash function, 44 % 11 and 77 % 11 are both equal to 0.

In this case, we need to either choose another hash function or handle the collision.

Let's start by introducing several other hash functions. We want a hash function that minimizes the number of collisions, is easy to compute, and distributes keys evenly across the hash table:
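
Before deciding how to handle collisions, it can help to see which keys collide. The sketch below groups keys by slot under the remainder method; the helper name `find_collisions` is hypothetical, and the key 44 is added to the earlier key list only to trigger the collision described above.

```python
from collections import defaultdict


def find_collisions(keys, table_size):
    """Return {slot: [keys]} for every slot that receives more than one key."""
    slots = defaultdict(list)
    for key in keys:
        slots[key % table_size].append(key)
    return {slot: ks for slot, ks in slots.items() if len(ks) > 1}


collisions = find_collisions([54, 26, 93, 17, 77, 31, 44], 11)
print(collisions)  # {0: [77, 44]}
```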

(2) Folding method:

Divide the key into parts with the same number of digits (the last part may be shorter), sum the parts, and then apply the remainder method.

For example: our key is the phone number 4365554601. Grouping the digits in pairs gives (43, 65, 55, 46, 01); the sum of these parts is 43+65+55+46+01 = 210, and the remainder is 210 % 11 = 1. So this phone number is stored in slot 1 of the hash table.
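
A minimal sketch of the folding method follows, assuming the key is a non-negative integer. The function name `folding_hash` and the `group_size` parameter are illustrative choices.

```python
def folding_hash(key, table_size, group_size=2):
    """Split the digits into groups, sum the groups, then take the remainder."""
    digits = str(key)
    parts = [
        int(digits[i:i + group_size])
        for i in range(0, len(digits), group_size)
    ]
    return sum(parts) % table_size


print(folding_hash(4365554601, 11))  # 43 + 65 + 55 + 46 + 1 = 210, 210 % 11 = 1
```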

(3) Mid-square method:

Square the key, then take some of the middle digits of the result as the hash address.

For example: our key is 44. First square it, 44^2 = 1936, then take the middle two digits, 93, and finally take the remainder, 93 % 11 = 5, which is its hash address.
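
The mid-square steps can be sketched as below. The name `mid_square_hash` and the `num_digits` parameter are assumptions, and the way the "middle" is located for squares with an odd number of digits is just one reasonable convention.

```python
def mid_square_hash(key, table_size, num_digits=2):
    """Square the key, extract the middle digits, then take the remainder."""
    squared = str(key * key)
    # Start index of the middle window; clamp at 0 for very short squares.
    start = max(0, (len(squared) - num_digits) // 2)
    middle = squared[start:start + num_digits]
    return int(middle) % table_size


print(mid_square_hash(44, 11))  # 44^2 = 1936, middle digits 93, 93 % 11 = 5
```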

      

For a non-integer key, such as a string, we can use the ordinal values of its characters:

For example, for the string 'cat', ord('c') = 99, ord('a') = 97, ord('t') = 116, so the corresponding hash address can be (99 + 97 + 116) % 11 = 4.
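
The string case can be sketched with the built-in `ord`. The function name `string_hash` is an illustrative assumption.

```python
def string_hash(text, table_size):
    """Sum the ordinal values of the characters, then take the remainder."""
    return sum(ord(ch) for ch in text) % table_size


print(string_hash("cat", 11))  # (99 + 97 + 116) % 11 = 4
```

Because a plain sum ignores character order, anagrams such as "cat" and "act" always land in the same slot with this function.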

Next, we discuss methods of "collision handling":

(1) Open addressing: when a collision occurs, we look for the next empty slot in the hash table to hold the colliding key. There are two ways to search:

(i) Linear probing: probe the hash table one slot at a time until an empty slot is found, and store the key there. For example, if the colliding hash value is h, the slots h+1, h+2, h+3, ... are tried in order, wrapping around to the start of the table when the end is reached.

To reduce clustering, slots can be skipped: instead of stepping by 1, the probe advances by a fixed larger step, for example h+3, h+6, h+9, ...

(ii) Quadratic probing: if the colliding hash value is h, the next slot tried is h+1, followed by h+4, h+9, h+16, ...
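
The probing strategies above differ only in the offset added to the home slot on the i-th attempt, so one hedged way to sketch them is a single insert routine parameterized by an offset function. All names here (`probe_insert`, `linear`, `skip_three`, `quadratic`, `build`) are hypothetical, and the keys 44, 55 and 20 are added to the earlier key list only to force collisions.

```python
def probe_insert(table, key, offset):
    """Open addressing: try home + offset(i) for i = 0, 1, 2, ... (mod size)."""
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + offset(i)) % size
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    # Some probe sequences do not visit every slot, so this can fire
    # even when the table still has free slots.
    raise RuntimeError("no free slot found along the probe sequence")


def linear(i):
    return i          # h, h+1, h+2, ...


def skip_three(i):
    return 3 * i      # h, h+3, h+6, ... (skipping slots)


def quadratic(i):
    return i * i      # h, h+1, h+4, h+9, ...


keys = [54, 26, 93, 17, 77, 31, 44, 55, 20]


def build(offset):
    table = [None] * 11
    for key in keys:
        probe_insert(table, key, offset)
    return table


linear_table = build(linear)
skip_table = build(skip_three)
quadratic_table = build(quadratic)

print(linear_table)     # [77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
print(skip_table)       # [77, 55, None, 44, 26, 93, 17, 20, None, 31, 54]
print(quadratic_table)  # [77, 44, 20, 55, 26, 93, 17, None, None, 31, 54]
```

With linear probing the colliding keys 44, 55 and 20 pile up right after slot 0, while the other two strategies spread them further apart.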

(2) Chaining: all elements with the same hash value are stored in a linked list attached to that slot. However, the more elements that collide in the same slot, the longer its list becomes and the slower the search. The lists look as follows:
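
Chaining can be sketched as follows. This version uses Python lists as the chains rather than a true linked list, which is a simplifying assumption; the helper names and the extra keys 44, 55 and 20 are illustrative.

```python
def chained_insert(table, key):
    """Append the key to the chain in its home slot (ignoring duplicates)."""
    chain = table[key % len(table)]
    if key not in chain:
        chain.append(key)


def chained_contains(table, key):
    """Search only the chain in the key's home slot."""
    return key in table[key % len(table)]


table = [[] for _ in range(11)]  # one empty chain per slot
for key in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    chained_insert(table, key)

print(table[0])  # [77, 44, 55]
print(table[9])  # [31, 20]
```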

    

Finally, we will implement our own hash table with the following methods:

HashTable(size): Creates a new, empty map whose table length is size.

put(key, val): Adds a new key-value pair to the table. If the key already exists in the table, the value for that key is updated.

get(key): Given a key, returns its corresponding value. Returns None if the key does not exist in the table.

del: Deletes a key-value pair with del map[key].

len(): Returns the number of key-value pairs stored in the table.

in: key in map returns True if the key is in the map and False otherwise.
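
One possible implementation of this interface is sketched below. It is not the original author's code: it assumes chaining for collision handling (chosen here because it keeps deletion simple) and uses Python's built-in `hash()` followed by the remainder method so that any hashable key works. The attribute names `buckets` and `count`, the helper `_bucket`, and raising `KeyError` when deleting a missing key are all assumptions.

```python
class HashTable:
    """Sketch of a map built on the remainder method with chaining."""

    def __init__(self, size=11):
        self.size = size
        self.buckets = [[] for _ in range(size)]  # one chain per slot
        self.count = 0                            # number of stored pairs

    def _bucket(self, key):
        # hash() turns any hashable key into an int; % maps it to a slot.
        return self.buckets[hash(key) % self.size]

    def put(self, key, val):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:               # key already present: update value
                bucket[i] = (key, val)
                return
        bucket.append((key, val))
        self.count += 1

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                    # key not in the table

    def __delitem__(self, key):        # supports: del table[key]
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.count -= 1
                return
        raise KeyError(key)

    def __len__(self):                 # supports: len(table)
        return self.count

    def __contains__(self, key):       # supports: key in table
        return any(k == key for k, _ in self._bucket(key))


m = HashTable(11)
m.put(54, "cat")
m.put(26, "dog")
m.put(77, "bird")
m.put(44, "goat")    # lands in the same slot as 77
m.put(54, "lion")    # updates the existing key 54

print(m.get(54))     # lion
print(len(m))        # 4
del m[26]
print(26 in m)       # False
print(m.get(99))     # None
```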

  
