"STL Source Analysis" reading notes: associative containers (2)

Source: Internet
Author: User

1.hashtable

A binary search tree offers logarithmic average-time performance, but that performance rests on the assumption that the input data is sufficiently random. A hashtable offers constant average-time insertion, deletion, and search; this performance is statistical in nature and does not depend on the randomness of the input.

An example of a simple hashtable:
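A minimal sketch of such a simple table, assuming the elements are 16-bit unsigned integers so that every possible value can index the array directly. The class and member names are illustrative and do not come from the STL.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Direct-address table: one counter per possible 16-bit key.
class DirectTable {
public:
    DirectTable() : counts_(65536, 0) {}

    void insert(std::uint16_t key) { ++counts_[key]; }

    // Removes one occurrence of key, if any is present.
    void erase(std::uint16_t key) {
        if (counts_[key] > 0) --counts_[key];
    }

    bool contains(std::uint16_t key) const { return counts_[key] != 0; }

private:
    std::vector<std::size_t> counts_;
};
```

Every operation is a single array access, but the array needs one entry for each possible key value.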

If the elements are 32 bits wide instead of 16 bits, the array we would have to prepare grows to 2^32 entries, that is, 4 GB. This is unrealistic. How can we avoid such an absurdly large array? One way is to use a mapping function that maps large numbers to small ones. A function responsible for mapping an element to an index of acceptable size is called a hash function. Using a hash function can cause multiple elements to be mapped to the same location; such collisions can be resolved by linear probing, quadratic probing, or separate chaining.
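To make the mapping and the collision concrete, here is a rough sketch of linear probing, assuming a fixed table size and a plain modulo hash function. `ProbingTable` and its members are hypothetical names; this is not how the STL hashtable resolves collisions.

```cpp
#include <cstddef>
#include <vector>

// Fixed-size table using linear probing; no deletion and no growth.
// Assumes size > 0.
class ProbingTable {
public:
    explicit ProbingTable(std::size_t size)
        : keys_(size, 0), used_(size, false) {}

    // The hash function: maps any key to an index in [0, size).
    std::size_t index_of(unsigned long key) const { return key % keys_.size(); }

    // Returns false only when the table is full and the key is not stored.
    bool insert(unsigned long key) {
        const std::size_t n = keys_.size();
        const std::size_t start = index_of(key);
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t i = (start + step) % n;
            if (!used_[i]) {                    // free slot: store the key here
                keys_[i] = key;
                used_[i] = true;
                return true;
            }
            if (keys_[i] == key) return true;   // already present
        }
        return false;
    }

    bool contains(unsigned long key) const {
        const std::size_t n = keys_.size();
        const std::size_t start = index_of(key);
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t i = (start + step) % n;
            if (!used_[i]) return false;        // empty slot ends the probe
            if (keys_[i] == key) return true;
        }
        return false;
    }

private:
    std::vector<unsigned long> keys_;
    std::vector<bool> used_;
};
```

With a table of size 7, the keys 10 and 17 both map to index 3; the second one is simply stored in the next free slot.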

The STL hash table uses separate chaining: each table cell maintains a linked list, and insertion, search, and deletion of elements are carried out on that list.
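The idea can be sketched with standard containers, assuming `int` keys and a fixed number of buckets. `ChainedTable` is a hypothetical name, and `std::forward_list` stands in for the hand-written node list the STL hashtable actually uses.

```cpp
#include <cstddef>
#include <forward_list>
#include <vector>

// Separate chaining: each bucket holds a singly linked list of keys.
// Assumes bucket_count > 0.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t bucket_count) : buckets_(bucket_count) {}

    void insert(int key) { buckets_[bucket_of(key)].push_front(key); }

    bool contains(int key) const {
        for (int k : buckets_[bucket_of(key)])
            if (k == key) return true;
        return false;
    }

    // Removes every node holding key from its bucket's list.
    void erase(int key) { buckets_[bucket_of(key)].remove(key); }

private:
    std::size_t bucket_of(int key) const {
        return static_cast<std::size_t>(static_cast<unsigned>(key)) % buckets_.size();
    }

    std::vector<std::forward_list<int> > buckets_;
};
```

Colliding keys, such as 1 and 54 with 53 buckets, simply share one list.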

[Figure: graphical representation of the hash table in the STL]


(1) Hashtable iterator

The hashtable iterator is a forward iterator: it can only advance and has no decrement operation. Advancing tries to move from the current node to the next node. If the current node is in the middle of a list, its next pointer gives the next position directly. If the current node is the last one in its list, the iterator jumps to the next non-empty bucket and points to the head node of that bucket's list.
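A minimal sketch of that advance step, assuming nodes hold unsigned values and the bucket index is just the value modulo the bucket count. `Node` and `TableIterator` are illustrative names, not the STL's own types.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    unsigned value;
    Node* next;
};

// Forward-only iterator over a bucket vector of singly linked lists.
struct TableIterator {
    Node* cur;                           // node currently pointed at
    const std::vector<Node*>* buckets;   // table being traversed

    unsigned operator*() const { return cur->value; }

    TableIterator& operator++() {
        const Node* old = cur;
        cur = cur->next;                 // easy case: stay within the same list
        if (!cur) {
            // End of this list: scan forward for the next non-empty bucket.
            std::size_t b = static_cast<std::size_t>(old->value) % buckets->size();
            while (!cur && ++b < buckets->size())
                cur = (*buckets)[b];
        }
        return *this;
    }
};
```

When no later bucket holds a node, `cur` ends up null, which plays the role of the end iterator in this sketch.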

(2) Data structure of Hashtable


Although separate chaining does not require the table size to be prime, the STL still uses prime table sizes. It precomputes 28 primes, each roughly twice the previous one, so they are available at any time, and provides a function that returns, from those 28, the smallest prime that is not less than a given number.
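The lookup can be sketched as a binary search over a sorted array. The names `kPrimeList` and `next_prime` are stand-ins, and the prime values are reproduced from memory of the SGI STL sources, so they should be checked against the actual header before being relied on.

```cpp
#include <algorithm>
#include <cstddef>

static const int kNumPrimes = 28;

// Each entry is roughly twice the previous one.
static const unsigned long kPrimeList[kNumPrimes] = {
    53ul, 97ul, 193ul, 389ul, 769ul, 1543ul, 3079ul, 6151ul, 12289ul, 24593ul,
    49157ul, 98317ul, 196613ul, 393241ul, 786433ul, 1572869ul, 3145739ul,
    6291469ul, 12582917ul, 25165843ul, 50331653ul, 100663319ul, 201326611ul,
    402653189ul, 805306457ul, 1610612741ul, 3221225473ul, 4294967291ul
};

// Returns the first listed prime that is not less than n,
// or the largest listed prime if n exceeds them all.
inline unsigned long next_prime(unsigned long n) {
    const unsigned long* first = kPrimeList;
    const unsigned long* last = kPrimeList + kNumPrimes;
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}
```

For example, a request for 50 yields 53, and a request for 54 yields 97.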


(3) Structure and memory management of Hashtable

When we construct a hash table sized for a given number of nodes, the following constructor is called:
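A minimal sketch of what such a constructor might do, assuming it rounds the requested size up to a listed prime and fills the bucket vector with null pointers. `MiniTable` is a hypothetical class, and `next_size` here is a shortened stand-in for the 28-prime lookup.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value;
    Node* next;
};

// Shortened stand-in for the prime lookup: only the first few sizes.
inline std::size_t next_size(std::size_t n) {
    static const std::size_t primes[] = {53, 97, 193, 389, 769};
    for (std::size_t p : primes)
        if (p >= n) return p;
    return 769;
}

class MiniTable {
public:
    explicit MiniTable(std::size_t n) : num_elements_(0) { initialize_buckets(n); }

    std::size_t bucket_count() const { return buckets_.size(); }
    std::size_t size() const { return num_elements_; }

private:
    // Sizes the bucket vector to a prime and marks every bucket as empty.
    void initialize_buckets(std::size_t n) {
        const std::size_t n_buckets = next_size(n);
        buckets_.reserve(n_buckets);
        buckets_.insert(buckets_.end(), n_buckets, static_cast<Node*>(nullptr));
        num_elements_ = 0;
    }

    std::vector<Node*> buckets_;
    std::size_t num_elements_;
};
```

Under these assumptions, asking for 50 nodes produces a table with 53 empty buckets.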



When an insert is performed, the table first checks whether it needs to grow. If the number of elements, counting the new one, would exceed the size of the bucket vector, the bucket vector is rebuilt at the next larger prime size.
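The rebuild can be sketched as follows, assuming nodes hold unsigned values whose bucket is the value modulo the bucket count. The free function `resize` and the shortened `next_size` are illustrative; existing nodes are relinked into the new vector rather than copied.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    unsigned long value;
    Node* next;
};

// Shortened stand-in for the prime lookup: only the first few sizes.
inline std::size_t next_size(std::size_t n) {
    static const std::size_t primes[] = {53, 97, 193, 389, 769};
    for (std::size_t p : primes)
        if (p >= n) return p;
    return 769;
}

// Grows `buckets` when it has fewer cells than `num_elements_hint`,
// moving every existing node into its bucket in the larger vector.
inline void resize(std::vector<Node*>& buckets, std::size_t num_elements_hint) {
    const std::size_t old_n = buckets.size();
    if (num_elements_hint <= old_n) return;      // still large enough
    const std::size_t n = next_size(num_elements_hint);
    if (n <= old_n) return;

    std::vector<Node*> tmp(n, nullptr);
    for (std::size_t b = 0; b < old_n; ++b) {
        Node* first = buckets[b];
        while (first) {
            const std::size_t new_b = first->value % n;
            buckets[b] = first->next;            // unlink from the old list
            first->next = tmp[new_b];            // push onto the new list
            tmp[new_b] = first;
            first = buckets[b];
        }
    }
    buckets.swap(tmp);
}
```

An insert would call this with the current element count plus one before linking in the new node.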

To insert a new node without rebuilding the bucket vector, call insert_unique_noresize().


Use insert_equal() if nodes with duplicate keys are allowed.
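The difference between the two insertion paths can be sketched like this, assuming unsigned values that act as their own keys and a fixed bucket count. `MiniTable` is hypothetical, and `insert_equal_noresize` is assumed here as the no-rebuild counterpart of insert_equal().

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Node {
    unsigned long value;
    Node* next;
};

// Assumes n_buckets > 0; never rebuilds its bucket vector.
class MiniTable {
public:
    explicit MiniTable(std::size_t n_buckets)
        : buckets_(n_buckets, nullptr), num_elements_(0) {}

    ~MiniTable() {
        for (Node* head : buckets_)
            while (head) { Node* next = head->next; delete head; head = next; }
    }

    MiniTable(const MiniTable&) = delete;
    MiniTable& operator=(const MiniTable&) = delete;

    // Inserts only if no equal value is present; the bool reports success.
    std::pair<Node*, bool> insert_unique_noresize(unsigned long value) {
        const std::size_t n = value % buckets_.size();
        for (Node* cur = buckets_[n]; cur; cur = cur->next)
            if (cur->value == value) return std::make_pair(cur, false);
        return std::make_pair(push_front(n, value), true);
    }

    // Always inserts; a duplicate is linked right after the equal node found.
    Node* insert_equal_noresize(unsigned long value) {
        const std::size_t n = value % buckets_.size();
        for (Node* cur = buckets_[n]; cur; cur = cur->next)
            if (cur->value == value) {
                Node* tmp = new Node{value, cur->next};
                cur->next = tmp;
                ++num_elements_;
                return tmp;
            }
        return push_front(n, value);
    }

    std::size_t size() const { return num_elements_; }

private:
    Node* push_front(std::size_t n, unsigned long value) {
        Node* tmp = new Node{value, buckets_[n]};
        buckets_[n] = tmp;
        ++num_elements_;
        return tmp;
    }

    std::vector<Node*> buckets_;
    std::size_t num_elements_;
};
```

Linking a duplicate next to its equal keeps equal keys adjacent in the list, which makes it easy to visit them together.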

How do we tell which bucket an element falls into? This would normally be the hash function's job, but the STL wraps the task in an extra layer and hands it to bkt_num(), which in turn calls the hash function. The reason is that many element types (char*, for example) cannot be taken modulo the hashtable size directly, so some conversion is needed first.
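A rough sketch of that layering, assuming the stored value is its own key. The two functors are toy hash functions written for this example only (the string one just sums bytes), and the function signatures are simplified compared with the member functions of the real table.

```cpp
#include <cstddef>

// Toy hash functors, for illustration only (not the STL's own).
struct IntHash {
    std::size_t operator()(int x) const { return static_cast<std::size_t>(x); }
};

struct CStringHash {
    std::size_t operator()(const char* s) const {
        std::size_t h = 0;
        for (; *s; ++s) h += static_cast<unsigned char>(*s);   // sum of bytes
        return h;
    }
};

// Inner layer: hash the key, then reduce it modulo the bucket count.
template <class Key, class HashFcn>
std::size_t bkt_num_key(const Key& key, std::size_t n_buckets, const HashFcn& hash) {
    return hash(key) % n_buckets;
}

// Outer layer: a real table would first extract the key from the stored
// value; in this sketch the value is used as the key directly.
template <class Value, class HashFcn>
std::size_t bkt_num(const Value& obj, std::size_t n_buckets, const HashFcn& hash) {
    return bkt_num_key(obj, n_buckets, hash);
}
```

An `int` could be reduced modulo 53 directly, but a `const char*` only becomes a usable index after the hash functor has turned it into a number.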


For integer types such as char, int, and long, hash() does essentially nothing and simply returns the original value; for C strings (char*), a conversion function is called.
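This behaviour can be sketched with template specializations. `Hash` and `hash_string` are stand-in names, and the multiply-by-5 string conversion is reproduced from memory of the SGI sources, so treat it as an assumption rather than a verbatim copy.

```cpp
#include <cstddef>

// Primary template is empty: a key type without a specialization has no
// call operator, so using it as a hash functor fails to compile.
template <class Key> struct Hash {};

// Conversion function for C strings: folds the characters into one number.
inline std::size_t hash_string(const char* s) {
    unsigned long h = 0;
    for (; *s; ++s) h = 5 * h + *s;
    return static_cast<std::size_t>(h);
}

// Integer types: return the value unchanged.
template <> struct Hash<char> {
    std::size_t operator()(char x) const { return static_cast<std::size_t>(x); }
};
template <> struct Hash<int> {
    std::size_t operator()(int x) const { return static_cast<std::size_t>(x); }
};
template <> struct Hash<long> {
    std::size_t operator()(long x) const { return static_cast<std::size_t>(x); }
};

// C strings: delegate to the conversion function.
template <> struct Hash<const char*> {
    std::size_t operator()(const char* s) const { return hash_string(s); }
};
```

Only a few of the integer specializations are shown; the other integer types would follow the same pattern.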

It follows that the STL hashtable cannot handle types other than those listed above, such as string, double, and float. To use these types, users must define a hash function for them.
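As an illustration, a user-supplied hash functor for `std::string` might look like the sketch below. `StringHash` is a hypothetical name, and `std::unordered_set` from standard C++11 is used as a stand-in because hash_set is an extension rather than part of the standard library.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// User-defined hash functor for std::string, reusing a simple
// multiply-and-add scheme over the characters.
struct StringHash {
    std::size_t operator()(const std::string& s) const {
        unsigned long h = 0;
        for (char c : s) h = 5 * h + c;
        return static_cast<std::size_t>(h);
    }
};

// The functor is passed as the container's hash template argument.
typedef std::unordered_set<std::string, StringHash> StringSet;
```

A `StringSet` then accepts strings like any other key type; the same functor could be supplied as the hash template argument of a hash_set.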


2.hash_set

set uses an RB-tree as its underlying mechanism. In addition to set, the STL provides another container, hash_set, which offers the same interface as set but uses a hashtable as its underlying mechanism. Note that any type the hashtable cannot handle (string, double, float, and so on) cannot be handled by hash_set either, unless the user supplies a hash function.

A set is used to search for elements quickly by key. Either underlying mechanism, RB-tree or hashtable, achieves this. With an RB-tree underneath, however, set keeps its elements sorted automatically; with a hashtable underneath, it does not.
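The contrast can be checked with the standard containers, assuming `std::unordered_set` as the modern counterpart of hash_set. The helper names below are made up for this sketch.

```cpp
#include <algorithm>
#include <set>
#include <unordered_set>
#include <vector>

// True if traversing the container visits its elements in ascending order.
template <class Container>
bool iterates_in_sorted_order(const Container& c) {
    return std::is_sorted(c.begin(), c.end());
}

// Builds both kinds of set from the same data and checks that a lookup
// gives the same answer in each.
inline bool lookups_agree(const std::vector<int>& data, int key) {
    std::set<int> tree(data.begin(), data.end());
    std::unordered_set<int> table(data.begin(), data.end());
    return (tree.count(key) != 0) == (table.count(key) != 0);
}
```

`iterates_in_sorted_order` always holds for a `std::set<int>`; for the hash-based set the traversal order is unspecified, so nothing should be assumed about it.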


3.hash_map

hash_map is a map that uses a hashtable, rather than an RB-tree, as its underlying mechanism. A map is used to search for elements quickly by key. Either underlying mechanism, RB-tree or hashtable, achieves this. With an RB-tree underneath, however, map keeps its elements sorted automatically; with a hashtable underneath, it does not.
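Because the two kinds of map share an interface, the same code can run on either. The sketch below assumes `std::unordered_map` as the standard counterpart of hash_map; `count_words` is a hypothetical helper.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Counts how often each word occurs, using whichever map type is requested.
template <class Map>
Map count_words(const std::vector<std::string>& words) {
    Map counts;
    for (const std::string& w : words) ++counts[w];
    return counts;
}
```

Instantiated with `std::map<std::string, int>`, the result iterates in key order; instantiated with `std::unordered_map<std::string, int>`, the counts are the same but the iteration order is unspecified.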


4.hash_multiset

The implementation is essentially the same as hash_set. The only difference is that hash_multiset allows duplicate elements: inserting a value calls the hashtable's insert_equal() instead of insert_unique().


5.hash_multimap

The implementation is essentially the same as hash_map. The only difference is that hash_multimap allows duplicate keys: inserting a value calls the hashtable's insert_equal() instead of insert_unique().
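The effect of unique versus duplicate-allowing insertion can be observed with the standard containers, assuming `std::unordered_multiset` and `std::unordered_multimap` as counterparts of hash_multiset and hash_multimap. The helper names are made up for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Number of elements kept after inserting the same value twice.
template <class Set>
std::size_t size_after_duplicate_insert() {
    Set s;
    s.insert(7);
    s.insert(7);
    return s.size();
}

// Number of entries stored under one key after two inserts with that key.
template <class Map>
std::size_t entries_after_duplicate_key() {
    Map m;
    m.insert(std::make_pair(std::string("k"), 1));
    m.insert(std::make_pair(std::string("k"), 2));
    return m.count("k");
}
```

The unique containers keep one element and the multi containers keep both.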




