The map in the C++ STL is implemented with a red-black tree, and its lookup complexity is O(log N). Why doesn't it use a hash table, as Python does, to get constant-time lookup?

Last Update:2018-05-06 Source: Internet

Author: User

The C++ standard library provides both:
* map: ordered
* unordered_map: unordered, implemented with a hash table
The question is really about the difference between a hash map and a tree-based map. We know that a hash map lookup is O(1) on average and a map lookup is O(log N). But is a hash map actually better than a map in practice? There are several factors to consider:

A hash map generally uses memory less efficiently than a map, because the table has to keep spare buckets.
The lookup efficiency of a map is very high in practice. For example, how many comparisons does it take to find an element among 1 million entries? About 20, since log2(1,000,000) is roughly 20.
The lookup time of a map is more stable than that of a hash map.
A hash map lookup has to hash the key first, which costs O(M), where M is the length of the key string. If the key is very long, a comparison-based map can usually decide each comparison from the first few characters, while the hash map still needs O(M) to compute the hash.
The memory layout affects memory locality and therefore performance.
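
The "about 20 comparisons" figure can be checked with a small sketch. It assumes a hypothetical counting comparator (CountingLess and the global counter are names invented here, not part of the standard library). Note that a red-black tree is only approximately balanced, so the measured count may be somewhat above 20, but it stays within roughly 2 * log2(N).

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical global counter for this sketch only.
static std::size_t g_comparisons = 0;

// Hypothetical comparator: behaves like std::less<int> but counts every call.
struct CountingLess {
    bool operator()(int a, int b) const {
        ++g_comparisons;
        return a < b;
    }
};

// Builds a map holding the keys 0..n-1 and returns how many key comparisons
// a single find(key) performs. Precondition: 0 <= key < n.
std::size_t comparisons_for_find(int n, int key) {
    std::map<int, int, CountingLess> m;
    for (int i = 0; i < n; ++i) m.emplace(i, i);

    g_comparisons = 0;  // ignore the comparisons made while inserting
    auto it = m.find(key);
    assert(it != m.end());
    (void)it;
    return g_comparisons;
}
```

Calling comparisons_for_find(1000000, 777777) reports the number of comparisons for one lookup among a million keys; the exact value depends on the library's tree shape.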

In addition, a red-black tree stores the keys in sorted order, which is very convenient when you need to find a range of keys: locating the range still takes O(log N).
Otherwise, with a hash table, you have to traverse the entire set of N elements.

Most of the time we don't need this ordering. Even in the special cases where we do, the order we need is often not a dynamically maintained one. There is a gold_hash_map that supports sorting with zero overhead, is general-purpose, and is described as faster and more memory-efficient than std::unordered_xxx and Google's sparse_xxx and dense_xxx.

The standard requires that map elements be ordered. Efficiency can be considered only once the standard is met. A hash table cannot guarantee that its elements are ordered, so it cannot be used.

Map: stable and ordered.
Hashtable: sometimes faster than map, and sometimes many times slower. It is a picky eater (a good hash function may be hard to find) and easy to trick (deliberately crafted colliding data).

For non-real-time scenarios you can choose a hash table, because it is statistically faster than a map.
In real-time scenarios only a map is suitable, because it never becomes as slow as a hash table occasionally does.
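
The worst case described above can be provoked on purpose. The following is a minimal sketch, assuming a deliberately bad hash (ConstantHash is a hypothetical type invented for illustration) that sends every key to the same bucket, so each lookup degrades to a linear scan of that bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical worst-case hash for this sketch: every key gets the same value.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Inserts the keys 0..n-1 using the given hash and returns the size of the
// fullest bucket. With a good hash this stays small; with ConstantHash it is n.
template <class Hash>
std::size_t largest_bucket(int n) {
    std::unordered_map<int, int, Hash> m;
    for (int i = 0; i < n; ++i) m.emplace(i, i);
    assert(m.size() == static_cast<std::size_t>(n));

    std::size_t largest = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        if (m.bucket_size(b) > largest) largest = m.bucket_size(b);
    }
    return largest;
}
```

Comparing largest_bucket<ConstantHash>(2000) with largest_bucket<std::hash<int>>(2000) shows the difference between colliding and well-spread keys. An attacker who knows the hash function can craft input that has the same effect.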

In Chapter 5 of "STL Source Code Analysis", Hou Jie mentions that the hash table and its derived containers are very important, but that they were not incorporated into the C++ standard because the proposal came too late, and that they were likely to be included in the next version of the C++ standard library.
SGI STL provides hashtable, hash_set, hash_map, hash_multiset, and hash_multimap. unordered_map was later introduced in the newer C++ standard library.
Time complexity (Search) and space complexity of map and unordered_map

See the Stack Overflow answer to "c++ - Is there any advantage of using map over unordered_map in case of trivial keys?"
The answer is:
1. map is ordered and unordered_map is unordered.
2. The lookup speed differs: O(log N) for map, versus O(1) on average and O(N) in the worst case for unordered_map.
3. Because a hash table has to keep its load factor between 0 and 1, unordered_map consumes more space.
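
Points 1 and 3 can be observed directly. This is a small sketch with hypothetical helper names (keys_in_iteration_order, sorted_keys_demo, load_factor_within_limit); it relies only on standard container members such as load_factor() and max_load_factor().

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collects the keys of any string-keyed map in the container's iteration order.
template <class Map>
std::vector<std::string> keys_in_iteration_order(const Map& m) {
    std::vector<std::string> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

// std::map iterates in sorted key order, whatever the insertion order was.
std::vector<std::string> sorted_keys_demo() {
    std::map<std::string, int> m{{"pear", 3}, {"apple", 1}, {"fig", 2}};
    return keys_in_iteration_order(m);
}

// After n insertions the hash table has grown its bucket array so that
// size / bucket_count does not exceed max_load_factor (1.0 by default).
bool load_factor_within_limit(int n) {
    std::unordered_map<int, int> m;
    for (int i = 0; i < n; ++i) m.emplace(i, i);
    assert(m.size() == static_cast<std::size_t>(n));
    return m.load_factor() <= m.max_load_factor();
}
```

Running keys_in_iteration_order on a std::unordered_map with the same entries returns them in an unspecified order, which is why the sketch makes no claim about it.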
A hash table is an implementation of the ordinary dictionary abstract data type, and a red-black tree (RBT) is an implementation of the ordered dictionary abstract data type. An ordinary dictionary supports only lookup, insertion, and deletion. In addition to these three operations, an ordered dictionary also supports navigation operations, such as finding the key closest to a given key, the maximum key, and the minimum key.
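
The navigation operations listed above map onto std::map members such as lower_bound, upper_bound, begin, and rbegin. The helper names below (ceiling_key, keys_in_range, min_key, max_key) are hypothetical and exist only for this sketch; returning -1 for "no such key" assumes the keys are non-negative.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Smallest key >= x, or -1 if there is none (keys assumed non-negative here).
int ceiling_key(const std::map<int, int>& m, int x) {
    auto it = m.lower_bound(x);  // O(log N)
    return it == m.end() ? -1 : it->first;
}

// All keys in the closed range [lo, hi]. Precondition: lo <= hi.
// Finding both ends is O(log N); the walk costs one step per key returned.
std::vector<int> keys_in_range(const std::map<int, int>& m, int lo, int hi) {
    std::vector<int> out;
    for (auto it = m.lower_bound(lo), end = m.upper_bound(hi); it != end; ++it) {
        out.push_back(it->first);
    }
    return out;
}

// Minimum and maximum keys. Precondition: the map is not empty.
int min_key(const std::map<int, int>& m) {
    assert(!m.empty());
    return m.begin()->first;
}

int max_key(const std::map<int, int>& m) {
    assert(!m.empty());
    return m.rbegin()->first;
}
```

std::unordered_map has no equivalent of these members; answering the same questions with it means scanning every element.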
Each kind of dictionary has its own application scenarios. Hash tables were not yet popular when the C++ standard library was first defined. The C++11 standard library added the hash-based unordered_set and unordered_map.
In addition, looking only at algorithmic complexity is not enough. Logarithmic time may come with a small constant factor, while constant time may come with a large one. Try them out. The next time you have no ordering requirement, you will want to use a hash table instead of a red-black tree.
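
One way to try this out is a rough timing sketch like the one below. The helper names (make_filled, time_lookups_us) are hypothetical, and the numbers it produces depend heavily on the compiler, optimization flags, key type, and data size, so treat them as an indication rather than a benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <unordered_map>

// Builds a container of the given map type holding the keys 0..n-1.
template <class Map>
Map make_filled(int n) {
    Map m;
    for (int i = 0; i < n; ++i) m.emplace(i, i);
    assert(m.size() == static_cast<std::size_t>(n));
    return m;
}

// Looks up every key in [0, n) and returns the elapsed time in microseconds.
// 'hits' receives the number of keys found, which also keeps the loop from
// being optimized away.
template <class Map>
long long time_lookups_us(const Map& m, int n, std::size_t& hits) {
    auto start = std::chrono::steady_clock::now();
    hits = 0;
    for (int i = 0; i < n; ++i) hits += m.count(i);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_lookups_us once on make_filled<std::map<int, int>>(n) and once on make_filled<std::unordered_map<int, int>>(n) gives two numbers to compare for a given n; repeating with string keys or different sizes shows how much the constant factors matter.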

This article is an English version of an article originally written in Chinese on aliyun.com.
