Comparison and analysis of several common containers: hash_map, map, vector, list ... hash table

Source: Internet
Author: User

Transferred from: http://www.haogongju.net/art/1543058

A list supports fast insertion and deletion, but lookups are time-consuming;

A vector supports fast lookup by index, but insertions are time-consuming.

Map lookups have logarithmic time complexity, which is close to the fastest available; only a hash lookup, which takes constant time on average, is faster.
If I wrote one myself I would also use a binary search tree. A plain binary search tree gives logarithmic complexity in most cases but degrades to linear complexity in the worst case, whereas std::map guarantees logarithmic complexity in every case because it keeps the tree balanced, at the cost of some extra time spent maintaining that balance.

The STL map is a balanced binary tree, so it has all the properties of one, and finding an element takes logarithmic time. A vector, by comparison, generally allocates memory more efficiently than calling new for each element.
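The ordering and logarithmic lookup described above can be sketched as follows. This is only an illustration: the helper names keys_in_order, value_or and map_demo are made up for this example and the data is arbitrary.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Collects the keys in iteration order; std::map always iterates in key order.
std::vector<int> keys_in_order(const std::map<int, std::string>& m) {
    std::vector<int> keys;
    for (const auto& entry : m) {
        keys.push_back(entry.first);
    }
    return keys;
}

// Returns the value stored under `key`, or `fallback` if the key is absent.
// find() descends the balanced tree, so it needs O(log n) comparisons.
std::string value_or(const std::map<int, std::string>& m, int key,
                     const std::string& fallback) {
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}

// Small self-check with arbitrary data (hypothetical helper).
void map_demo() {
    std::map<int, std::string> m;
    m[30] = "thirty";  // inserted out of order on purpose
    m[10] = "ten";
    m[20] = "twenty";
    assert((keys_in_order(m) == std::vector<int>{10, 20, 30}));
    assert(value_or(m, 20, "?") == "twenty");
    assert(value_or(m, 99, "?") == "?");
}
```

Calling map_demo() from main exercises both helpers. The keys come back sorted regardless of insertion order, which is the property the balanced tree pays for on every insert.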

Why do you say hash_map is logarithmic? In the absence of collisions, hash_map is the fastest of all data structures: its lookup takes constant time.
If the hash algorithm is designed well enough to keep the collision rate very low, the lookup efficiency of hash_map is beyond doubt.
As for the STL map, its lookup is logarithmic, the fastest after hash_map. You can say "maybe there is room for improvement", but I am pessimistic that 99.9999% of programmers could design a map better than the STL one.
The STL map has a balancing strategy (a red-black tree, for example), so it does not degenerate and you do not need to think about the distribution of the data. However, if the data is already ordered, a vector or a heap will be noticeably faster because access is more direct.

I don't think there is any need to doubt the lookup efficiency of std::map. What is the main factor affecting efficiency? The algorithm. For search problems, which algorithm is better than a red-black tree? None, at least not yet. I don't deny that you could write the code yourself and design an rb-tree that fits your needs and is a little simpler than std::map, but at most you save a few instructions per iteration; processing 100,000 elements saves on the order of 100,000 instructions. Does that matter to you? Unless you are designing an OS like Linux, nobody will care about the time spent on those 100,000 instructions.
An rb-tree spends its time on insertion and deletion, and unless you have very strict requirements on insertion and deletion efficiency, you have no reason not to choose the rb-tree based std::map.

Most programmers cannot write a better map than std::map; that goes without saying. However, our programs do not use every feature of std::map, and code we write ourselves can be tailored to our own needs and can indeed be faster than std::map.

As for hash_map, its implementation is different from map's. A map is generally implemented as a tree and its lookup is O(log N); that is not controversial, so I won't say more.
hash_map locates an element internally through a function that computes its position from the key and "accepts only the key as a parameter". In other words, the hash_map lookup algorithm is independent of the amount of data, so it is considered O(1). Everyone who has read this far should understand; if not, consult a data structures book. Of course, reality is not that perfect, so let me quote my own earlier words and explain further to avoid misunderstanding:

-----------------------------------------
In the absence of collisions, hash_map is the fastest of all data structures: its lookup takes constant time.
------------------------------------------
Note my premise, "in the absence of collisions". In other words, you need a hash function good enough to map keys to values uniformly; otherwise, in the worst case, the lookup degenerates to O(N), the same as a list.
If someone claims hash_map is the slowest of all containers, all that really says is that "the worst possible hash function" makes hash_map the slowest container to search. But that means very little, because by the same logic the luckiest possible input ordering makes bubble sort the fastest sorting algorithm.
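The degenerate case is easy to reproduce. The sketch below uses std::unordered_map, the hash container standardized in C++11, as a stand-in for hash_map; that substitution is an assumption of this example, as are the names ConstantHash, longest_chain and hash_demo.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Deliberately terrible hash (hypothetical): every key gets the same hash
// value, so every element lands in the same bucket.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Size of the fullest bucket, i.e. the longest chain a lookup may have to walk.
template <typename Map>
std::size_t longest_chain(const Map& m) {
    std::size_t longest = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        if (m.bucket_size(b) > longest) {
            longest = m.bucket_size(b);
        }
    }
    return longest;
}

void hash_demo() {
    std::unordered_map<int, int> good;               // default std::hash<int>
    std::unordered_map<int, int, ConstantHash> bad;  // everything collides
    for (int i = 0; i < 1000; ++i) {
        good[i] = i;
        bad[i] = i;
    }
    // With the constant hash all 1000 elements share one bucket, so a lookup
    // degenerates into a linear scan of that chain.
    assert(longest_chain(bad) == 1000);
    assert(longest_chain(good) < 1000);
    // Lookups still return correct results; they are just slow.
    assert(bad.at(999) == 999);
}
```

Both maps hold the same data and answer the same queries; only the hash function differs, which is exactly the premise being argued above.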

BS: "For large containers, it is common for hash_map to provide element lookup 5 to 10 times faster than map, especially where lookup speed is particularly important. On the other hand, if hash_map is given a pathological hash function, it may be much slower than map."

ANSI C++ saw no significant changes after 1998, and it was decided not to make any significant changes to the C++ standard library. That is why hash tables (including hash_map) are not in the standard, although they deserve a place in it.
Most current compiler platforms do support hash tables, but for the sake of portability it is still better not to use them.

Hehe, let me join in the fun too.
1. Sometimes a vector can replace a map
For example, if the key is an integer, you can define a vector whose length is the span of the keys.
When the data set is large, the speed difference is staggering. Of course, the wasted space is often staggering too.
2. Hashing is hard
Without an efficient, low-collision algorithm, hash_xxx is pointless.
And there can be no universal algorithm that is good for every type and data set; it has to suit the occasion.
My solution is GP (genetic programming, not generic programming), with good results.
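Point 1 above, a vector sized to the span of integer keys, can be sketched like this. DenseTable and its members are hypothetical names for this illustration, and the sketch assumes int keys and int values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-address table for integer keys in [min_key, max_key].
// Every possible key gets its own slot, so lookup is a single index
// operation, but memory is spent on unused keys as well.
class DenseTable {
public:
    DenseTable(int min_key, int max_key)
        : min_key_(min_key),
          used_(span(min_key, max_key), false),
          values_(used_.size(), 0) {}

    void put(int key, int value) {
        const std::size_t i = index(key);
        used_[i] = true;
        values_[i] = value;
    }

    // Returns a pointer to the stored value, or nullptr if the key is unset.
    const int* find(int key) const {
        const std::size_t i = index(key);
        return used_[i] ? &values_[i] : nullptr;
    }

    std::size_t slot_count() const { return values_.size(); }

private:
    static std::size_t span(int lo, int hi) {
        assert(lo <= hi);
        return static_cast<std::size_t>(hi - lo) + 1;
    }

    std::size_t index(int key) const {
        assert(key >= min_key_);
        const std::size_t i = static_cast<std::size_t>(key - min_key_);
        assert(i < values_.size());
        return i;
    }

    int min_key_;
    std::vector<bool> used_;   // which slots hold a value
    std::vector<int> values_;  // one slot per possible key
};
```

A table built as DenseTable(100, 199) allocates 100 slots even if only one key is ever stored, which is the space waste mentioned in point 1.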

Your millions of records are not well suited to a vector. Because a vector requires contiguous memory, the container will clearly have to reserve a large capacity up front.
As for using a map, do you need to create a key for the data? If there is no such requirement, why not consider deque or list?
Note that it is the container adaptors, such as stack and queue, that use deque as their underlying container by default. An adaptor is not really a container, and comparing it with containers means little, because you can configure its underlying container type. A map, by contrast, is a container in its own right.

If memory is not an issue, a vector is better than a map. A map restores its ordering every time an element is inserted, so it is not as fast as inserting all the elements into a vector first and then sorting once.

binary_search searches a sorted range, and with random access iterators it has logarithmic complexity. So, leaving memory concerns aside, a vector is better than a map.
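The "fill, sort once, then binary search" approach might look like the sketch below. It uses std::lower_bound rather than std::binary_search because lower_bound also yields the position of the match; the Entry alias and the function names are assumptions of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Entry = std::pair<int, int>;  // (key, value), illustrative only

// Sorts the entries by key once, after all insertions are done.
void sort_by_key(std::vector<Entry>& entries) {
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) { return a.first < b.first; });
}

// Binary search over a vector already sorted by key: O(log n) comparisons
// thanks to random access iterators. Returns nullptr if the key is absent.
const int* find_value(const std::vector<Entry>& sorted, int key) {
    // Debug-only sanity check; it is O(n), so it is compiled out with NDEBUG.
    assert(std::is_sorted(
        sorted.begin(), sorted.end(),
        [](const Entry& a, const Entry& b) { return a.first < b.first; }));
    auto it = std::lower_bound(
        sorted.begin(), sorted.end(), key,
        [](const Entry& e, int k) { return e.first < k; });
    return (it != sorted.end() && it->first == key) ? &it->second : nullptr;
}
```

This pays off when the data is loaded in bulk and then queried many times. If insertions keep arriving between lookups, each one costs a shift of the tail, and a map is usually the safer choice.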

If you need to insert in the middle of the data, a list is the best choice; the inefficiency of inserting into a vector is painful enough to make you want to die.
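A small sketch of the difference, assuming you already hold an iterator to the insertion point (finding that point in a list is still a linear walk). The function name list_demo and the data are illustrative.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

void list_demo() {
    std::list<int> items{1, 2, 4, 5};
    auto four = std::next(items.begin(), 2);  // iterator to the element 4
    auto last = std::prev(items.end());       // iterator to the element 5

    // O(1): only the neighbouring links are rewired, nothing is moved.
    items.insert(four, 3);
    assert((items == std::list<int>{1, 2, 3, 4, 5}));
    // Existing list iterators stay valid after the insertion.
    assert(*four == 4);
    assert(*last == 5);

    std::vector<int> v{1, 2, 4, 5};
    // O(n): 4 and 5 are shifted one slot to the right, and iterators at or
    // after the insertion point are invalidated.
    v.insert(v.begin() + 2, 3);
    assert((v == std::vector<int>{1, 2, 3, 4, 5}));
}
```

Both containers end up with the same contents; the cost difference only becomes visible when the sequence is long and middle insertions are frequent.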

When it comes to lookups, a map is the better choice, because its internal data structure is an rb-tree, whereas with an unsorted vector you can only search linearly, which is inefficient.

STL implementations also provide hash containers, whose lookups are fast in theory. A vector is a nightmare for ordered insertion, a map guarantees that its elements are always sorted by key, and with a list you have to maintain the order yourself.

Hash-type lookups are certainly fast because they are a direct mapping, but insertion and deletion can be slow when elements have to be moved. A list is a chain of links, so insertion is very fast, but lookups are time-consuming because they require a traversal. Or just use a list, even though lookups are slow.

Quicksort first, then binary search; the efficiency of that is not bad either.
