Comparison and Analysis of Several Common Containers: hash_map, map, vector, list

A list supports fast insertion and deletion, but searching takes time. A vector supports fast lookup, but insertion takes time. The search complexity of map is logarithmic, which is close to the fastest; hash is also said to be logarithmic. If I wrote the structure myself I would likewise use a binary search tree, which guarantees logarithmic complexity in most cases but degrades to linear complexity in the worst case, whereas std::map guarantees logarithmic complexity in every situation, because it keeps the tree balanced, at the cost of some extra time when storing data. The map in STL is a balanced binary tree, so it has all the properties of a balanced binary tree, and its query time is likewise logarithmic. A vector is also generally more efficient than raw new when it comes to memory allocation.

Why would hash_map be logarithmic? Without collisions, hash_map has the fastest lookup of all these data structures: it is constant time. If a good hash function is designed for the problem so that the collision rate stays low, the search efficiency of hash_map is beyond question. As for the STL map, its search is logarithmic, the best apart from hash_map. You can say "there may be room for improvement", but for 99.9999% of programmers I am pessimistic about their chances of designing a better map than the STL one. STL map has a balancing policy (a red-black tree or the like), so it does not degenerate and does not depend on the distribution of the data. However, if the data is already sorted, using a vector or a heap will obviously be faster, because their access is simpler. I would not worry about the search efficiency of std::map.

What is the main factor affecting efficiency? The algorithm. Is there a better search algorithm than the rb-tree? Not yet. True, you can write your own code and design an rb-tree tailored to your needs, a little simpler than std::map, but at most you save a line of instructions per iteration. Does it matter to you whether an extra 100,000 instructions run while processing 100,000 records? Unless you are writing an OS like Linux, nobody will notice the time those 100,000 instructions take. The rb-tree's time goes into insertion and deletion; if you do not have high requirements on insertion and deletion efficiency, you have no reason not to choose the rb-tree-based std::map. Most programmers cannot write a better map than std::map. Of course, not every feature of std::map is needed in our programs, so a hand-written structure can be tailored to our own case and may indeed be faster than std::map.

As for hash_map, its implementation mechanism differs from map's. A map is generally implemented with a tree, and its search operation is O(log n); there is no dispute about that, so I won't say more. The search in hash_map is implemented internally with a hash function that maps a key to its bucket. This function takes only the key as a parameter, so the hash_map search algorithm is independent of the data volume, which is why we call it O(1). Everyone here should already know this; see any "Data Structures" textbook. Of course, reality is never quite that perfect. Let me restate what I said earlier to avoid misunderstanding: without collisions, hash_map has the fastest search of all these data structures; it is constant time.
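Since hash_map never made it into the 1998 standard, one way to see the O(log n) versus average O(1) difference today is to compare std::map with std::unordered_map, its standardized C++11 counterpart. The following timing sketch is my own illustration, not code from the original post; the element count and the sequential integer keys are arbitrary choices.

// A minimal sketch contrasting lookup in the tree-based std::map (O(log n))
// with the hash-based std::unordered_map (the standardized successor of the
// old non-standard hash_map, O(1) on average when collisions are rare).
#include <chrono>
#include <cstdio>
#include <map>
#include <unordered_map>

template <typename MapType>
double time_lookups(const MapType& m, int n) {
    auto start = std::chrono::steady_clock::now();
    long long sum = 0;
    for (int key = 0; key < n; ++key)
        sum += m.find(key)->second;           // every key is known to be present
    auto end = std::chrono::steady_clock::now();
    std::printf("(checksum %lld) ", sum);     // keep the loop from being optimized away
    return std::chrono::duration<double, std::milli>(end - start).count();
}

int main() {
    const int n = 1000000;
    std::map<int, int> tree_map;
    std::unordered_map<int, int> hash_map;
    for (int key = 0; key < n; ++key) {
        tree_map[key] = key;
        hash_map[key] = key;
    }
    std::printf("std::map lookups:           %.1f ms\n", time_lookups(tree_map, n));
    std::printf("std::unordered_map lookups: %.1f ms\n", time_lookups(hash_map, n));
}

The absolute numbers depend on the platform and the hash function; the point is only that the hash-based container's lookup cost does not grow with the logarithm of the element count, while the tree's does.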
Pay attention to my premise: "without collisions". In other words, the hash function must be good enough to keep the key-to-bucket mapping balanced. Otherwise, in the worst case, the amount of work degrades to O(n) and the hash_map becomes no better than a linked list. If hash_map turns out to be the slowest of all containers for searching, one can only say that the worst possible hash function made it so; but that argument means little, because the luckiest possible arrangement of data can likewise make bubble sort the fastest sorting algorithm. As Bjarne Stroustrup put it: "For large containers it is common for hash_map to provide element lookups 5 to 10 times faster than map, especially when lookup speed is particularly important. On the other hand, if hash_map picks a pathological hash function, it may be much slower than map." ANSI C++ had not changed significantly since 1998, and the committee decided not to make major changes to the C++ standard library; it is precisely for this reason that hash tables (including hash_map) were not included in the standard, even though they deserve a place in it. Although most current compiler platforms support hash tables, for portability's sake it is better not to rely on them.

Heh, let me join in too. 1. Sometimes a vector can replace a map. For example, if the key is an integer, you can define a vector whose length is the span of the keys (see the sketch below). When the data set is large the speed difference is amazing; of course, the wasted space is often just as amazing. 2. Hashing is hard. Without an efficient, low-collision hash function, hash_xxx is meaningless, and no single magic algorithm suits every kind of data set; the function must fit the occasion. One way to search for one is GP, and by that I mean genetic programming, not generic programming; it gives good results.

It is not a good idea to dump millions of records into a vector, because a vector requires contiguous memory and will obviously pay a lot for capacity when such a container is filled. If you use a map, have you thought about what to use as its key? If there is no such requirement, why not consider deque or list? Some say that map uses deque as its underlying container by default, that map is not really a container, and that comparing it with the containers is therefore not very meaningful because its underlying container type can be configured; that actually describes container adapters such as stack and queue (which do default to deque), while std::map is an associative container in its own right.

If memory is not an issue, a vector can beat a map: a map keeps its elements ordered on every insertion, so it is slower than inserting all the elements first and sorting them once at the end. binary_search works on a sorted range, and with random-access iterators it is logarithmic. So, leaving memory aside, a vector can do better than a map. If you need to insert data in the middle, list is the best choice; the insertion cost of a vector will make you suffer. For searching, map is better, because its internal data structure is an rb-tree, whereas with an unsorted vector you can only do a linear search, which is very inefficient. STL also provides hash containers whose lookups are, in theory, fast. For ordered insertion, a vector is a nightmare, a map is guaranteed to keep elements sorted by key, and a list requires you to do the work yourself. A hash container can be searched quickly because it is a key-to-value mapping, but insertion and deletion can be slow because elements may have to be moved when the table rehashes.
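To make point 1 above concrete, here is a tiny sketch of my own (not from the post) of the "vector instead of map" trick for integer keys: the key itself is the index, so lookup is a single array access, and the price is one slot for every possible key in the range. The key range and the empty-string-means-absent convention are arbitrary choices for illustration.

// A direct-indexed vector used as a replacement for map<int, std::string>
// when the keys are small non-negative integers with a known span.
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const int max_key = 10000;                     // known span of the keys
    std::vector<std::string> by_key(max_key + 1);  // one slot per possible key

    by_key[42]  = "answer";                        // "insert" is plain indexing
    by_key[100] = "century";

    int key = 42;
    if (!by_key[key].empty())                      // empty string marks "absent" here
        std::printf("%d -> %s\n", key, by_key[key].c_str());
}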
A list can do chained insertion quickly, but searching it is time-consuming because it requires traversal. You can still use a list and accept the slower search. Or: sort first with a quick sort and then binary-search; the efficiency of that is not bad either.
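The "sort first, then binary-search" suggestion can be sketched as follows. This is my own illustration, assuming all lookups happen only after the data has been loaded; the keys and values are made up.

// Load all (key, value) pairs into a vector, sort them a single time,
// then look keys up with lower_bound in O(log n) on contiguous memory.
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    std::vector<std::pair<int, int>> data;      // (key, value) pairs
    for (int key = 100000; key > 0; --key)      // bulk-load in arbitrary order
        data.emplace_back(key, key * 2);

    std::sort(data.begin(), data.end());        // one O(n log n) sort after loading

    // Logarithmic lookup, comparable to std::map::find.
    auto lookup = [&data](int key) -> const int* {
        auto it = std::lower_bound(
            data.begin(), data.end(), key,
            [](const std::pair<int, int>& p, int k) { return p.first < k; });
        return (it != data.end() && it->first == key) ? &it->second : nullptr;
    };

    if (const int* v = lookup(42))
        std::printf("42 -> %d\n", *v);          // prints "42 -> 84"
}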
From: http://www.cnblogs.com/sharpfeng/archive/2012/09/18/2691096.html