1. Set and Multiset
Features of Set:
(1) All elements are automatically sorted based on the key value of the element.
(2) In a set, an element's key value is its real value and vice versa; two elements are not allowed to have the same value.
(3) The value of an element cannot be changed through a set iterator, because the element value is the key value, and changing the key would violate the ordering rules of the container.
(4) After the client inserts into or erases from a set, all previously obtained iterators remain valid. The only exception is an iterator to the element that was erased.
(5) Its underlying mechanism is the RB-tree. Almost every set operation simply forwards to the corresponding RB-tree operation.
Multiset and set are almost the same; the only difference is that multiset allows duplicate key values. Set therefore inserts with the underlying RB-tree's insert_unique(), whereas multiset inserts with the RB-tree's insert_equal() rather than insert_unique().
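The difference can be observed through the public interface alone. The following is a minimal sketch using only the standard std::set and std::multiset; the helper names count_after_inserts and inserted_into_set are made up for this illustration and are not part of the STL.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: insert `value` into a fresh container `times` times
// and report how many copies the container ends up holding.
template <typename Container>
std::size_t count_after_inserts(int value, int times) {
    Container c;
    for (int i = 0; i < times; ++i)
        c.insert(value);   // set rejects duplicates, multiset keeps them
    return c.count(value);
}

// std::set::insert returns pair<iterator, bool>;
// `second` is false when the value was already present.
inline bool inserted_into_set(std::set<int>& s, int value) {
    return s.insert(value).second;
}
```

With a set, inserting 3 three times leaves one copy; with a multiset it leaves three.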
Test Example:
    #include <set>
    #include <iostream>
    using namespace std;

    int main() {
        int ia[5] = {0, 1, 2, 3, 4};
        int n = sizeof(ia) / sizeof(ia[0]);
        cout << n << endl;                        // 5
        set<int> iset(ia, ia + n);
        cout << "size=" << iset.size() << endl;   // size=5
        cout << iset.count(3) << endl;            // 1
        iset.insert(3);                           // no effect: duplicates are not allowed
        cout << "size=" << iset.size() << endl;   // size=5
        cout << iset.count(3) << endl;            // 1 (still a single copy)
        iset.erase(1);
        cout << iset.count(1) << endl;            // 0
        set<int>::iterator ite1 = iset.begin();
        set<int>::iterator ite2 = iset.end();
        for (; ite1 != ite2; ++ite1)
            cout << *ite1 << " ";
        cout << endl;                             // 0 2 3 4
        // For associative containers, search with the container's own find()
        // member. It is more efficient than the STL algorithm find(), which
        // only performs a sequential search.
        ite1 = iset.find(3);
        if (ite1 != iset.end())
            cout << "3 found!" << endl;
        else
            cout << "3 not found!" << endl;
        // Changing a set element through an iterator is not allowed.
        // *ite1 = 9;   // error!
        return 0;
    }
2. Map and Multimap
Features of Map:
(1) All elements are automatically sorted based on the key value of the element.
(2) Every element of a map is a pair: the first member is the key value, the second is the real value.
(3) Map does not allow two elements to have the same key value.
(4) You can change an element's real value through a map iterator, but you cannot change its key value, because that would violate the ordering rules of the container.
(5) After the client inserts into or erases from a map, all previously obtained iterators remain valid. The only exception is an iterator to the element that was erased.
(6) Its underlying mechanism is the RB-tree. Almost every map operation simply forwards to the corresponding RB-tree operation.
Multimap and map are almost the same; the only difference is that multimap allows duplicate key values. Map therefore inserts with the underlying RB-tree's insert_unique(), whereas multimap inserts with the RB-tree's insert_equal() rather than insert_unique().
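Features (3) and (4) and the multimap difference can be illustrated with a short sketch. It uses only the standard std::map and std::multimap; the function names map_demo and multimap_demo are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical demo: returns the value stored under "a" after a rejected
// duplicate insert and an in-place update through an iterator.
inline int map_demo() {
    std::map<std::string, int> m;
    m.insert(std::make_pair(std::string("a"), 1));
    m.insert(std::make_pair(std::string("a"), 2));  // rejected: key "a" exists
    std::map<std::string, int>::iterator it = m.find("a");
    it->second += 10;       // the real value may be modified via the iterator
    // it->first = "b";     // would not compile: the key is const
    return it->second;
}

// Hypothetical demo: a multimap keeps both elements with key "a".
inline std::size_t multimap_demo() {
    std::multimap<std::string, int> mm;
    mm.insert(std::make_pair(std::string("a"), 1));
    mm.insert(std::make_pair(std::string("a"), 2));
    return mm.count("a");
}
```

Here map_demo() yields 11 (the second insert is ignored, then 1 + 10), and multimap_demo() yields 2.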
3. Hashtable
1. Hashtable overview
A hashtable provides access and deletion for any named item. Its purpose is to offer the basic operations in constant time. That performance is a statistical (average-case) result, and it does not rely on the randomness of the inserted elements.
Hash function: responsible for mapping an element to an "index of acceptable size". In short, it maps large numbers to small ones.
Problem with hash functions: different elements may be mapped to the same location (the same index). This is called a collision. Ways to resolve collisions include linear probing, quadratic probing, separate chaining, and others.
The method used by the STL hashtable is separate chaining.
(1) Linear probing: when the hash function computes the insertion position of an element and that slot is already taken, what should be done? The simplest approach is to search onward sequentially until a free slot is found.
Two assumptions are required: a. the table is large enough; b. the elements are independent of one another.
Linear probing can cause primary clustering: the average insertion cost grows far faster than the load factor does.
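The probing loop can be sketched as follows. This is only an illustration under stated assumptions: keys are non-negative ints, the "hash function" is simply key mod table size, and the names kEmptySlot and linear_probe_insert are invented here. SGI STL does not use this scheme.

```cpp
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // assumption: keys are non-negative, -1 marks a free slot

// Stores `key` in the first free slot at or after its home position H.
// Returns the slot index, or -1 if the table has no free slot.
inline int linear_probe_insert(std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    if (n == 0) return -1;
    const std::size_t h = static_cast<std::size_t>(key) % n;  // toy hash function
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t pos = (h + i) % n;   // H, H+1, H+2, ... with wrap-around
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return static_cast<int>(pos);
        }
    }
    return -1;  // every slot was probed and none was free
}
```

In a table of 7 slots, inserting 3, 10 and 17 (all with home position 3) fills slots 3, 4 and 5. A later insert of 4, whose home position is 4, has to walk past that run and lands in slot 6, which is the clustering effect described above.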
(2) Quadratic probing: mainly used to solve the primary clustering problem. The collision-resolution function is F(i) = i^2. If the hash function computes position H for a new element and that position is already in use, the positions H+1^2, H+2^2, H+3^2, H+4^2, ..., H+i^2 are tried in turn, instead of the H+1, H+2, H+3, H+4, ..., H+i tried by linear probing.
Quadratic probing eliminates primary clustering but may cause secondary clustering: if two elements hash to the same position, they probe exactly the same sequence of positions on insertion, which is a form of waste. Secondary clustering can be eliminated with methods such as double hashing.
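A minimal sketch of the H + i^2 probe sequence, under the same kind of assumptions as any toy example: non-negative int keys, key mod table size as the hash function, and the invented names kFreeSlot and quadratic_probe_insert.

```cpp
#include <cstddef>
#include <vector>

const int kFreeSlot = -1;  // assumption: keys are non-negative, -1 marks a free slot

// Tries positions H, H+1^2, H+2^2, ... (mod table size).
// Returns the slot used, or -1 if no free slot was found within n attempts
// (quadratic probing can fail even when the table is not completely full).
inline int quadratic_probe_insert(std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    if (n == 0) return -1;
    const std::size_t h = static_cast<std::size_t>(key) % n;  // toy hash function
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t pos = (h + i * i) % n;
        if (table[pos] == kFreeSlot) {
            table[pos] = key;
            return static_cast<int>(pos);
        }
    }
    return -1;
}
```

In a table of 11 slots, the keys 3, 14, 25 and 36 all have home position 3 and end up in slots 3, 4, 7 and 1. Each of them retraces the same probe sequence before finding its slot, which is the secondary clustering mentioned above.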
(3) Separate chaining: each table slot maintains a list. The hash function selects one of the lists, and insertion, search, and deletion are then performed on that list. As long as the lists are short enough, the operations remain fast.
With separate chaining, the load factor of the table can be greater than 1. The SGI STL hashtable uses separate chaining.
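As a rough sketch of the idea, the following toy table keeps one std::list per slot. The type ChainedTable and its members are invented for this illustration, and key mod bucket count stands in for a real hash function. As noted later, the real SGI hashtable maintains its own nodes rather than using std::list.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical toy table: one list per slot, no resizing.
// Assumption: constructed with at least one bucket.
struct ChainedTable {
    std::vector<std::list<int> > buckets;
    std::size_t count;

    explicit ChainedTable(std::size_t n) : buckets(n), count(0) {}

    std::size_t index_of(int key) const {
        return static_cast<std::size_t>(key) % buckets.size();  // toy hash function
    }
    void insert(int key) {
        buckets[index_of(key)].push_front(key);
        ++count;
    }
    bool contains(int key) const {
        const std::list<int>& chain = buckets[index_of(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }
    // Elements per bucket; may exceed 1 because each bucket is a list.
    double load_factor() const {
        return static_cast<double>(count) / buckets.size();
    }
};
```

Inserting seven keys into a table of three buckets gives a load factor above 2, yet every key can still be found by scanning a single short list.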
2. Hashtable structure
Each cell of the hashtable holds not just a single node (element) but possibly a whole "bucket" of nodes, hence the name buckets.
In SGI STL, the hash table handles collisions with separate chaining, which gives it the following structure.
The linked lists hanging off the buckets do not use the STL list or slist; the hashtable maintains its own hash table nodes. The collection of buckets is stored in a vector so that it can grow dynamically.
The hashtable iterator in STL is a forward iterator that supports only ++. It holds a pointer to the current node and a pointer that keeps it connected to the buckets vector. There is no decrement operation, so there is no reverse iterator either.
The hashtable sizes its table with prime numbers. 28 primes are precomputed and kept ready, each roughly twice the previous one, and a function is provided that looks up, among these 28 primes, the one closest to and no smaller than a given number. That prime is used as the length of the vector; if the table has to be reallocated, a vector whose length is the next prime is allocated.
The STL hashtable grows its table when the number of elements is greater than or equal to the size of the table. (This condition is presumably there to keep the operations constant-time on a statistical basis.)
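The lookup can be sketched with std::lower_bound over a sorted array. The table below is deliberately shortened to ten entries, written from memory of the SGI list, so treat the values as illustrative; the names kPrimes, kNumPrimes and next_prime are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstddef>

// Shortened, illustrative table: ten primes, each roughly twice the previous.
// The real table has 28 entries.
static const unsigned long kPrimes[] = {
    53ul, 97ul, 193ul, 389ul, 769ul,
    1543ul, 3079ul, 6151ul, 12289ul, 24593ul
};
static const std::size_t kNumPrimes = sizeof(kPrimes) / sizeof(kPrimes[0]);

// Returns the smallest prime in the table that is not less than n,
// or the largest prime in the table if n exceeds all of them.
inline unsigned long next_prime(unsigned long n) {
    const unsigned long* first = kPrimes;
    const unsigned long* last = kPrimes + kNumPrimes;
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}
```

For example, next_prime(54) yields 97 and next_prime(53) yields 53, since lower_bound returns the first entry that is not less than its argument.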
Insertion comes in two forms, insert_unique and insert_equal. The former guarantees that no duplicate key is inserted; the latter allows elements with equal keys. Choose whichever suits the use case.
insert_unique: calls resize first to check whether the buckets vector needs to grow, then inserts. The bucket index is obtained by taking the remainder of the hash value divided by the vector size.
resize: if the number of elements, counting the one about to be inserted, is greater than the size of the vector, a new vector sized by the next prime is allocated. Every element in the old space is rehashed and moved to the new space, and finally the old and new spaces are swapped.
insert_equal: also calls resize first. It then walks the bucket's list; if it finds a node with an equal key, the new node is inserted immediately after that node, otherwise it is inserted at the head of the list.
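The three operations can be sketched on a toy table of int values. This is an illustration of the steps described above, not the SGI source: the names Node and TinyHashtable are invented, the value itself serves as the hash, and 2 * old_size + 1 is used as a stand-in for "the next prime".

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int val;
    Node* next;
};

class TinyHashtable {
public:
    explicit TinyHashtable(std::size_t n) : buckets_(n, nullptr), size_(0) {}
    TinyHashtable(const TinyHashtable&) = delete;
    TinyHashtable& operator=(const TinyHashtable&) = delete;
    ~TinyHashtable() {
        for (std::size_t b = 0; b < buckets_.size(); ++b)
            while (Node* cur = buckets_[b]) {
                buckets_[b] = cur->next;
                delete cur;
            }
    }

    // Returns false (and inserts nothing) if an equal value already exists.
    bool insert_unique(int v) {
        resize(size_ + 1);
        const std::size_t b = bucket_of(v, buckets_.size());
        for (Node* cur = buckets_[b]; cur; cur = cur->next)
            if (cur->val == v) return false;
        push_head(b, v);
        return true;
    }

    // Always inserts; an equal value is linked right after its twin.
    void insert_equal(int v) {
        resize(size_ + 1);
        const std::size_t b = bucket_of(v, buckets_.size());
        for (Node* cur = buckets_[b]; cur; cur = cur->next)
            if (cur->val == v) {
                cur->next = new Node{v, cur->next};
                ++size_;
                return;
            }
        push_head(b, v);
    }

    std::size_t size() const { return size_; }
    std::size_t bucket_count() const { return buckets_.size(); }
    std::size_t count(int v) const {
        if (buckets_.empty()) return 0;
        std::size_t c = 0;
        for (const Node* cur = buckets_[bucket_of(v, buckets_.size())]; cur; cur = cur->next)
            if (cur->val == v) ++c;
        return c;
    }

private:
    static std::size_t bucket_of(int v, std::size_t n) {
        return static_cast<std::size_t>(v) % n;   // index by remainder
    }
    void push_head(std::size_t b, int v) {
        buckets_[b] = new Node{v, buckets_[b]};
        ++size_;
    }
    // Grows only when the hinted element count exceeds the bucket count.
    void resize(std::size_t hint) {
        const std::size_t old_n = buckets_.size();
        if (hint <= old_n) return;
        const std::size_t new_n = 2 * old_n + 1;  // assumption: stand-in for the next prime
        std::vector<Node*> tmp(new_n, nullptr);
        for (std::size_t b = 0; b < old_n; ++b)
            while (Node* first = buckets_[b]) {   // relink each node into its new bucket
                const std::size_t nb = bucket_of(first->val, new_n);
                buckets_[b] = first->next;
                first->next = tmp[nb];
                tmp[nb] = first;
            }
        buckets_.swap(tmp);                       // old and new spaces are swapped
    }

    std::vector<Node*> buckets_;
    std::size_t size_;
};
```

Starting from 3 buckets, the table stays at 3 buckets for the first three elements and grows to 7 when a fourth is inserted, matching the "greater than or equal to the table size" trigger. Nodes are relinked during the rehash rather than copied.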
The hashtable cannot handle some types out of the box, such as string, double, and float, unless the user writes a corresponding hash function for those types.
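Supplying a hash function for such a type might look like the sketch below. It assumes a C++11 compiler and uses std::unordered_set as a stand-in for hash_set, since the SGI containers are not part of the standard library. The functor name StringHash is invented, and its formula (h = 5*h + character) is modeled on the simple string hash SGI uses for C strings.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical user-written hash functor for std::string.
struct StringHash {
    std::size_t operator()(const std::string& s) const {
        unsigned long h = 0;
        for (std::string::size_type i = 0; i < s.size(); ++i)
            h = 5 * h + static_cast<unsigned char>(s[i]);
        return static_cast<std::size_t>(h);
    }
};

// The functor is passed as the container's hash template argument.
typedef std::unordered_set<std::string, StringHash> StringSet;
```

With this functor, the string "ab" hashes to 5 * 97 + 98 = 583, and a StringSet rejects duplicate strings just like any other unique hashed container.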
4. hash_set, hash_map, hash_multiset, hash_multimap
These containers correspond one-to-one to the containers described earlier, except that they use the hashtable as their underlying mechanism.
The underlying mechanism determines the difference between these two sets of containers:
The Rb-tree group sorts the elements, while the Hashtable group does not;
The search time complexity of the RB-tree group is O(log N), while that of the hashtable group is constant time on average;
The Rb-tree group will not waste nodes in space utilization, while the Hashtable group may have some empty buckets.
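The ordering difference is easy to demonstrate. The sketch below assumes a C++11 compiler and uses std::unordered_set as the standard counterpart of hash_set; the helper names sorted_via_set and found_in_hash are invented.

```cpp
#include <set>
#include <unordered_set>
#include <vector>

// RB-tree group: iterating a std::set always yields the elements in sorted order.
inline std::vector<int> sorted_via_set(const std::vector<int>& in) {
    std::set<int> s(in.begin(), in.end());
    return std::vector<int>(s.begin(), s.end());
}

// Hashtable group: lookup works the same way, but iteration order is unspecified,
// so only membership is checked here.
inline bool found_in_hash(const std::vector<int>& in, int key) {
    std::unordered_set<int> h(in.begin(), in.end());
    return h.find(key) != h.end();
}
```

Feeding the values 4, 1, 3, 1 through sorted_via_set gives 1, 3, 4, while the hashed container makes no promise about the order in which it would enumerate them.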
hash_multiset and hash_multimap insert with insert_equal. Their other operations are the same as those of multiset and multimap.
"STL Source Analysis" Study notes-5th Chapter associative container (ii)