I have been rushing to finish a project on certificate holder transactions. It involves a lot of operations such as data lookups, and I started out using map. A colleague then suggested hash_map, so I looked into how hash_map is implemented internally and am writing it down here as well.
hash_map is built on a hash table. The biggest advantage of a hash table is that storing and looking up data takes very little time, close to constant time. The price is higher memory consumption, but with memory becoming ever more plentiful, trading space for time is usually worthwhile. Hash tables are also easy to code, which is another point in their favor.
The basic principle is to use an array with a large index range to store the elements. You design a function (the hash function) that maps the key of each element to a function value (an array index, also called the hash value), and the array slot at that index stores the element. Put simply, each element is "classified" by its key and then stored in the bucket for its "class". However, there is no guarantee that keys and function values correspond one to one, so different elements may well produce the same function value. This is a "conflict" (collision): different elements are assigned to the same "class". In short, "direct addressing" and "conflict resolution" are the two defining features of a hash table.
hash_map first allocates a large block of memory to form many buckets, then uses the hash function to map keys to different buckets. The insert process is as follows:
1. Take the key.
2. Compute the hash value with the hash function.
3. Derive the bucket number from the hash value (usually the hash value modulo the number of buckets).
4. Store the key and value in that bucket.
The lookup process is as follows:
1. Take the key.
2. Compute the hash value with the hash function.
3. Derive the bucket number from the hash value (usually the hash value modulo the number of buckets).
4. Compare the key with the elements inside the bucket. If none is equal, the key is not found.
5. Otherwise, return the value of the matching record.
In hash_map, direct addressing is provided by the hash function, and conflicts are resolved by the comparison function. If each bucket holds only one element, a lookup needs only one comparison. When many buckets are empty, many lookups finish even faster (the key is simply not found). So the parts of a hash table that depend on the user are the hash function and the comparison function, and these are exactly the two parameters we need to specify when using hash_map. A simple example of using hash_map:
    // Pick the header that provides hash_map for the compiler in use.
    #if defined(__GNUC__)
    #if __GNUC__ < 3 && __GNUC__ >= 2 && __GNUC_MINOR__ >= 95
    #include <hash_map>
    #elif __GNUC__ >= 3
    #include <ext/hash_map>
    using namespace __gnu_cxx;   // hash_map and __stl_hash_string live here
    #else
    #include <hash_map.h>
    #endif
    #elif defined(_MSC_VER)      // MSVC predefines _MSC_VER; 1300 is Visual C++ 7.0
    #if _MSC_VER >= 1300
    #include <hash_map>
    #else
    #error "std::hash_map is not available with this compiler"
    #endif
    #elif defined(__sgi__)
    #include <hash_map>
    #else
    #error "std::hash_map is not available with this compiler"
    #endif
    #include <string>
    #include <iostream>
    #include <algorithm>
    using namespace std;

    // Hash function: maps a string key to a hash value.
    struct str_hash {
        size_t operator()(const string& str) const
        {
            return __stl_hash_string(str.c_str());
        }
    };

    // Comparison function: decides whether two keys in the same bucket are equal.
    struct str_equal {
        bool operator()(const string& s1, const string& s2) const
        {
            return s1 == s2;
        }
    };

    int main(int argc, char *argv[])
    {
        hash_map<string, string, str_hash, str_equal> mymap;
        mymap.insert(pair<string, string>("hcq", "20"));
        mymap["sgx"] = "24";
        mymap["sb"] = "23";
        cout << mymap["sb"] << endl;
        // find() returns end() when the key is absent.
        if (mymap.find("hcq") != mymap.end())
            cout << mymap["hcq"] << endl;
        return 0;
    }
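To make the insert and lookup steps described before the example more concrete, here is a rough sketch of a toy hash table that resolves conflicts by keeping a list in each bucket. It is only an illustration, not how hash_map is actually implemented: the names ToyHashMap, ToyStringHash and bucket_index are made up for this sketch, the hash function is an arbitrary simple one, and a C++11 compiler is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hash function for this sketch: a simple multiplicative string hash.
struct ToyStringHash {
    std::size_t operator()(const std::string& s) const {
        std::size_t h = 0;
        for (unsigned char c : s)
            h = 5 * h + c;
        return h;
    }
};

// Toy hash table: a vector of buckets, each bucket a list of (key, value) pairs.
class ToyHashMap {
    typedef std::list<std::pair<std::string, std::string> > Bucket;
    std::vector<Bucket> buckets_;
    ToyStringHash hash_;

public:
    explicit ToyHashMap(std::size_t bucket_count = 8) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Bucket number = hash value modulo the number of buckets.
    std::size_t bucket_index(const std::string& key) const {
        return hash_(key) % buckets_.size();
    }

    // Insert: key -> hash value -> bucket number -> store in that bucket.
    // If the key is already present, its value is overwritten.
    void insert(const std::string& key, const std::string& value) {
        Bucket& bucket = buckets_[bucket_index(key)];
        for (auto& entry : bucket) {
            if (entry.first == key) {   // the "comparison function"
                entry.second = value;
                return;
            }
        }
        bucket.push_back(std::make_pair(key, value));
    }

    // Lookup: key -> hash value -> bucket number -> compare with each element
    // in the bucket. Returns nullptr when no element has an equal key.
    const std::string* find(const std::string& key) const {
        const Bucket& bucket = buckets_[bucket_index(key)];
        for (const auto& entry : bucket) {
            if (entry.first == key)
                return &entry.second;
        }
        return nullptr;
    }
};
```

Constructing it as ToyHashMap m(1) forces every key into the same bucket, which makes the conflict case easy to observe: find() then has to compare against each stored key in turn, which is why a good hash function and enough buckets matter.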
Differences between hash_map and map:
1. Constructor. hash_map requires a hash function and an equality function; map only needs a comparison function (the less-than function).
2. Storage structure. hash_map stores its data in a hash table, while map is generally implemented with a red-black tree (RB tree). The in-memory data structures are therefore different.
In general, lookups in hash_map are faster than in map: lookup time is essentially constant regardless of the amount of data, whereas map lookups are O(log n). But hash_map is not always faster. A constant is not necessarily smaller than log n, and computing the hash function takes time too. So if efficiency matters, especially once the number of elements reaches a certain order of magnitude, consider hash_map. If you are strict about memory usage and want the program to use as little memory as possible, be careful: hash_map may let you down, especially when you have many hash_map objects and cannot keep their memory use under control. hash_map is also slow to construct. So how do you choose? Weigh three factors: lookup speed, data volume, and memory usage. Haha, enough talk, or the boss will notice I am not working and that would be trouble... Back to work!
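A small postscript on difference 1 above. The sketch below spells out the template parameters of an ordered map and a hash-based map side by side. It uses std::unordered_map, the hash-based container standardized in C++11, in place of the non-standard hash_map, so it assumes a C++11 compiler. The type names OrderedAges and HashedAges and the helpers make_sample and lookup_or are made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Ordered map: only a "less than" comparison is needed (std::less is the default).
typedef std::map<std::string, std::string,
                 std::less<std::string> > OrderedAges;

// Hash-based map: needs a hash function and an equality function
// (std::hash and std::equal_to are the defaults).
typedef std::unordered_map<std::string, std::string,
                           std::hash<std::string>,
                           std::equal_to<std::string> > HashedAges;

// Hypothetical helper: builds either container with the same sample data.
template <typename Map>
Map make_sample() {
    Map m;
    m["hcq"] = "20";
    m["sgx"] = "24";
    m["sb"] = "23";
    return m;
}

// Hypothetical helper: looks up a key, returning a fallback when it is absent.
template <typename Map>
std::string lookup_or(const Map& m, const std::string& key,
                      const std::string& fallback) {
    typename Map::const_iterator it = m.find(key);
    return it == m.end() ? fallback : it->second;
}
```

Both containers answer lookup_or the same way. The visible difference is that iterating the ordered map visits keys in sorted order, while the hash-based map makes no promise about order.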