Choosing between the map and hash_map containers in the STL

Source: Internet
Author: User


First, let's look at the analysis by my friend Alvin_lee. I think it is quite correct, and it explains the difference between the two containers from an algorithmic point of view.


In fact, this problem is not unique to C++: the implementation and choice of standard containers has to be considered in every other language too. You may feel it has little impact on application code, but you need to be careful when writing algorithms or core code. While improving some code today, I took the chance to brush up on the fundamentals.

Remember Herb Sutter's very entertaining "C++ Dialogue" series? One installment, "Produce a real hash object", discusses the choice of map. Along the way, let me also share my own understanding from practice.

A map container is chosen to find the associated object quickly from a key. Compared with a linear container such as list, it first simplifies the search code, and second lets any key type serve as the index to the target object, which optimizes the search algorithm. In the C++ STL, map uses a tree for lookup. Its efficiency is almost the same as binary search over a sorted linear container, O(log2 n), but a list is not as easy to customize and operate on as a map.

A hash table computes the table position from the key; hash_map, in contrast to map, uses a hash table to arrange its pairs. When the table size and the hash function are both appropriate, hash table lookup is O(1). That is the ideal case, though: if many keys hash to conflicting table positions, the worst-case complexity is O(n).

So, with this understanding, how should we choose? A couple of days ago I read an article on Python in which someone claimed that Python's map is faster than the C++ map, and so on. What he did not realize is that Python uses a hash map by default, and that such language features are ultimately written in C/C++ anyway. The issue is one of algorithms and methods, not of the merits of the language itself. If you were familiar with the various algorithms and with the details and design ideas of the various languages, would you still be shouting in such extreme terms about what is good or bad? (A one-sided, extreme view of things only shows ignorance; everything has its reason to exist, technology included.) Clearly, the C++ STL implements map with a tree structure by default.

Tree lookup is less efficient than a hash table in total lookup time, but it is stable: its complexity does not fluctuate. For a single lookup you can be sure that even the worst case is no worse than O(log2 n). A hash table is different: a lookup may be O(1), or O(n), or anything in between, and you cannot control which. Suppose you are developing an interface for external callers that performs a key lookup internally but is not called often. Do you want calls that are fast but unpredictable, or calls whose time is average but stable? Conversely, if your program looks up keys very frequently and you want the total time of these operations to be short, then a hash table takes less total time than the alternative, and the average time per operation is short. A balance has to be struck here.

To summarize: the choice between map and hash_map comes down to how many key lookups you perform, and whether you need to guarantee the total lookup time or the time of a single lookup. If there are many operations and overall efficiency matters, use hash_map, whose average processing time is short. If there are only a few operations, hash_map may hit an unpredictable O(n), so use map: its average processing time is somewhat slower but each single lookup is bounded, and with few operations overall stability should count for more than overall efficiency. If, within a single run, a handful of hash_map operations hit the O(n) worst case, the benefit of hash_map is lost.


Let's look at a piece of code from Jay Kint on CodeProject:


// Familiar month example used
// Mandatory contrived example to show a simple point
// Compiled using MinGW gcc 3.2.3 with gcc -c -o file.o
// file.cpp

#include <string>
#include <ext/hash_map>
#include <functional>   // mem_fun
#include <numeric>      // accumulate
#include <iostream>

using namespace std;
// Some STL implementations do not put hash_map in std
using namespace __gnu_cxx;

hash_map<const char*, int> days_in_month;

class MyClass {
    static int totaldaysinyear;
public:
    void add_days(int days) { totaldaysinyear += days; }
    static void printtotaldaysinyear(void)
    {
        cout << "Total days in a year is "
             << totaldaysinyear << endl;
    }
};

int MyClass::totaldaysinyear = 0;

int main(void)
{
    days_in_month["January"] = 31;
    days_in_month["February"] = 28;
    days_in_month["March"] = 31;
    days_in_month["April"] = 30;
    days_in_month["May"] = 31;
    days_in_month["June"] = 30;
    days_in_month["July"] = 31;
    days_in_month["August"] = 31;
    days_in_month["September"] = 30;
    days_in_month["October"] = 31;
    days_in_month["November"] = 30;
    days_in_month["December"] = 31;

    // ERROR: this line doesn't compile.
    accumulate(days_in_month.begin(), days_in_month.end(),
               mem_fun(&MyClass::add_days));

    MyClass::printtotaldaysinyear();

    return 0;
}

Of course, the above code can be fully implemented using STL:


Quote:

Standard C++ Solutions
The STL (in its SGI implementation; these are extensions rather than part of ISO C++) defines certain function adaptors, select1st, select2nd and compose1, that can be used to call a single-parameter function with either the key or the data element of a pair associative container.

select1st and select2nd do pretty much what their respective names say they do. They return either the first or second parameter from a pair.

compose1 allows the use of functional composition, such that the return value of one function can be used as the argument to another. compose1(f, g) is the same as f(g(x)).

Using these function adaptors, we can use for_each to call our function.

hash_map<Key, MyType> my_map;   // Key stands for whatever key type the container uses
for_each(my_map.begin(), my_map.end(),
         compose1(mem_fun(&MyType::do_something),
                  select2nd<hash_map<Key, MyType>::value_type>()));

Certainly, this is much better than having to define helper functions for each pair, but it still seems a bit cumbersome, especially when compared with the clarity that a comparable for loop has.

for (hash_map<Key, MyType>::iterator i = my_map.begin();
     i != my_map.end(); ++i) {

    i->second.do_something();
}

Considering that avoiding the for loop for clarity's sake is what inspired the use of the STL algorithms in the first place, it doesn't help the case of algorithms vs. hand-written loops that the for loop is clearer and more concise.

with_data and with_key
with_data and with_key are function adaptors that strive for clarity while allowing the easy use of the STL algorithms with pair associative containers. They have been parameterized in much the same way as mem_fun. This isn't exactly rocket science, but it's easy to see that they are much cleaner than the standard function adaptor expansion using compose1 and select2nd.

Using with_data and with_key, any function can be called and will use the data_type or key_type as the function's argument, respectively. This allows hash_map, map, and any other pair associative containers in the STL to be used easily with the standard algorithms. It is even possible to use them with other function adaptors, such as mem_fun.

hash_map<Key, IDirect3DVertexBuffer9*> my_vert_buffers;   // Key stands for the container's key type

void ReleaseBuffers(void)
{
    // Release the vertex buffers created so far.
    std::for_each(my_vert_buffers.begin(),
                  my_vert_buffers.end(),
                  with_data(boost::mem_fn(
                      &IDirect3DVertexBuffer9::Release)));
}

Here boost::mem_fn is used instead of mem_fun since it recognizes the __stdcall methods used by COM, if the BOOST_MEM_FN_ENABLE_STDCALL macro is defined.

Here is also a real-world example.
The link is:
http://blog.sina.com.cn/u/4755b4ee010004hm

Excerpts are as follows:


Quote:

I had been using the STL map all along, until recently the data volume in our library increased sharply. Colleagues working on search mentioned hash_map, and I had been planning to switch. Today I ran an experiment to test hash_map's functionality, behavior, and performance compared with map.
The first thing to say is that both data structures store and look up key-value pairs, but their implementations differ. map uses a red-black tree, with O(log n) lookup time, while hash_map uses a hash table. Its lookup time can in theory be constant, at the cost of higher memory consumption: it trades space for time.
As far as usage goes, map is already part of the STL standard library, whereas hash_map has not yet entered the standard library, but it is still a very common and very important library.
This test measures deduplication performance on a list of more than one million file names, that is, with the file name string as the map key.
Header files used:


#include <time.h>        // for timing
#include <ext/hash_map>  // header that provides hash_map
#include <map>           // STL map
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
using namespace std;        // std namespace
using namespace __gnu_cxx;  // hash_map lives in the __gnu_cxx namespace

The test has three parts: the efficiency of map, the efficiency of hash_map with the system hash function, and the efficiency of hash_map with a hand-written hash function.

struct str_hash {      // hand-written hash function
    size_t operator()(const string& str) const
    {
        unsigned long __h = 0;
        for (size_t i = 0; i < str.size(); i++)
        {
            __h = 107 * __h + str[i];
        }
        return size_t(__h);
    }
};

// struct str_hash {   // string hash using the built-in hash function
//     size_t operator()(const string& str) const
//     {
//         return __stl_hash_string(str.c_str());
//     }
// };

struct str_equal {     // string equality predicate
    bool operator()(const string& s1, const string& s2) const
    {
        return s1 == s2;
    }
};

Usage:
int main(void)
{
    vector<string> filtered_list;
    hash_map<string, int, str_hash, str_equal> file_map;
    map<string, int> file2_map;
    ifstream in("/dev/shm/list");
    time_t now1 = time(NULL);
    struct tm* curtime;
    curtime = localtime(&now1);
    cout << now1 << endl;
    char ctemp[20];
    strftime(ctemp, 20, "%Y-%m-%d %H:%M:%S", curtime);
    cout << ctemp << endl;
    string temp;
    int i = 0;
    if (!in)
    {
        cout << "open failed!~" << endl;
    }
    while (in >> temp)
    {
        string sub = temp.substr(0, 65);
        if (file_map.find(sub) == file_map.end())
        // if (file2_map.find(sub) == file2_map.end())   // map variant of the test
        {
            file_map[sub] = i;
            // file2_map[sub] = i;
            filtered_list.push_back(temp);
            i++;
            cout << sub << endl;
        }
    }
    in.close();
    cout << "the total unique file number is:" << i << endl;
    ofstream out("./file_list");
    if (!out)
    {
        cout << "failed open" << endl;
    }
    for (size_t j = 0; j < filtered_list.size(); j++)
    {
        out << filtered_list[j] << endl;
    }
    time_t now2 = time(NULL);
    cout << now2 << endl;
    curtime = localtime(&now2);
    strftime(ctemp, 20, "%Y-%m-%d %H:%M:%S", curtime);
    cout << now2 - now1 << "\t" << ctemp << endl;
    return 0;
}


Quote:

The conclusion (the file list has 1.06 million entries, 510 thousand after deduplication):
1. map: 34 seconds to complete
2. hash_map with the system's built-in hash function: 22 seconds
3. hash_map with the hand-written hash function: 14 seconds
The test results clearly illustrate hash_map's advantage over map. In addition, different hash functions yield different performance gains; the hash function above is an empirical one that a classmate arrived at after testing on a large amount of data.
It can be foreseen that the larger the data volume, the more hash_map's advantage will show!~

Of course, this last author's conclusion is wrong, and so is his understanding of how hash_map works! The first friend's analysis above makes the problem clear.

Finally, C++Builder users need to add the following line:
#include "stlport\hash_map"
before hash_map can be used correctly.

This article is from the CSDN blog; when reposting, please credit the source: http://blog.csdn.net/skyremember/archive/2008/09/18/2941076.aspx
