CMap Principle and Its Implementation in MFC (the map template does not support ordered traversal)

Source: Internet
Author: User
Tags map class

Once you know how to use CMap, the other map classes are easy to understand.




The mapping table class (CMap) is a template class among the MFC collection classes, also known as a "dictionary". It is like a table with only two columns: one column holds the keys and the other holds the data items, and they correspond one to one. Keys are unique. Given a key, the map quickly finds the corresponding data item. Lookups go through a hash table, so searching for a value in the map is very fast. For example, every employee in a company has an employee ID and a name, and the employee ID is the key for the name. Given an employee ID, you can quickly find the corresponding name. Map classes are best suited to scenarios that need fast retrieval by key.
Common map classes:

CMapWordToPtr: stores void pointers; the key is a WORD.
CMapPtrToWord: stores WORDs; the key is a void pointer.
CMapPtrToPtr: stores void pointers; the key is another void pointer.
CMapWordToOb: stores CObject pointers; the key is a WORD.
CMapStringToOb: stores CObject pointers; the key is a string.
CMapStringToPtr: stores void pointers; the key is a string.
CMapStringToString: stores strings; the key is a string.

I. Basic knowledge of Map

A map, also known as a dictionary, is a collection of elements, each consisting of a key and a value.


Generally, given a key, a map can quickly retrieve the corresponding element from the collection. When a large amount of data has to be searched and lookup performance matters, a map is therefore an ideal container. For example, MFC itself uses maps to implement handle maps and some other internal data structures. In addition, public map classes are provided. With these public map classes, MFC programmers can easily implement custom mappings to suit their needs.

Generally, when a map object is deleted, or when its elements are removed, both the keys and the element values are destroyed along with it.

From a data-structure perspective, typical map operations include:

1. Insert an element with a given key into the map.

2. Search the map for the element with a given key.

3. Delete the element with a given key from the map.

4. Enumerate (traverse) all elements in the map.

The various map implementations in MFC all provide member functions for the preceding operations. For convenience, we will use CMap as the representative in the discussion below.

Once you have inserted a key-value pair into the map, you can access the map by key. In this way you can efficiently search for, add, or delete elements, or traverse all elements in the map.


In addition to access by key, CMap in MFC also offers a different type, POSITION, as an auxiliary way to access elements. A POSITION can be used to "remember" an element or to enumerate the map. You might think that traversal by POSITION is equivalent to traversal by key, but this is not the case. To be exact, the equivalence of the two kinds of search is not guaranteed.

MFC provides the template-based CMap class. The CMap template class can be used with specific data types, such as custom classes or structs. In addition, MFC provides non-template classes for fixed data types, including:

Class name            Key type        Value type
CMapWordToPtr         WORD            void pointer
CMapPtrToWord         void pointer    WORD
CMapPtrToPtr          void pointer    void pointer
CMapWordToOb          WORD            CObject pointer
CMapStringToOb        string          CObject pointer
CMapStringToPtr       string          void pointer
CMapStringToString    string          string

II. How map works

The biggest advantage of a map is its excellent performance for fast lookups. The key to optimal performance is to minimize the number of element checks (comparisons) needed during a search. Sequential search performs worst: finding an element in a map containing n elements with a sequential search algorithm may (in the worst case) require n separate comparisons.

Binary search (half-interval search) performs better, but it has a drawback that cannot be ignored: it requires the sequence being searched to be sorted, which reduces the flexibility of map operations. Ideally, the best algorithm would, regardless of the number of elements or the order in which they are stored, go straight to the matching element through a simple calculation, with no extra comparison operations. This may sound a bit mysterious, but such an algorithm really is possible (and, I believe, a map can do it).
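To make the comparison counts concrete, here is a small standalone sketch in standard C++ (not MFC code). The function names SequentialSearchComparisons and BinarySearchProbes are hypothetical helpers introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Count the element comparisons a sequential search makes before it stops.
std::size_t SequentialSearchComparisons(const std::vector<int>& data, int target)
{
    std::size_t comparisons = 0;
    for (int item : data) {
        ++comparisons;
        if (item == target)
            break;
    }
    return comparisons;
}

// Count the probes a binary search makes; the input must be sorted ascending.
std::size_t BinarySearchProbes(const std::vector<int>& sorted, int target)
{
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    std::size_t probes = 0, lo = 0, hi = sorted.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++probes;
        if (sorted[mid] == target)
            break;
        if (sorted[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return probes;
}
```

For 1000 sorted values, looking for the last one costs 1000 comparisons sequentially but only a handful of probes with binary search. A hash table aims to do better still by computing the location directly.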

In CMap and the related MFC map classes, as long as the map is set up correctly, the Lookup function can usually find any element in a single step; it rarely needs two or more comparisons.

How is this efficient search implemented?

This document uses the CMap template class in MFC as an example. After a map is created (usually just before the first element is inserted), the system allocates memory for a hash table, which is an array of pointers to CAssoc structs. MFC uses the CAssoc struct to describe a combination of a key and its value.

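The CAssoc struct is described as follows:

struct CAssoc
{
    CAssoc* pNext;      // next CAssoc in the same hash table slot
    UINT nHashValue;    // hash value computed from the key
    CString key;        // shown with CString, as in CMapStringToString;
    CString value;      // the CMap template uses KEY and VALUE here
};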


Whenever a key-value pair is added to the map, a new CAssoc struct is created and a hash value is computed from the actual value of the key. A pointer to the CAssoc struct is then stored in the hash table at index i, which is calculated as follows:

i = nHashValue % nHashTableSize

In this formula, nHashValue is the hash value computed from the key, and nHashTableSize is the number of slots in the hash table (17 by default).
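The index formula can be tried out on its own. The following is a minimal standalone sketch in standard C++; BucketIndex and kDefaultHashTableSize are hypothetical names used only for this illustration, not MFC identifiers.

```cpp
#include <cassert>
#include <cstddef>

// Default number of slots in the hash table, as stated in the text above.
const std::size_t kDefaultHashTableSize = 17;

// Map a hash value to a slot index: i = nHashValue % nHashTableSize.
std::size_t BucketIndex(unsigned int nHashValue,
                        std::size_t nHashTableSize = kDefaultHashTableSize)
{
    assert(nHashTableSize > 0);  // modulo by zero is undefined
    return nHashValue % nHashTableSize;
}
```

With the default size, hash values 6, 23, and 40 all map to slot 6, which is exactly the kind of collision the following paragraphs deal with.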


If the slot at index i already contains a CAssoc pointer, MFC builds a linked list of CAssoc structs: the address of the first CAssoc struct is stored in the hash table, the address of the second is stored in the pNext field of the first, and so on. As one possible layout, picture a hash table holding 10 elements, five of which occupy a slot on their own while the other five sit in two linked lists of length two and three.

When a map's Lookup() function is called, MFC computes the hash value from the actual value of the key passed in, converts the hash value to an index using the formula above, and retrieves the CAssoc pointer from the corresponding slot in the hash table.


Ideally, this slot holds a single CAssoc pointer rather than a linked list of them. If, as hoped, the slot corresponds to a single CAssoc, the element is found in one step and read directly. If the slot holds the head of a CAssoc linked list, MFC compares the key stored in each CAssoc of the list in turn until it finds the matching one. As discussed earlier, as long as the map is set up correctly, the linked lists generally contain no more than three elements, so a lookup can usually finish within three key comparisons.
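The insert and lookup steps described above can be modeled in a few dozen lines of standard C++. This is a simplified sketch, not MFC's implementation: TinyMap and Assoc are hypothetical names, keys are plain int, and the hash is assumed to be the key value itself so the slot numbers are easy to follow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for CAssoc: one node in a slot's linked list.
struct Assoc {
    Assoc* pNext;
    unsigned int nHashValue;
    int key;
    std::string value;
};

class TinyMap {
public:
    explicit TinyMap(std::size_t tableSize = 17) : m_table(tableSize, nullptr) {
        assert(tableSize > 0);
    }
    ~TinyMap() {
        for (Assoc* head : m_table)
            while (head) { Assoc* next = head->pNext; delete head; head = next; }
    }
    TinyMap(const TinyMap&) = delete;
    TinyMap& operator=(const TinyMap&) = delete;

    // Insert or replace; a new node is linked at the head of its slot's list.
    void SetAt(int key, const std::string& value) {
        unsigned int hash = Hash(key);
        std::size_t i = hash % m_table.size();
        for (Assoc* p = m_table[i]; p; p = p->pNext)
            if (p->key == key) { p->value = value; return; }
        m_table[i] = new Assoc{m_table[i], hash, key, value};
    }

    // Returns true and fills 'value' if the key is found.
    // 'comparisons' reports how many keys were compared along the chain.
    bool Lookup(int key, std::string& value, int& comparisons) const {
        comparisons = 0;
        std::size_t i = Hash(key) % m_table.size();
        for (const Assoc* p = m_table[i]; p; p = p->pNext) {
            ++comparisons;
            if (p->key == key) { value = p->value; return true; }
        }
        return false;
    }

private:
    // Assumption for illustration only: the key is its own hash value.
    static unsigned int Hash(int key) { return static_cast<unsigned int>(key); }
    std::vector<Assoc*> m_table;
};
```

With 17 slots, keys 1 and 18 share slot 1, so looking up the one inserted first takes two comparisons, while a key that sits alone in its slot, such as 5, is found in one.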

III. Optimize search efficiency

In the map of MFC, the search performance mainly depends on two factors:

1. Size of the hash table

2. A good hash algorithm that generates hash values that are as unique as possible


The size of the hash table matters a great deal for lookup performance. For example, if a map contains 1000 elements but its hash table only has room for 17 CAssoc pointers, then even in the best case each linked list in the hash table contains 58 or 59 CAssoc structs. Naturally, Lookup is severely slowed down in that situation.
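The figure of 58 or 59 follows from dividing the elements as evenly as possible over the slots. A short standalone sketch of that arithmetic, with the hypothetical names ChainBounds and BestCaseChainBounds:

```cpp
#include <cassert>
#include <cstddef>

struct ChainBounds {
    std::size_t shortest;  // length of the shortest list in the best case
    std::size_t longest;   // length of the longest list in the best case
};

// Best case: the elements are spread as evenly as possible over all slots.
ChainBounds BestCaseChainBounds(std::size_t elementCount, std::size_t tableSize)
{
    assert(tableSize > 0);
    std::size_t quotient = elementCount / tableSize;
    bool even = (elementCount % tableSize == 0);
    return ChainBounds{quotient, even ? quotient : quotient + 1};
}
```

For 1000 elements and 17 slots this gives 58 and 59; with 1201 slots the longest list in the best case has a single element.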

Hash algorithms are also an important factor affecting search efficiency. If the hash algorithm used can only generate a small number of different hash values (and thus only a small number of different hash table index values), the query performance will also be reduced.

The most effective way to optimize map lookup performance is to make the hash table large enough to reduce the chance of collisions between index values. Microsoft recommends setting the hash table size to 110% to 120% of the number of elements in the map, which balances memory consumption against lookup efficiency.

In MFC, you specify the hash table size by calling the InitHashTable() function:

map.InitHashTable(1200);

If the map needs to store 1000 elements, then following Microsoft's recommendation the hash table is enlarged to 120% of the actual number of stored elements, that is, the hash table size is set to 1200.

Statistically, using an odd number as the hash table size can also help reduce collisions. The InitHashTable() call for a hash table that stores 1000 elements can therefore be written as follows:

map.InitHashTable(1201);

Note also that InitHashTable() should be called before the map contains any elements. If the map already contains one or more elements, changing the hash table size triggers an assertion failure.
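The sizing rule can be wrapped in a small helper. The sketch below is standard C++ with hypothetical names (IsPrime, SuggestHashTableSize). It takes 120% of the expected element count, rounded up, and then moves to the next prime, which is one reasonable reading of the "odd, or better, prime" advice in this article rather than an MFC-provided function.

```cpp
#include <cassert>

// Trial division; adequate for table sizes of this magnitude.
bool IsPrime(unsigned int n)
{
    if (n < 2)
        return false;
    for (unsigned int d = 2; d <= n / d; ++d)
        if (n % d == 0)
            return false;
    return true;
}

// 120% of the expected count (rounded up), then the next prime at or above it.
unsigned int SuggestHashTableSize(unsigned int expectedCount)
{
    assert(expectedCount <= 0xFFFFFFFFu / 8);  // keep the arithmetic below from overflowing
    unsigned int size = (expectedCount * 6 + 4) / 5;  // ceil(expectedCount * 1.2)
    while (!IsPrime(size))
        ++size;
    return size;
}
```

For 1000 expected elements the helper returns 1201, the same value used above. The result would be passed to InitHashTable() before the first element is inserted.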


Although the hash algorithm used in MFC works well in most cases, you can replace it with your own if you need or want to. To compute the hash value of a given key, MFC calls the global template function HashKey(). For most data types, HashKey() is implemented as follows:

template<class ARG_KEY>
AFX_INLINE UINT AFXAPI HashKey(ARG_KEY key)
{
    // The default algorithm in general.
    return ((UINT)(void*)(DWORD)key) >> 4;
}
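One consequence of the right shift by 4 is worth seeing in isolation. The following standalone sketch in standard C++ mirrors only the shift from the expression quoted above; DefaultHashKey and DemoDefaultHashKey are hypothetical names, and the behavior of any particular MFC version should be checked against its own headers.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the quoted default: discard the low 4 bits of the key.
unsigned int DefaultHashKey(std::uintptr_t key)
{
    return static_cast<unsigned int>(key >> 4);
}

// Small self-check illustrating how neighboring integer keys are grouped.
void DemoDefaultHashKey()
{
    assert(DefaultHashKey(0) == DefaultHashKey(15));  // keys 0..15 share hash 0
    assert(DefaultHashKey(16) == 1);                  // keys 16..31 share hash 1
}
```

Under this expression, small consecutive integer keys fall into groups of 16 with the same hash value, so they land in the same slot no matter how large the hash table is. The shift suits pointer-like keys, whose low bits tend to be zero because of alignment. For dense integer keys it is a reason to consider supplying your own hash.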

But for strings, the specific implementation method is as follows:

UINT AFXAPI HashKey(LPCWSTR key)    // Unicode encoded string
{
    UINT nHash = 0;
    while (*key)
        nHash = (nHash << 5) + nHash + *key++;    // nHash * 33 + next character
    return nHash;
}

UINT AFXAPI HashKey(LPCSTR key)    // ANSI encoded string
{
    UINT nHash = 0;
    while (*key)
        nHash = (nHash << 5) + nHash + *key++;    // nHash * 33 + next character
    return nHash;
}
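The string hash can be exercised outside MFC with a standalone equivalent. The sketch below uses the hypothetical name StringHashKey and plain C++ types. It treats each character as unsigned char, which matches the code above for ASCII text but may differ for bytes above 127 on platforms where char is signed.

```cpp
#include <cassert>

// Multiply the running hash by 33 ((h << 5) + h) and add the next character.
unsigned int StringHashKey(const char* key)
{
    assert(key != nullptr);
    unsigned int nHash = 0;
    while (*key)
        nHash = (nHash << 5) + nHash + static_cast<unsigned char>(*key++);
    return nHash;
}
```

For the string "ab" the steps are 0 * 33 + 97 = 97, then 97 * 33 + 98 = 3299; with the default table size of 17, that key lands in slot 3299 % 17 = 1.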

To implement a custom hash algorithm for a specific data type, you can use the string versions of HashKey() above as a reference and write a similar HashKey() function for that type.
IV. Use the CMap class in MFC

The preceding sections have already given an overview of the CMap class in MFC, so I will not repeat the details here. Next, we list the basic member functions of the CMap class and use a brief program snippet to demonstrate roughly how to use it.

Constructor:

CMap                Constructs a collection that maps keys to values.

Operations:

Lookup              Looks up the value associated with a given key.
SetAt               Inserts an element into the map; replaces an existing element if a matching key is found.
operator[]          Inserts an element into the map; an operator substitute for SetAt.
RemoveKey           Removes the element specified by a key.
RemoveAll           Removes all elements from the map.
GetStartPosition    Returns the position of the first element.
GetNextAssoc        Gets the next element for iterating.
GetHashTableSize    Returns the size of the hash table (number of elements in the hash table).
InitHashTable       Initializes the hash table and specifies its size.

Status:

GetCount            Returns the number of elements in the map.
IsEmpty             Tests whether the map is empty (contains no elements).

The application example is as follows:

#include <afxtempl.h>    // declares the CMap template

// The template arguments were lost in the original listing; int keys and
// CPoint values are inferred from the calls below.
CMap<int, int, CPoint, CPoint> myMap;

// Initialize the hash table and specify its size (an odd number).
myMap.InitHashTable(257);

// Add elements to myMap.
for (int i = 0; i < 200; i++)
    myMap.SetAt(i, CPoint(i, i));

// Delete the elements whose keys are even numbers.
POSITION pos = myMap.GetStartPosition();
int nKey;
CPoint pt;
while (pos != NULL)
{
    // GetNextAssoc reads the current element and advances pos.
    myMap.GetNextAssoc(pos, nKey, pt);
    if ((nKey % 2) == 0)
        myMap.RemoveKey(nKey);
}

#ifdef _DEBUG
afxDump.SetDepth(1);
afxDump << "myMap: " << &myMap << "\n";
#endif

The snippet above shows the common usage of the CMap class.

1. First, use the CMap template class to define an instance, the myMap object.

2. Next, initialize the size of myMap's hash table. Estimate the potential capacity requirement of myMap first, then choose an odd number, or better still a prime number if possible, as the initial size of the hash table.

3. Then add elements to myMap.

4. Use myMap to map, look up, and traverse data.

5. Call myMap.RemoveAll() to remove all elements and release the memory occupied by myMap.
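For readers without an MFC build environment, the same steps can be followed with the standard library's std::unordered_map in place of CMap. This is a substitute container, not MFC, and the names Point and BuildOddKeyMap are hypothetical, introduced only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

typedef std::pair<int, int> Point;  // hypothetical stand-in for CPoint

std::unordered_map<int, Point> BuildOddKeyMap(int count, std::size_t bucketHint)
{
    assert(count >= 0);

    // Steps 1 and 2: create the map and request a minimum bucket count up front.
    std::unordered_map<int, Point> myMap(bucketHint);

    // Step 3: add elements.
    for (int i = 0; i < count; ++i)
        myMap[i] = Point(i, i);

    // Step 4: traverse the map, erasing the entries with even keys.
    // erase() returns the iterator that follows the erased entry.
    for (auto it = myMap.begin(); it != myMap.end(); ) {
        if (it->first % 2 == 0)
            it = myMap.erase(it);
        else
            ++it;
    }
    return myMap;
}
```

Calling BuildOddKeyMap(200, 257) leaves the 100 odd keys in the map. Step 5 corresponds to calling clear() on the returned map.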


CMap incorporates the IMPLEMENT_SERIAL macro, which supports serialization and dumping of its elements. When the individual elements of a CMap are dumped to a dump context, you must set the depth of the dump context to 1 or greater.

After the discussion above, I believe you now have some understanding of maps and their implementation in MFC. I hope this article helps you a little. I also welcome contact and advice from friends; my e-mail: islyb@163.com. Thank you!

Of course, a map does not support ordered traversal: it is built on a hash table, so it cannot reproduce the order in which the elements were inserted.

But it can still be traversed.

For example, with CMapPtrToPtr:

POSITION pos = m_mapTest.GetStartPosition();
void* rKey = NULL;
void* rValue = NULL;
while (pos != NULL)
{
    // Reads the current key and value, then advances pos.
    m_mapTest.GetNextAssoc(pos, rKey, rValue);
    // Do whatever you want with rKey and rValue.
}
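Why the traversal order differs from the insertion order can be seen in a simplified model. The standalone C++ sketch below is not MFC code. It assumes distinct int keys that act as their own hash values, links each new key at the head of its slot's list, and then visits the slots from index 0 upward. BucketOrderTraversal is a hypothetical name, and the order a real CMap produces is an implementation detail.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <vector>

// Insert the keys into a chained table, then walk the table slot by slot.
std::vector<int> BucketOrderTraversal(const std::vector<int>& keysInInsertionOrder,
                                      std::size_t tableSize = 17)
{
    assert(tableSize > 0);
    std::vector<std::forward_list<int>> table(tableSize);
    for (int key : keysInInsertionOrder)
        table[static_cast<unsigned int>(key) % tableSize].push_front(key);

    std::vector<int> visited;
    for (const std::forward_list<int>& chain : table)
        for (int key : chain)
            visited.push_back(key);
    return visited;
}
```

Inserting 30, 5, 18, 1 into 17 slots places 18 and 1 in slot 1, 5 in slot 5, and 30 in slot 13, so the walk yields 1, 18, 5, 30. The order is neither the insertion order nor fully sorted.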
