std::map: Summary of usage

Last Update:2018-12-07 Source: Internet

Author: User


To complete the assignments for the Web Search course, I spent two days implementing the hierarchical clustering algorithm HAC and a clustering algorithm based on affinity messages. The first step in both algorithms is to compute a vector for each document. Each index word in the text collection forms one dimension of the vector space, so M index words yield an M-dimensional feature vector. I used std::map heavily to build these feature vectors, because I need to look up the probability of each index word in a document. Here are some lessons I learned along the way:
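As a rough illustration of the feature-vector idea above, the sketch below counts index words with a std::map and turns the counts into relative frequencies. The helper names countWords and toFrequencies and the whitespace tokenization are assumptions made for this example, not the code used in the assignment.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: count how often each whitespace-separated word occurs.
std::map<std::string, int> countWords(const std::string& text)
{
    std::map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
    {
        ++counts[word];  // operator[] creates a missing entry with value 0 first
    }
    return counts;
}

// Hypothetical helper: convert raw counts into relative frequencies.
std::map<std::string, double> toFrequencies(const std::map<std::string, int>& counts)
{
    std::map<std::string, double> freq;
    if (counts.empty())
    {
        return freq;  // nothing to normalize
    }

    int total = 0;
    for (std::map<std::string, int>::const_iterator it = counts.begin(); it != counts.end(); ++it)
    {
        total += it->second;
    }
    assert(total > 0);  // assumes every stored count is positive, as countWords guarantees

    for (std::map<std::string, int>::const_iterator it = counts.begin(); it != counts.end(); ++it)
    {
        freq[it->first] = static_cast<double>(it->second) / total;
    }
    return freq;
}
```

Calling toFrequencies(countWords(text)) on the text of one document gives one sparse feature vector, keyed by index word.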

1. operator[]. operator[] is very handy: it not only returns a reference to the value mapped to a key, it can also insert. First, a basic usage:

using namespace std;
...
map<string,int> elem;
....
//insert operation
...
//get inserted value
string keyword;
int freq = elem[keyword];

This returns the value mapped to the key. But what happens if the keyword does not exist in the map? This is where the insert behavior of operator[] comes in. If the key is not in the map, operator[] inserts a new pair and value-initializes the mapped value (for a struct with a user-defined default constructor, that constructor is called). Let's verify with code:

struct numidf
{
    int num;
    bool showup;
    numidf()
    {
        num = 0;
        showup = false;
        cout << "set to 0 and false" << endl;
    }
};
...
map<string, numidf> m_idf;
// Insert elements
...
// Query elements
string newkeyword; // A word that m_idf does not contain
if (!m_idf[newkeyword].showup)
{
    cout << "construct a new one" << endl;
}

The output of the above code is

set to 0 and false
construct a new one

In other words, when a new key is passed to operator[], the map automatically adds a new pair. The key of the new pair is newkeyword, and the mapped value is a freshly default-constructed instance. This is very convenient. I used to call find first and insert manually if the key was missing, which was more cumbersome.
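The same behavior means that a plain lookup with operator[] also grows the map. When only a read is wanted, find can be used instead. The sketch below contrasts the two; lookupCount and demoLookup are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: read a count without modifying the map.
// Returns 0 for a missing key instead of inserting it, as operator[] would.
int lookupCount(const std::map<std::string, int>& m, const std::string& key)
{
    std::map<std::string, int>::const_iterator it = m.find(key);
    return it == m.end() ? 0 : it->second;
}

// Hypothetical demo: contrasts the two kinds of lookup on a missing key.
void demoLookup()
{
    std::map<std::string, int> m;
    m["apple"] = 2;

    assert(lookupCount(m, "pear") == 0);
    assert(m.size() == 1);   // find() left the map unchanged

    int viaBracket = m["pear"];
    assert(viaBracket == 0);
    assert(m.size() == 2);   // operator[] inserted "pear" with a value-initialized int
}
```

Which one fits depends on whether a missing key should become part of the map. For word counting it usually should; for a pure query it usually should not.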

2. Using map iterators
To be honest, I rarely use iterators, so I made a few basic mistakes. I am writing them down here as a reminder to myself. I wanted to implement something like the following code:

vector<int> a;
for (size_t i = 0; i + 1 < a.size(); i++)
{
    for (size_t j = i + 1; j < a.size(); j++)
    {
        // some operation on i and j
    }
}

I wanted to do the same thing with iterators, which led to this unfortunate attempt:

map<string, int>::iterator iteri;
....
// This is wrong!
int i = 5;
iteri = iteri + i;
// This is wrong!

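I assumed that iterator + n would jump n elements ahead. It does not compile, and it produces a long list of errors! std::map iterators are bidirectional, not random-access, so they do not support operator+. So I used the following method instead: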

 1 map<string,int> m_Tree;
 2 map<string,int>::iterator iterI = m_Tree.begin();
 3 map<string,int>::iterator iterJ;
 4 size_t i = 0;
 5 for( ; i + 1 < m_Tree.size(); ++iterI,i++)
 6 {
 7     //iterJ = m_Tree.begin();
 8     //advance(iterJ, i+1);
 9     iterJ = iterI;
10     iterJ++;
11     for(; iterJ != m_Tree.end(); iterJ++)
12     {
13         float s = S((iterI->dvmap),(iterJ->dvmap));
14         if(s > mostSim)
15         {//this is the pair
16             mostSim = s;
17             sp.s1 = iterI;
18             sp.s2 = iterJ;
19         }
20     }
21 }

I needed an iterator to the element after iterI, so I used lines 9 and 10. Lines 7 and 8 also work, but they are less efficient than lines 9 and 10, because advance has to walk from begin() again on every pass of the outer loop. If you know a better way to do this, please let me know!
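One possible alternative, assuming a C++11 compiler, is std::next from <iterator>, which returns a copy of an iterator advanced by one step. The sketch below uses it to visit every unordered pair of entries. The name allPairs and the choice to collect keys into a vector are assumptions for illustration only; the similarity function S from the code above is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> Tree;

// Hypothetical helper: collect every unordered pair of keys (i before j).
std::vector<std::pair<std::string, std::string> > allPairs(const Tree& tree)
{
    std::vector<std::pair<std::string, std::string> > pairs;
    for (Tree::const_iterator i = tree.begin(); i != tree.end(); ++i)
    {
        // std::next returns an advanced copy; i itself is not modified.
        for (Tree::const_iterator j = std::next(i); j != tree.end(); ++j)
        {
            pairs.push_back(std::make_pair(i->first, j->first));
        }
    }

    // n elements produce n * (n - 1) / 2 pairs.
    const std::size_t n = tree.size();
    assert(pairs.size() == (n == 0 ? 0 : n * (n - 1) / 2));
    return pairs;
}
```

Because both loops compare against end(), this version needs no separate index counter and does nothing on an empty map.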

3. For performance, do not let the STL copy memory. Pass pointers!
Document similarity is computed on high-dimensional document vectors, which I store in std::vector. I noticed two performance points:
1) Use reserve to allocate enough memory up front, before calling push_back.
2) Be careful with push_back. If a very large vector<float> is declared inside a function body and then pushed into a private member of the class, a large amount of memory has to be copied.
Based on these two points, I used the following approach:

1 vector<float> dv;
2 pair<map<string,vector<float> >::iterator,bool> pr;
3 pr = m_TF_IDF.insert(pair<string,vector<float> >(filename, dv));
4 vector<float>& rkdv = pr.first->second;
5 rkdv.reserve(m_IDF.size());

m_TF_IDF is a private member of the class. I first insert an empty vector, then take a reference to the stored vector, as shown in line 4. New data can then be appended with push_back through that reference, which avoids copying a large block of memory.
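The insert-then-fill pattern can be sketched as a small function. The names VectorTable and fillInPlace, the dims parameter, and the placeholder values are assumptions for this example; the real code would push the computed TF-IDF weights instead.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, std::vector<float> > VectorTable;

// Hypothetical helper: create the entry first, then fill the stored vector in place.
// Returns a reference to the vector owned by the map.
std::vector<float>& fillInPlace(VectorTable& table, const std::string& filename, std::size_t dims)
{
    // Inserting an empty vector is cheap. If the key already exists,
    // insert() keeps the existing value and returns an iterator to it.
    std::pair<VectorTable::iterator, bool> pr =
        table.insert(std::make_pair(filename, std::vector<float>()));
    std::vector<float>& dv = pr.first->second;

    dv.clear();        // start from an empty vector even if the key existed
    dv.reserve(dims);  // one allocation up front
    assert(dv.capacity() >= dims);

    for (std::size_t k = 0; k < dims; ++k)
    {
        dv.push_back(static_cast<float>(k));  // placeholder values, no reallocation
    }
    return dv;
}
```

With C++11 or later, moving a locally built vector into the map with std::move is another way to avoid the copy.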
Passing pointers is a fast and efficient way to avoid memory copies. In my program, I do not know in advance where the huge vector<float> mentioned above will be needed. So that any code that needs the vector<float> can use it, I pass a pointer to the vector<float>. I defined the following struct:

struct dvPair
{
    string names;
    map<string,vector<float>*> dvmap;
};

That is, I store a pointer to the vector<float> instead of the vector<float> itself!
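A minimal sketch of how such a struct might be filled, assuming the vectors live in a separate owning map such as m_TF_IDF. The attach function and the owner parameter are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct dvPair
{
    std::string names;
    std::map<std::string, std::vector<float>*> dvmap;
};

// Hypothetical helper: register a non-owning pointer to a vector stored in `owner`.
void attach(dvPair& p,
            std::map<std::string, std::vector<float> >& owner,
            const std::string& filename)
{
    std::map<std::string, std::vector<float> >::iterator it = owner.find(filename);
    assert(it != owner.end());        // the vector must already exist in the owner
    p.dvmap[filename] = &it->second;  // copies an address, not the floats
}
```

Inserting other keys into a std::map does not move existing elements, so the stored pointer stays valid until that entry is erased or the owning map is destroyed. The pointer does not own the vector, so nothing in dvPair should delete it.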

That's all. No more.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
