Multi-attention Network for One Shot Learning
2018-05-15 22:35:50
The contributions of this paper are:
1. It shows that category-label information can be helpful for one-shot learning, and designs a method to exploit that information;
2. It proposes an attention network that produces attention maps for building the image representation of an exemplar image from a novel class, based on its class tag;
3. It further proposes a multi-attention scheme to enhance the model's performance;
4. It collects two new datasets and establishes an evaluation protocol.
The overall pipeline is summarized in the paper's flowchart (figure not reproduced here).
Attention Map Generation:
The computation of the attention values in this paper likewise depends on the response between the visual feature and the language (class-tag) feature. The approximate process is as follows (a code sketch of these four steps is given after the list):
1. First, use a word-embedding method to obtain the feature $c$ of the category label, and then transform this feature further; either an LSTM or an FC layer can be used, i.e.:
$h = f(Wc + b)$
Here, both $W$ and $b$ are learnable model parameters, and $f$ is a nonlinearity.
2. After obtaining the hidden state $h$, multiply it with the visual feature $x_i$ at each spatial location $i$ to get a response:
$e_i = h^{\top} x_i$
3. Normalize the attention values over all locations (e.g., with a softmax):
$a_i = \frac{\exp(e_i)}{\sum_j \exp(e_j)}$
4. Multiply the attention values with the local features to get the weighted feature:
$z = \sum_i a_i x_i$
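To make the four steps concrete, here is a minimal PyTorch sketch. It is not the authors' released code; the module name `AttentionHead`, the tanh nonlinearity, and all dimensions are assumptions for illustration.

```python
# Minimal sketch of the single-attention computation described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionHead(nn.Module):
    def __init__(self, tag_dim: int, feat_dim: int):
        super().__init__()
        # Step 1: FC layer mapping the class-tag embedding c to a hidden state h
        self.fc = nn.Linear(tag_dim, feat_dim)  # parameters W and b

    def forward(self, tag_emb: torch.Tensor, feat_map: torch.Tensor) -> torch.Tensor:
        """
        tag_emb:  (B, tag_dim)        word embedding c of the class label
        feat_map: (B, feat_dim, H, W) conv feature map; location i gives x_i
        returns:  (B, feat_dim)       attention-weighted image representation z
        """
        B, D, H, W = feat_map.shape
        h = torch.tanh(self.fc(tag_emb))       # step 1: h = f(Wc + b), f assumed tanh
        x = feat_map.view(B, D, H * W)         # flatten spatial locations
        e = torch.einsum("bd,bdi->bi", h, x)   # step 2: e_i = h^T x_i
        a = F.softmax(e, dim=1)                # step 3: normalize over locations
        z = torch.einsum("bi,bdi->bd", a, x)   # step 4: z = sum_i a_i x_i
        return z

# Usage: a 300-d tag embedding (e.g., word2vec) and a 512-channel 7x7 feature map.
head = AttentionHead(tag_dim=300, feat_dim=512)
z = head(torch.randn(2, 300), torch.randn(2, 512, 7, 7))
print(z.shape)  # torch.Size([2, 512])
```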
Multi-attention Mechanism:
The multi-attention mechanism here is an extension of the single-attention mechanism above: several attention branches with different parameters compute attention values from different angles, and the resulting attended features are combined; see the sketch below.
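A hypothetical continuation of the sketch above: the number of heads and the averaging combination rule are assumptions here, not necessarily the paper's exact choices.

```python
# Several AttentionHead branches, each with its own W and b, attend to the
# image from different "angles"; averaging the per-head features is one
# plausible way to combine them (an assumption, not confirmed by the paper).
import torch
import torch.nn as nn

class MultiAttention(nn.Module):
    def __init__(self, tag_dim: int, feat_dim: int, num_heads: int = 4):
        super().__init__()
        self.heads = nn.ModuleList(
            AttentionHead(tag_dim, feat_dim) for _ in range(num_heads)
        )

    def forward(self, tag_emb: torch.Tensor, feat_map: torch.Tensor) -> torch.Tensor:
        # Each head yields its own attention map and attended feature z_k.
        zs = [head(tag_emb, feat_map) for head in self.heads]
        return torch.stack(zs, dim=0).mean(dim=0)  # combine heads (here: average)
```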