Memory Networks: Principles and Code Analysis

Source: Internet
Author: User
http://blog.csdn.net/u011274209/article/details/53384232

Principle:

Source articles: Memory Networks, Answering Reading Comprehension Using Memory

Many neural network models lack a long-term memory component that is easy to read from and write to. RNNs, LSTMs, and variants such as the GRU do use a memory mechanism, but the authors of Memory Networks consider that memory too small: the state, the cell output, and the weights are all embedded in a low-dimensional space, so knowledge is compressed into a dense vector and a lot of information is lost. This is the starting point of the article (and of the Memory series). Its approach is simple and blunt: add a memory module m. m is an array of objects (for example, an array of vectors or an array of strings), and the article usually calls its entries slots. To remember a fact (usually a sentence in a conversation group), "plug" it into a slot of the memory array.
A memory network (MemNN for short) consists of the memory m described above and four components I, G, O, and R (which look a lot like the three gates of an LSTM, with the list m playing the role of the cell):

Component  Name            Description
I          input           Converts the incoming input to the internal feature representation.
G          generalization  Updates old memories given the new input. "We call this generalization as there is an opportunity for the network to compress and generalize its memories at this stage for some intended future use."
O          output          Produces a new output in the feature representation space, given the new input and the current memory state.
R          response        Converts the output into the response format desired, for example a textual response or an action.

I: Converts the input into a vector used inside the network. The author uses a simple vector space model (VSM) whose dimension is 3*lenW+3 (readers familiar with VSM will ask why it is not lenW; this is explained later).
G: Updates the memory. In the author's implementation, the input is simply inserted into the memory array. The author also considers several other cases that are not implemented, including forgetting memories and reorganizing memories.
O: Combines the input with the memory, extracts the appropriate memories, and returns a vector.
R: Converts that vector back into the desired format, such as text or an answer. The simplest R directly returns the first supporting memory found by O, which is a sentence. The author goes one step further and returns a single word w.
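
The division of labor among the four components can be sketched in a few lines of Python. This is only an illustration of the architecture, not the author's code: the class name `MemoryNetwork`, the method `step`, and the `toy_*` functions are all hypothetical, and word overlap stands in for a learned scoring function.

```python
class MemoryNetwork:
    """Skeleton of the architecture: a memory list plus pluggable I, G, O, R."""

    def __init__(self, I, G, O, R):
        self.memory = []  # the array of slots m
        self.I, self.G, self.O, self.R = I, G, O, R

    def step(self, x):
        features = self.I(x)                         # I: input -> internal features
        self.memory = self.G(self.memory, features)  # G: update the memory
        output = self.O(features, self.memory)       # O: read from the memory
        return self.R(output)                        # R: format the response


# Toy components (assumptions for illustration only).
def toy_I(text):
    return text.lower().split()


def toy_G(memory, features):
    # Simplest possible update: store the input in the next slot.
    return memory + [features]


def toy_O(features, memory):
    # Skip the slot just written for the current input, then pick the
    # earlier memory that shares the most words with the input.
    candidates = memory[:-1]
    if not candidates:
        return features
    return max(candidates, key=lambda m: len(set(m) & set(features)))


def toy_R(output):
    return " ".join(output)


net = MemoryNetwork(toy_I, toy_G, toy_O, toy_R)
net.step("Joe went to the kitchen")
net.step("Fred went to the garden")
answer = net.step("where is Fred")
print(answer)
```

Because I, G, O, and R are passed in as plain functions, any one of them can be replaced (for example, a G that forgets old memories) without touching the rest, which is the sense in which this is an architecture rather than a single algorithm.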

The overall flow is given by the following unified formula (you can see that the author presents this as an architecture, not a specific algorithm):
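
As stated in the Memory Networks paper, the flow for an input x consists of four steps, one per component. The notation below follows the paper; m_i denotes the i-th memory slot.

```latex
\[
\begin{aligned}
\text{1. } & x \mapsto I(x) \\
\text{2. } & m_i = G(m_i, I(x), m), \quad \forall i \\
\text{3. } & o = O(I(x), m) \\
\text{4. } & r = R(o)
\end{aligned}
\]
```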

When the components are implemented as neural networks, the model is called a memory neural network (MemNN, plural MemNNs).

Basic Model:

The basic model is the author's own implementation, a simple instance of this architecture. It is described below in terms of the memory and the four components:

I: The input is a sentence, which I simply converts into a word-frequency vector space model.
G: As described above, G simply stores the vector space representation of each sentence in the conversation group in the memory array, assuming by default that there are more memory slots than sentences in the conversation group: $m_N = x$, $N = N + 1$. As you can see, m, I, and G are all very simple, so the heavy lifting falls on O and R.
O: Given a question x, O finds the k most appropriate supporting memories (examples are given with the dataset below), that is, the top k. It does this by traversing the memory array and picking out the entry with the largest score. Finally, O returns an array of length k.
For top-1: $o_1 = O_1(x, \mathbf{m}) = \arg\max_{i=1,\dots,N} s_O(x, \mathbf{m}_i)$
For TOP2 there are O
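
The I, G, and top-1 O steps of the basic model can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the vector here has one dimension per vocabulary word rather than 3*lenW+3, the score `s_O` is a plain dot product standing in for the learned scoring function, and the names `build_vocab`, `Memory`, and `O1` are hypothetical.

```python
from collections import Counter


def build_vocab(sentences):
    """Map each distinct lowercased word to a vector index."""
    words = sorted({w for s in sentences for w in s.lower().split()})
    return {w: i for i, w in enumerate(words)}


def I(sentence, vocab):
    """Word-frequency vector; words outside the vocabulary are ignored."""
    vec = [0] * len(vocab)
    for word, count in Counter(sentence.lower().split()).items():
        if word in vocab:
            vec[vocab[word]] = count
    return vec


class Memory:
    """Fixed number of slots, assumed larger than the number of sentences."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.n = 0  # index of the next free slot

    def G(self, x):
        # m_N = x, N = N + 1
        if self.n >= len(self.slots):
            raise IndexError("memory is full")
        self.slots[self.n] = x
        self.n += 1


def s_O(x, m):
    """Assumed stand-in score: dot product of two frequency vectors."""
    return sum(a * b for a, b in zip(x, m))


def O1(x, memory):
    """Index of the highest-scoring filled slot (the top-1 supporting memory)."""
    return max(range(memory.n), key=lambda i: s_O(x, memory.slots[i]))


story = [
    "Joe went to the kitchen",
    "Fred went to the garden",
    "Joe picked up the milk",
]
question = "Where is Fred"

vocab = build_vocab(story + [question])
memory = Memory(num_slots=8)
for sentence in story:
    memory.G(I(sentence, vocab))

best = O1(I(question, vocab), memory)
print(story[best])
```

With a raw dot product the match depends only on shared words, so the question matches the one sentence that mentions Fred. A trained scoring function would replace `s_O` without changing the traversal in `O1`.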
