The RNN Encoder–Decoder is proposed for machine translation.
The encoder and decoder are two RNNs that are trained jointly, with the parameters learned to maximize the conditional likelihood of the output sequence given the input sequence.
Network structure:
Note that the input sentence is not necessarily the same length as the output sentence.
At the encoder, the hidden state at time t is a function of the hidden state at time t-1 and the input x at time t, i.e. h_t = f(h_{t-1}, x_t). This is repeated until the input is finished, and the last hidden state is taken as a summary of the whole sentence, denoted the context c.
At the decoder, the hidden state at time t is a function of the hidden state at time t-1, the predicted output y at time t-1, and the context c, i.e. h_t = f(h_{t-1}, y_{t-1}, c).
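The two recurrences above can be sketched as plain NumPy code. This is a minimal illustration, not the paper's implementation: the weight names, dimensions, and random initialization are all assumptions, and the real model adds embeddings, an output softmax, and the gated unit described below.

```python
import numpy as np

# Hypothetical dimensions and randomly initialized weights (illustration only).
rng = np.random.default_rng(0)
d_x, d_y, d_h = 4, 3, 5
W_enc = rng.normal(size=(d_h, d_x))   # input -> hidden
U_enc = rng.normal(size=(d_h, d_h))   # hidden -> hidden
W_dec = rng.normal(size=(d_h, d_y))   # previous output -> hidden
U_dec = rng.normal(size=(d_h, d_h))   # hidden -> hidden
C_dec = rng.normal(size=(d_h, d_h))   # context -> hidden

def encode(xs):
    """Encoder recurrence h_t = f(h_{t-1}, x_t); the last state is the context c."""
    h = np.zeros(d_h)
    for x in xs:
        h = np.tanh(W_enc @ x + U_enc @ h)
    return h  # context vector c summarizing the source sentence

def decode_step(h_prev, y_prev, c):
    """Decoder recurrence h_t = f(h_{t-1}, y_{t-1}, c)."""
    return np.tanh(W_dec @ y_prev + U_dec @ h_prev + C_dec @ c)

# Source and target lengths may differ: 6 input tokens, 4 output steps.
xs = [rng.normal(size=d_x) for _ in range(6)]
c = encode(xs)
h = np.zeros(d_h)
y = np.zeros(d_y)              # stand-in for a start-of-sentence embedding
for _ in range(4):
    h = decode_step(h, y, c)   # y would be updated from the output softmax in a full model
```

Note how c is computed once from the whole source and then fed into every decoder step, which is what lets the two sides have different lengths.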
Optimization objectives:
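The figure with the objective seems to be missing here; as stated in the paper, the two networks are trained jointly to maximize the average conditional log-likelihood over the N training pairs:

```latex
\max_{\boldsymbol{\theta}} \; \frac{1}{N} \sum_{n=1}^{N} \log p_{\boldsymbol{\theta}}\left(\mathbf{y}_n \mid \mathbf{x}_n\right)
```

where theta denotes all encoder and decoder parameters and (x_n, y_n) is a source/target sentence pair.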
About the calculation of h:
The hidden state h at time t is a function of the state at time t-1, with a reset gate and an update gate controlling how much short- and long-term memory is kept.
Reset gate with Update Gate:
It can be seen that each element of r and z is the output of a sigmoid function, so it is controlled to lie between 0 and 1.
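The gated unit can be sketched as follows. This is a minimal single-step sketch with hypothetical weight names and random initialization, following the standard gate equations: sigmoid reset and update gates r and z, a tanh candidate state, and an interpolation between the old state and the candidate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions and weights (illustration only).
rng = np.random.default_rng(1)
d_x, d_h = 4, 5
W_r, U_r = rng.normal(size=(d_h, d_x)), rng.normal(size=(d_h, d_h))
W_z, U_z = rng.normal(size=(d_h, d_x)), rng.normal(size=(d_h, d_h))
W_h, U_h = rng.normal(size=(d_h, d_x)), rng.normal(size=(d_h, d_h))

def gru_step(h_prev, x):
    r = sigmoid(W_r @ x + U_r @ h_prev)   # reset gate: each element in (0, 1)
    z = sigmoid(W_z @ x + U_z @ h_prev)   # update gate: each element in (0, 1)
    # Candidate state: where r is near 0, the previous state is forgotten.
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h_prev))
    # Update gate interpolates between keeping the old state and taking the candidate.
    return z * h_prev + (1.0 - z) * h_tilde

h = np.zeros(d_h)
x = rng.normal(size=d_x)
h = gru_step(h, x)
```

When z is close to 1 the unit copies its previous state forward, which is how long-range memory is kept; when r is close to 0 the unit effectively restarts from the current input.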
Paper notes: Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation