What's RNN?
The recurrent neural network (RNN) was proposed mainly to deal with sequence data. What is sequence data? It is data in which each input is related to the inputs before and after it. A sentence is a good example: the words before and after are related, so given the input "I am hungry, ready to go to XX", the "XX" is most likely "eat". That is sequence data.
There are many variants of the recurrent neural network, such as LSTM, GRU, and so on. Once the idea behind the basic recurrent neural network is clear, the other variants become easier to understand.

Difference from traditional neural networks
The image below is the classic fully connected network: from the input layer through two hidden layers to the output layer, adjacent layers are fully connected, while nodes within the same layer are not connected to each other. This network model is poorly suited to predicting sequence data; for example, it struggles with a question like what the next word in a sentence should be.
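To make the contrast concrete, a fully connected network like the one described can be sketched in a few lines of NumPy. The layer sizes and activation choice here are illustrative assumptions, not taken from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b):
    # One fully connected layer: every input node connects to every output node.
    return np.tanh(W @ x + b)

x  = rng.normal(size=3)                               # input layer (3 nodes, illustrative)
h1 = dense(x,  rng.normal(size=(4, 3)), np.zeros(4))  # hidden layer 1
h2 = dense(h1, rng.normal(size=(4, 4)), np.zeros(4))  # hidden layer 2
y  = dense(h2, rng.normal(size=(2, 4)), np.zeros(2))  # output layer

# Each forward pass depends only on the current x; nothing is remembered
# between inputs, which is why sequences are hard for this architecture.
print(y.shape)  # (2,)
```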
The recurrent neural network is good at processing sequence data: it remembers previous information and lets it participate in the calculation of the current output. In theory, a recurrent neural network can handle sequences of arbitrary length.
RNN Model
The most abstract representation of the RNN model is the figure below, but it is not easy to understand at first glance because it compresses the time dimension. Here x is the input, U is the weight matrix from the input layer to the hidden layer, s is the hidden layer value, W is the weight matrix that feeds the hidden layer value of the previous moment back in as input, V is the weight matrix from the hidden layer to the output layer, and o is the output.
To make this easier to understand, the figure above is unrolled in the next one. Now it is clear that the input x, the hidden layer value s, and the output o all carry a subscript t. This t denotes the moment in time: t-1 is the previous moment and t+1 is the next. Inputs at different moments correspond to different outputs, and the hidden layer of the previous moment influences the output of the current moment.
So what does this look like at the neuron level? The following figure makes it clearer: 3 input neurons connect to 4 hidden layer neurons, and the hidden layer state is then retained so it can participate in the calculation at the next moment.
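For the layout in the figure, 3 input neurons feeding 4 hidden neurons, the weight shapes follow directly. A minimal sketch with random illustrative values:

```python
import numpy as np

n_in, n_hidden = 3, 4
rng = np.random.default_rng(1)

U = rng.normal(size=(n_hidden, n_in))      # input -> hidden: 4 x 3
W = rng.normal(size=(n_hidden, n_hidden))  # previous hidden -> hidden: 4 x 4

x_t    = rng.normal(size=n_in)   # current input (3 values)
s_prev = np.zeros(n_hidden)      # hidden state kept from the previous moment

# The hidden layer sees both the current input and the saved state.
s_t = np.tanh(U @ x_t + W @ s_prev)
print(s_t.shape)  # (4,)
```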
Forward Propagation of RNN
Using the same diagram to illustrate, let the input of the output layer be net_t; the output is then easy to write down:

net_t = V s_t

o_t = σ(net_t)
where σ is the activation function of the output layer. Similarly, the input of the hidden layer, h_t, combines the current input with the hidden layer value of the previous moment:

h_t = U x_t + W s_{t-1}

and the hidden layer value is s_t = f(h_t), where f is the hidden layer's activation function.
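The forward propagation described in this section can be sketched over a whole sequence in NumPy. The layer sizes and the concrete activation functions (tanh for the hidden layer, sigmoid for the output) are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, U, W, V):
    """One forward pass: h_t = U x_t + W s_{t-1}, s_t = tanh(h_t),
    net_t = V s_t, o_t = sigmoid(net_t)."""
    s = np.zeros(W.shape[0])       # initial hidden state s_0
    outputs = []
    for x_t in xs:                 # one step per moment t
        h_t = U @ x_t + W @ s      # hidden layer input
        s = np.tanh(h_t)           # hidden layer value, kept for the next moment
        net_t = V @ s              # output layer input
        outputs.append(sigmoid(net_t))
    return outputs

rng = np.random.default_rng(2)
U = rng.normal(size=(4, 3))        # input -> hidden
W = rng.normal(size=(4, 4))        # previous hidden -> hidden
V = rng.normal(size=(2, 4))        # hidden -> output
xs = rng.normal(size=(5, 3))       # a sequence of 5 inputs

os = rnn_forward(xs, U, W, V)
print(len(os), os[0].shape)        # 5 (2,)
```

Note how the same U, W, and V are reused at every time step; only the hidden state s changes, which is what lets the network handle sequences of arbitrary length.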