A recurrent neural network adds a recurrent loop to a neuron, in addition to its input and output. As shown on the left, the neuron's output state s at the previous time step is fed back as an input at the next time step, weighted alongside the input (which is weighted by U). This feedback makes the output state s_t at a given moment depend on the states at the previous moments s_{t-1}, s_{t-2}, ..., s_{t-n}. We can therefore say that the recurrent path introduces a new dimension to the neural network: the time dimension.
On the right, we see the same neuron unrolled along the time dimension, where x_t is the input at each time step, s_t is the output state at each time step, and o_t is the neuron's output at each time step. This structure has a total of three parameters, U, W, and V: the input weight, the state weight, and the output weight, respectively. Like a CNN, an RNN uses the idea of parameter sharing: the same three parameters are reused at every time step.
The output state s_t is calculated as:

s_t = f(U x_t + W s_{t-1})

where f is an activation function such as sigmoid, tanh, or ReLU. At the output, if we use softmax to predict the probability of each output value, then:

o_t = softmax(V s_t)
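As a minimal sketch, the two formulas above can be written as a single forward step in NumPy; the dimensions and tanh activation here are illustrative assumptions, not values from the text:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax: shift by the max before exponentiating
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_step(x_t, s_prev, U, W, V):
    # s_t = f(U x_t + W s_{t-1}), with f = tanh
    s_t = np.tanh(U @ x_t + W @ s_prev)
    # o_t = softmax(V s_t): a probability distribution over output values
    o_t = softmax(V @ s_t)
    return s_t, o_t

# toy dimensions: 3-dim input, 4-dim state, 2 output classes
rng = np.random.default_rng(0)
U = rng.standard_normal((4, 3))  # input weight
W = rng.standard_normal((4, 4))  # state weight
V = rng.standard_normal((2, 4))  # output weight

s = np.zeros(4)  # initial state s_0
for x in rng.standard_normal((5, 3)):  # unroll over 5 time steps
    s, o = rnn_step(x, s, U, W, V)
```

Note that the same U, W, and V are used at every iteration of the loop; that is exactly the parameter sharing across the time dimension described above.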
Here are three different types of RNN patterns:
Pattern 1: the hidden units have recurrent connections; at every time step t there is an output o_t, an expected target y_t, and a loss.
Pattern 2: the hidden units have recurrent connections, but a single output O is produced only after the entire sequence has been read; the loss is then computed against the expected target Y.
Pattern 3: the recurrent connection runs from the output unit at one time step to the hidden unit at the next; there is an output o_t, an expected target y_t, and a loss at every time step.
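The main practical difference between these patterns is where the loss is computed. A small sketch, using cross-entropy and made-up output probabilities and targets (all values here are hypothetical, just to contrast per-step loss with final-step loss):

```python
import numpy as np

def cross_entropy(o, y):
    # negative log probability assigned to the correct class y
    return -np.log(o[y])

# pretend these probability vectors came from an unrolled RNN
outputs = [np.array([0.7, 0.3]), np.array([0.2, 0.8]), np.array([0.9, 0.1])]
targets = [0, 1, 0]

# Patterns 1 and 3: an output, a target, and a loss at every time step,
# summed over the whole sequence
loss_per_step = sum(cross_entropy(o, y) for o, y in zip(outputs, targets))

# Pattern 2: one output O after the whole sequence is read,
# so the loss uses only the final output against a single target Y
loss_final = cross_entropy(outputs[-1], targets[-1])
```

Per-step losses suit tasks with a label at every position (e.g. tagging each word), while the single final loss suits whole-sequence labels (e.g. classifying a sentence).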
Recurrent neural Network (1): Architecture