1. Recurrent Neural Network (RNN)
Although the step from the multilayer perceptron (MLP) to the recurrent neural network (RNN) seems small, it has far-reaching implications for sequence learning. Recurrent neural networks (RNNs) are used to process sequence data. In a traditional neural network model, adjacent layers are fully connected, while the nodes within each layer are unconnected. Such an ordinary network is powerless for many problems. For example, to predict the next word of a sentence, it is generally necessary to use the preceding words, because the words in a sentence are not independent. In a recurrent neural network (RNN), the current output of a sequence also depends on the previous outputs. Concretely, the network remembers the preceding information, keeps it in its internal state, and applies it to the computation of the current output; that is, the nodes of the hidden layer are no longer unconnected but linked to one another, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. In theory, recurrent neural networks can process sequences of any length, but in practice, to reduce complexity, it is often assumed that the current state depends only on the few preceding states.
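To make the recurrence concrete, here is a minimal sketch of one step of a plain RNN in Python/NumPy. The names (rnn_step, W_in, W_rec, W_out) and the tanh activation are illustrative assumptions, not part of the original text; the point is only that the new hidden state mixes the current input with the previous state, so earlier inputs influence the current output.

```python
import numpy as np

def rnn_step(x_t, s_prev, W_in, W_rec, W_out):
    """One step of a plain RNN (illustrative sketch; tanh assumed).

    x_t:    current input vector
    s_prev: hidden state from the previous time step
    """
    # The new state depends on both the current input and the previous
    # state, which is how the network "remembers" earlier inputs.
    s_t = np.tanh(W_in @ x_t + W_rec @ s_prev)
    # The current output is computed from the state, and therefore
    # indirectly from everything the sequence has shown so far.
    y_t = W_out @ s_t
    return s_t, y_t
```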
The following figure shows a typical recurrent neural network (RNN) structure.
Figure 1 Recurrent neural network (RNN) structure
An effective way to visualize a recurrent neural network (RNN) is to unroll it in time, yielding the structure shown in Figure 2.
Figure 2 Recurrent neural network (RNN) unrolled in time
The diagram shows a time-unrolled recurrent neural network (RNN). It contains input units, whose inputs form the set {x0, x1, ..., xt−1, xt, xt+1, ...}; output units, whose outputs form the set {y0, y1, ..., yt−1, yt, yt+1, ...}; and hidden units, whose outputs form the set {s0, s1, ..., st−1, st, st+1, ...}. The hidden units do the most important work. In the diagram, the weights connecting the input layer to the hidden layer are denoted U, the recurrent weights connecting the hidden layer to itself are denoted W, and the weights connecting the hidden layer to the output layer are denoted V. Note that the same weights are reused at every time step. For clarity, the bias weights are omitted here.
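A short sketch of the unrolled computation may help: it processes a whole sequence while reusing the same U, W, and V at every step, exactly as described above. The function name, the tanh activation, the zero initial state, and the dimensions in the example are illustrative assumptions.

```python
import numpy as np

def rnn_forward(xs, U, W, V):
    """Run an unrolled RNN over a sequence of input vectors xs.

    The same U (input-to-hidden), W (hidden-to-hidden), and
    V (hidden-to-output) matrices are applied at every time step.
    """
    s = np.zeros(W.shape[0])           # zero initial state (assumed)
    states, outputs = [], []
    for x_t in xs:                     # one iteration per time step t
        s = np.tanh(U @ x_t + W @ s)   # st depends on xt and s(t-1)
        states.append(s)
        outputs.append(V @ s)          # yt is computed from st
    return states, outputs

# Example: five 3-dimensional inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))
states, outputs = rnn_forward([rng.normal(size=3) for _ in range(5)], U, W, V)
```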
In a recurrent neural network (RNN), information flows one way from the input units to the hidden units, and one way from the hidden units to the output units. In some cases, an RNN relaxes the latter restriction and feeds information from the output units back to the hidden units; these connections are called "back projections". In addition, the input of a hidden layer can include the state of the previous hidden layer; that is, the nodes within the hidden layer can be self-connected or interconnected.
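As an illustration of such a back projection (a Jordan-style variant, not something the figures above depict), the previous output can be fed into the state update alongside the current input and previous state. The extra matrix B below is a hypothetical name introduced only for this sketch.

```python
import numpy as np

def rnn_step_backproj(x_t, s_prev, y_prev, U, W, V, B):
    # B projects the previous output back into the hidden layer,
    # realizing the "back projection" described above.
    s_t = np.tanh(U @ x_t + W @ s_prev + B @ y_prev)
    y_t = V @ s_t
    return s_t, y_t
```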
The calculation process for the network in Figure 2 is as follows:
1) xt is the input at step t, for t = 1, 2, 3, ...