Introduction to Recurrent Neural Networks (RNNs)
This post was reproduced from: http://blog.csdn.net/heyongluoyao8/article/details/48636251
Recurrent neural networks (RNNs) have been widely and successfully applied to many natural language processing (NLP) tasks. However, there are few learning materials about RNNs online, so this series introduces the principles of RNNs and how to implement them. It is divided into the following parts:
1. A basic introduction to RNNs and some common RNN variants (the content of this article);
2. A detailed introduction to some frequently used training algorithms for RNNs, such as Back Propagation Through Time (BPTT), Real-Time Recurrent Learning (RTRL), and the Extended Kalman Filter (EKF), as well as the vanishing gradient problem;
3. A detailed description of Long Short-Term Memory (LSTM) networks;
4. A detailed introduction to Clockwork RNNs (CW-RNNs), a clock-driven recurrent neural network;
5. An implementation of RNNs based on Python and Theano, including some common RNN models.
Unlike traditional feed-forward neural networks (FNNs), RNNs introduce directed cycles, which let them handle problems where the inputs depend on one another. This cyclic structure is shown in the figure below:
This tutorial assumes the reader is already familiar with the basic neural network model. If you are not, you can read Implementing A Neural Network from Scratch first.
What are RNNs?
RNNs are designed to process sequence data. In a traditional neural network model, the layers (input layer to hidden layer to output layer) are fully connected to each other, while the nodes within each layer are unconnected. This kind of ordinary neural network is powerless for many problems. For example, to predict the next word of a sentence, you generally need the preceding words, because the words in a sentence are not independent. RNNs are called recurrent neural networks because the current output of a sequence also depends on the previous outputs. Concretely, the network remembers the previous information and applies it to the calculation of the current output; that is, the nodes of the hidden layer are no longer unconnected but connected, and the input to the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. In theory, RNNs can process sequences of any length. In practice, however, to reduce complexity it is often assumed that the current state depends only on the previous few states. The figure below shows a typical RNN:
(Figure from Nature.)
An RNN contains input units, whose input set is denoted {x0, x1, ..., xt, xt+1, ...}, and output units, whose output set is denoted {y0, y1, ..., yt, yt+1, ...}. An RNN also contains hidden units, whose output set we denote {s0, s1, ..., st, st+1, ...}.
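To make the recurrence concrete, here is a minimal sketch (not from the original post) of a vanilla RNN forward pass in numpy. The weight names U, W, and V and the tanh activation are common conventions assumed here: U maps the input xt into the hidden layer, W carries the previous hidden state st-1, and V maps the hidden state st to the output yt.

```python
# A minimal vanilla-RNN forward pass in numpy (illustrative sketch only).
import numpy as np

def rnn_forward(xs, U, W, V):
    """Run a vanilla RNN over a sequence of input vectors xs."""
    hidden_size = W.shape[0]
    s = np.zeros(hidden_size)          # initial hidden state
    states, outputs = [], []
    for x in xs:
        # The hidden state depends on the current input AND the previous state:
        # this recurrence is the "memory" described above.
        s = np.tanh(U @ x + W @ s)
        y = V @ s                      # unnormalized output at this time step
        states.append(s)
        outputs.append(y)
    return states, outputs

# Toy usage: a length-4 sequence of 3-dim inputs, 5 hidden units, 2 outputs.
rng = np.random.default_rng(0)
U = rng.normal(size=(5, 3))
W = rng.normal(size=(5, 5))
V = rng.normal(size=(2, 5))
xs = [rng.normal(size=3) for _ in range(4)]
states, outputs = rnn_forward(xs, U, W, V)
print(len(states), outputs[-1].shape)  # 4 (2,)
```

The key line is the update of s, which mixes the current input with the previous hidden state; this is exactly why the output at time t can depend on everything that came before it.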