A Recurrent Neural Network Without Chaos


This article introduces a very simple gated RNN, the chaos-free network (CFN). It has two gates, a horizontal (forget) gate θ_t and a vertical (input) gate η_t, both computed with the logistic sigmoid function σ.
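For concreteness, here are the CFN update equations in the paper's notation; this is a reconstruction recalled from Laurent and von Brecht's paper, not a quotation from this article:

\[
\begin{aligned}
h_t      &= \theta_t \odot \tanh(h_{t-1}) + \eta_t \odot \tanh(W x_t) \\
\theta_t &= \sigma(U_\theta h_{t-1} + V_\theta x_t + b_\theta) && \text{(horizontal/forget gate)} \\
\eta_t   &= \sigma(U_\eta h_{t-1} + V_\eta x_t + b_\eta)       && \text{(vertical/input gate)}
\end{aligned}
\]

where σ is the logistic sigmoid and ⊙ denotes element-wise multiplication.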

The following analysis assumes that the input x_t is a single pulse: it is nonzero at one time step T and zero at all other times.

If the hidden state is initialized to 0, the network's response to the pulse x_T decays back to 0, and the forget gate controls the decay speed: when a hidden unit h_t(i) encounters a strong signal, it is activated and then decays toward 0 until the next time it is activated.
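A minimal numerical sketch of this decay behavior: a single CFN unit with hypothetical constant gate values (in the real model the gates are sigmoid functions of h_{t-1} and x_t) is driven by one pulse and then left with zero input, and its state relaxes to 0 at a rate set by the forget gate.

import numpy as np

def cfn_unit(h_prev, x, theta, eta, w=1.0):
    # One step of a scalar chaos-free network (CFN) unit.
    # With x = 0 this reduces to h_t = theta * tanh(h_{t-1}),
    # which shrinks toward 0 because 0 < theta < 1 and |tanh(h)| < |h| for h != 0.
    return theta * np.tanh(h_prev) + eta * np.tanh(w * x)

for theta in (0.5, 0.9):            # the forget gate value controls the decay speed
    h = 0.0
    for t in range(30):
        x = 1.0 if t == 0 else 0.0  # a single input pulse at t = 0, silence afterwards
        h = cfn_unit(h, x, theta=theta, eta=1.0)
    print(f"theta = {theta}: |h| after 30 steps = {abs(h):.2e}")

Running this shows that with theta = 0.5 the activation is essentially gone after a few steps, while theta = 0.9 keeps it alive much longer, which is exactly the role the forget gate plays here.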

Figure: comparison of hidden-unit trajectories under zero input.

The model in this paper has a single attractor, the zero state, whereas the other models considered, the vanilla RNN, the LSTM, and the GRU, can exhibit chaotic dynamical behavior.

The article then shows that this chaos-free RNN also performs well on word-level language modeling, which indirectly suggests that chaotic dynamics are not what explains the success of these models on such tasks.

Chaos in Recurrent Neural Networks

Consider a discrete-time dynamical system u_{t+1} = Φ(u_t), where the state vector u belongs to R^d.

The resulting trajectory eventually enters the attractor (an invariant set) of the system, which is usually a fractal.

Every RNN can be written in the form h_t = Φ(h_{t-1}, x_t).

Assuming there is no input (x_t ≡ 0), the RNN induces the corresponding dynamical system h_t = Φ(h_{t-1}, 0).

The chaotic behavior of this induced system thus characterizes the network's capacity to produce complex trajectories.

Why does the behavior of this zero-input dynamical system matter in practice? It can actually arise, because the weights W_j are learned: when the network encounters an unimportant data point x_{t0} that couples only weakly to the hidden state, i.e. W_j x_{t0} ≈ 0, the zero-input dynamics govern the network over the following time steps, until a genuinely important signal is encountered.
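Stated as a one-line approximation (a restatement of the argument above, not a formula from the article): when W_j x_{t0} ≈ 0 the update is effectively the zero-input map,

\[
h_t = \Phi(h_{t-1}, x_{t_0}) \approx \Phi(h_{t-1}, 0),
\]

so the autonomous (zero-input) dynamical system describes the network until a significant input arrives.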

Chaotic Behavior of the LSTM and GRU in the Absence of Input Data

Consider the dynamical system induced by an LSTM with the input held at zero, where the weights are fixed to specific hand-chosen values and the hidden state is initialized to a chosen starting point.

Figure 1 shows the resulting dynamics; the plotted attractor is a 2-dimensional projection of what is essentially a 4-dimensional dynamical system.

A chaotic dynamical system is sensitive to initial conditions: starting from a given initial point, the author applies 100,000 tiny random perturbations (on the order of 1e-7) and runs each perturbed trajectory forward. By step 200 the perturbed points have spread to cover almost the entire attractor.
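A sketch of this sensitivity test follows. It uses randomly drawn weights in place of the paper's hand-constructed LSTM (whose exact matrices are not listed in this article), so whether chaos actually appears depends on the random draw; the structure of the experiment is the point: perturb an initial state 100,000 times by amounts on the order of 1e-7, run all trajectories for 200 zero-input steps, and look at how far they have spread.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_zero_input(h, c, Wi, Wf, Wo, Wg):
    # One LSTM step with the input clamped to zero; h and c have shape
    # (n_points, d), so all perturbed trajectories advance in parallel.
    # Biases are omitted for brevity.
    i = sigmoid(h @ Wi.T)
    f = sigmoid(h @ Wf.T)
    o = sigmoid(h @ Wo.T)
    g = np.tanh(h @ Wg.T)
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
d = 2                                   # two hidden units, as in the constructed example
# Fairly large random recurrent weights; whether the resulting zero-input
# dynamics are actually chaotic depends on this random draw.
Wi, Wf, Wo, Wg = (rng.normal(scale=5.0, size=(d, d)) for _ in range(4))

n, eps, steps = 100_000, 1e-7, 200
h0, c0 = rng.normal(size=d), rng.normal(size=d)
h = h0 + rng.uniform(-eps, eps, size=(n, d))    # 100,000 tiny perturbations
c = c0 + rng.uniform(-eps, eps, size=(n, d))

for _ in range(steps):
    h, c = lstm_step_zero_input(h, c, Wi, Wf, Wo, Wg)

# Under chaotic dynamics the perturbed points no longer cluster near a single
# trajectory: their spread ends up many orders of magnitude larger than eps.
print("spread of h after", steps, "steps:", h.std(axis=0))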

The example above is hand-constructed, but the author also examines an LSTM trained (without dropout) on the Penn Treebank corpus, and the same chaotic behavior appears. Once input is fed in, the network is no longer an autonomous dynamical system; its behavior is largely controlled by the input signal.

Chaos-Free Behavior of the CFN

Experimental result: in a multi-layer CFN, the hidden units of the higher layers decay more slowly than those of the lower layers, i.e. the upper layers retain information over longer time scales.
