LSTM: Overview, Understanding, and Keras Implementation (I)

Last Update:2016-12-05 Source: Internet

Author: User

Tags keras

Note: This article is based mainly on my understanding of http://colah.github.io/posts/2015-08-Understanding-LSTMs/ and could just as well be titled "Understanding 'Understanding LSTM Networks'". Thanks to the author for generously sharing such an accessible and accurate explanation.

I. RNN

Any discussion of LSTM has to start with the simplest, most basic RNN. In this part my goal is simply to understand the "loop" implied by the word "recurrent" in "recurrent neural network", without introducing any formulas, and along the way to cover the Keras input data format that once puzzled me.

We often read that LSTM is well suited to time-series data and variable-length sequences, especially in natural language processing. So what gives it the ability to handle variable-length sequences? A careful look at the structure gives an intuitive answer.

On the left side of the figure, the RNN has two inputs: one is the input x_t at the current time step t, and the other is an input that appears to come from the RNN "itself".

If that is not clear, look at the right side of the figure: the right side is simply the left side unrolled along the time axis, so that the output of the previous time step becomes an input of the current one. Note that all the neurons on the right are in fact the same neuron, the one on the left: they share the same weights but receive a different input at each time step, and pass their output on to the next time step as input. This is how information from the past is stored.
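The unrolled picture can be sketched in a few lines of NumPy. This is only an illustration, not code from the original article: it assumes the standard simple-RNN update with a tanh activation, and the names `rnn_forward`, `W_x`, `W_h`, and `b` are hypothetical. The point is that one set of weights is reused at every time step, so the same function runs over sequences of any length.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b):
    """Run one shared recurrent cell over a sequence of input vectors."""
    h = np.zeros(W_h.shape[0])  # state before the first time step
    states = []
    for x_t in xs:  # one iteration per time step
        # Same weights at every step; only x_t and the previous state change.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

# Illustrative sizes and random weights (assumptions, not from the article).
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

short_seq = rng.normal(size=(5, input_dim))   # 5 time steps
long_seq = rng.normal(size=(20, input_dim))   # 20 time steps

short_states = rnn_forward(short_seq, W_x, W_h, b)
long_states = rnn_forward(long_seq, W_x, W_h, b)
print(short_states.shape, long_states.shape)
```

Because the loop carries `h` forward, the state at step t depends on every input up to t, which is the "memory" described above.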

Understanding what the "loop" means is the purpose of this chapter; the formulas and details are covered in the LSTM part.

Keras Chinese documentation: http://keras-cn.readthedocs.io/en/latest/layers/recurrent_layer/ (The Chinese documentation is really good: besides the translated content, it adds extra material, such as explanations of tensors and batch size, that helps deep learning beginners.)

For all RNN layers, including SimpleRNN, LSTM, GRU, and so on, the input and output data formats are as follows:

The input is a 3D tensor of shape (samples, timesteps, input_dim). samples is the number of data items. timesteps and input_dim are harder to understand: input_dim is the dimension of the representation of each data point, and timesteps is the total number of time steps. For example, take a dataset of 100 sentences, 20 words per sentence, with each word represented by an 80-dimensional vector. In an RNN, the input at each time step is one word (not necessarily; you could also feed two words or some other unit per step). In the first RNN figure, t0 is the first time step and x0 is the 80-dimensional vector representing the first word of a sentence; t1 is the second time step and x1 is the 80-dimensional vector representing the second word; and so on. So the shape of the input data should be (100, 20, 80).
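To make the (100, 20, 80) example concrete, here is a small NumPy sketch. The array is filled with zeros purely as a placeholder for real word vectors, and the variable names are illustrative.

```python
import numpy as np

# 100 sentences, 20 words per sentence, 80-dimensional word vectors.
samples, timesteps, input_dim = 100, 20, 80

# Placeholder data; in practice each row would hold a real word vector.
data = np.zeros((samples, timesteps, input_dim), dtype="float32")

first_sentence = data[0]   # all 20 time steps of one sentence
x0 = data[0, 0]            # word vector fed to the RNN at time step t0
x1 = data[0, 1]            # word vector fed to the RNN at time step t1

print(data.shape, first_sentence.shape, x0.shape)
```

An array laid out this way is what a Keras recurrent layer expects as input: indexing the first axis picks a sentence, and indexing the second axis picks the time step within it.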

Note: In practice, sentences do not all have the same length, but as the RNN workflow shows, it can handle variable-length sequences. In Keras, you can first fix a maximum sentence length, pad shorter sentences with 0 up to that length, and then add an Embedding layer or a Masking layer before the RNN layer to filter out the padding. The details are in my blog post:

http://www.cnblogs.com/leeshum/p/6089286.html
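The padding idea can be sketched in plain NumPy as follows. This is not the code from the linked post; `pad_to_max_length` and `padding_mask` are hypothetical helpers, and padding at the end of each sentence is just one possible choice. In Keras itself, `pad_sequences` together with a `Masking` layer (or `Embedding` with `mask_zero=True`) plays the corresponding role.

```python
import numpy as np

def pad_to_max_length(sequences, input_dim, pad_value=0.0):
    """Pad variable-length sequences with pad_value to one common length."""
    max_len = max(len(seq) for seq in sequences)
    batch = np.full((len(sequences), max_len, input_dim), pad_value,
                    dtype="float32")
    for i, seq in enumerate(sequences):
        batch[i, :len(seq)] = seq  # real data first, padding at the end
    return batch

def padding_mask(batch, pad_value=0.0):
    """True for real time steps, False where every feature is padding."""
    return np.any(batch != pad_value, axis=-1)

# Three "sentences" of 3, 5 and 2 words, each word a 4-dimensional vector.
sentences = [np.ones((3, 4)), np.ones((5, 4)), np.ones((2, 4))]
batch = pad_to_max_length(sentences, input_dim=4)
mask = padding_mask(batch)

print(batch.shape)       # (3, 5, 4)
print(mask.sum(axis=1))  # [3 5 2]
```

The mask marks which time steps the RNN should actually process; a time step counts as padding only when all of its features equal the pad value, so a real word vector should never be all zeros under this scheme.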

To be continued. (Back to work for now.)
