The content and figures of this article are mainly based on: Understanding LSTM Networks.
The Core Idea of LSTM
LSTM was first proposed by Hochreiter & Schmidhuber in 1997. It was designed to address the long-term dependency problem in neural networks: remembering information over long intervals is the default behavior of an LSTM, rather than something the network must struggle to learn.
The LSTM Memory Unit
The following walks through the main parts of an LSTM unit:
The key to the LSTM is the cell state, the horizontal line running from left to right along the top of the LSTM unit in the diagram. Like a conveyor belt, it carries information from the previous unit to the next, with only a few minor linear interactions with the other parts.
The LSTM adds or discards information through gates, which is how it remembers or forgets. A gate is a structure that lets information pass selectively; it consists of a sigmoid function followed by a pointwise multiplication. The sigmoid outputs values in the interval [0, 1], where 0 means discard completely and 1 means pass through entirely. An LSTM unit has three such gates: the forget gate, the input gate, and the output gate.
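As a rough illustration of the gating idea, the sketch below (with made-up numbers, not taken from the referenced article) shows a sigmoid output filtering a vector by pointwise multiplication:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

v = np.array([2.0, -1.0, 0.5])                 # some information flowing through the cell
gate = sigmoid(np.array([-10.0, 0.0, 10.0]))   # roughly [0, 0.5, 1]

filtered = gate * v   # pointwise multiplication: 0 discards an entry, 1 passes it through
print(filtered)       # roughly [0, -0.5, 0.5]
```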
Forget gate: the forget gate feeds the previous unit's output h_{t-1} and the current input x_t into a sigmoid function, which produces a value in [0, 1] for each entry of the previous cell state c_{t-1}, controlling how much of the previous cell state is forgotten.
Input gate: the input gate works together with a tanh function to control what new information is added. The tanh function produces a candidate vector c~_t, and the input gate produces a value in [0, 1] for each entry of c~_t, controlling how much of the new information is added. At this point we have the forget gate's output f_t, which controls how much of the previous cell state is forgotten, and the input gate's output i_t, which controls how much new information is added, so we can update the cell state: c_t = f_t * c_{t-1} + i_t * c~_t.
Output gate: the output gate controls how much of the current cell state is exposed as output. The cell state is first passed through an activation (tanh), and the output gate produces a value in [0, 1] for each entry, controlling how much of the activated cell state is filtered out.
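Pulling the three gates from the preceding paragraphs together, here is a minimal sketch of one LSTM step in NumPy. The function name, the weight names (W_f, W_i, W_c, W_o, and the biases), and the shapes are the common textbook formulation, not code from the referenced article:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o):
    """One step of a standard LSTM cell; each W_* maps [h_{t-1}, x_t] to the hidden size."""
    z = np.concatenate([h_prev, x_t])

    f_t = sigmoid(W_f @ z + b_f)          # forget gate: how much of c_{t-1} to keep
    i_t = sigmoid(W_i @ z + b_i)          # input gate: how much new information to add
    c_tilde = np.tanh(W_c @ z + b_c)      # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde    # new cell state

    o_t = sigmoid(W_o @ z + b_o)          # output gate: how much of the cell state to expose
    h_t = o_t * np.tanh(c_t)              # new hidden state / output
    return h_t, c_t

# Tiny usage example with random weights (hidden size 4, input size 3):
hidden, inp = 4, 3
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((hidden, hidden + inp)) for _ in range(4)]
bs = [np.zeros(hidden) for _ in range(4)]
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(rng.standard_normal(inp), h, c, *Ws, *bs)
```

Note that the same concatenated vector [h_{t-1}, x_t] feeds all three gates; only the weights differ.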
LSTM Variants
The LSTM described above is the standard version, but not every LSTM looks exactly like this. In fact, nearly every paper uses a slightly different variant.
One popular LSTM variant, shown in the following figure, was first proposed by Gers & Schmidhuber in 2000. It adds "peephole connections", which let each gate look at ("peep into") the cell state. Here, the forget gate and the input gate are connected to the previous cell state, while the output gate is connected to the current cell state.
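A sketch of how peephole connections would change the gate computations, reusing the illustrative names from the sketch above; the elementwise peephole weights P_f, P_i, P_o are hypothetical names introduced for this example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def peephole_lstm_step(x_t, h_prev, c_prev,
                       W_f, W_i, W_c, W_o, b_f, b_i, b_c, b_o,
                       P_f, P_i, P_o):
    """Peephole variant: the gates also look at the cell state (elementwise P_* weights)."""
    z = np.concatenate([h_prev, x_t])

    # The forget and input gates peek at the *previous* cell state c_{t-1}.
    f_t = sigmoid(W_f @ z + P_f * c_prev + b_f)
    i_t = sigmoid(W_i @ z + P_i * c_prev + b_i)

    c_tilde = np.tanh(W_c @ z + b_c)
    c_t = f_t * c_prev + i_t * c_tilde

    # The output gate peeks at the *current* cell state c_t.
    o_t = sigmoid(W_o @ z + P_o * c_t + b_o)
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```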
Another variant couples the forget and input gates: instead of an independent input gate, the amount of new information added is set to the complement of the forget gate (the two sum to 1). That is, we only forget when we are about to add new information, and we only add new information where something old is forgotten.
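Under the same illustrative names as above, the coupled variant replaces the separate input gate with the complement of the forget gate, roughly like this:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Coupled forget/input gates: new information is written exactly where old information
# is forgotten (illustrative shapes: hidden size 4, concatenated [h_{t-1}, x_t] size 7).
z = np.random.randn(7)
c_prev = np.random.randn(4)
W_f, b_f = np.random.randn(4, 7), np.zeros(4)
W_c, b_c = np.random.randn(4, 7), np.zeros(4)

f_t = sigmoid(W_f @ z + b_f)
c_tilde = np.tanh(W_c @ z + b_c)
c_t = f_t * c_prev + (1.0 - f_t) * c_tilde    # the input gate is tied to 1 - f_t
```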
Another noteworthy variant is the Gated Recurrent Unit (GRU), first proposed by Cho, et al. in 2014. It merges the forget gate and the input gate into a single "update gate", and also merges the hidden state h_t and the cell state c_t; the resulting model is simpler than the standard LSTM.
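A minimal sketch of one GRU step under the same assumptions as before (illustrative names, standard textbook formulation rather than code from the referenced article):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W_z, W_r, W_h, b_z, b_r, b_h):
    """One step of a standard GRU cell: update gate z_t, reset gate r_t, single state h_t."""
    zx = np.concatenate([h_prev, x_t])

    z_t = sigmoid(W_z @ zx + b_z)      # update gate (plays the role of forget + input gate)
    r_t = sigmoid(W_r @ zx + b_r)      # reset gate
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]) + b_h)  # candidate state

    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde   # hidden state and cell state are merged into h_t
    return h_t
```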
Of course, there are many more variants that are not listed here. Some works have specifically compared the LSTM variants; see Greff, et al. (2015) and Jozefowicz, et al. (2015), whose results show that these variants perform roughly the same.