July Algorithm December Machine Learning Online Class: Lesson 20 Notes: Deep Learning (RNN)

Source: Internet
Author: User

Study notes for the July Algorithm (julyedu.com) December machine learning online class. http://www.julyedu.com

1 Recurrent neural networks (RNN)

Review of earlier material:

Fully connected feedforward network: what it learns is a function.

Convolutional network: convolution operations, local connections, and weight sharing; features of the raw image are extracted layer by layer (the same idea applies to speech and NLP).

What is learned is features.

Local correlation.

A shallow, wide network is hard to make work as a neural network.


1.1 States and models

1. IID data (independent and identically distributed)

• Classification problems

• Regression problems

• Feature representation

2. Most data are not IID

• Sequence analysis (tagging, annotation)

• Sequence generation, such as language translation and automatic text generation

• Content extraction, such as image description

For such data, the previous state needs to be fed into the current layer.

1.2 Sequence Samples

1. Input-to-output mappings (applications of sequences)

A. One-to-one: an ordinary neural network, without recurrence

B. One-to-many: describing a picture (image captioning)

C. Many-to-one: sentiment classification

D. Many-to-many: language translation

E. Sequence to sequence: l/r/u/d

• An RNN can not only process sequence input but also produce sequence output; here a sequence means a sequence of vectors.

• What an RNN learns is a program, not a function.


1.3 Sequence Predictions

• The input is a sequence of vectors that varies over time.

• At each time t, the model produces an estimate.


Problems

• The internal state is hard to model and observe

• Long-range context is hard to model and observe

• Solution: introduce an internal hidden state variable

The internal state corresponds to the position in the sequence.


1.4 Sequence Prediction Model

• Input: a discrete-time sequence

• Update computation at time t

The two diagrams above are equivalent: the hidden state h from time t-1 and the input at the current time together produce the output.

• Prediction computation

    1. W stays the same throughout the computation
    2. h is initialized at time 0
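The update and prediction steps can be sketched in NumPy. The original formulas did not survive in these notes, so this sketch assumes the common tanh recurrence; the names `rnn_forward`, `W_xh`, `W_hh`, and `W_hy` are illustrative and not taken from the lecture.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, h0=None):
    """Run a simple tanh RNN over a sequence of input vectors (assumed form)."""
    hidden_size = W_hh.shape[0]
    # h at time 0 is initialized to zeros unless the caller supplies a state
    h = np.zeros(hidden_size) if h0 is None else h0
    hs, ys = [], []
    for x in xs:
        # update: previous h and current x together give the new h;
        # the same weight matrices are reused at every time step
        h = np.tanh(W_xh @ x + W_hh @ h)
        hs.append(h)
        # prediction: read the output off the hidden state
        ys.append(W_hy @ h)
    return np.array(hs), np.array(ys)

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(4, 3))   # input (3) -> hidden (4)
W_hh = rng.normal(scale=0.1, size=(4, 4))   # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(2, 4))   # hidden -> output (2)
xs = rng.normal(size=(5, 3))                # a sequence of 5 input vectors
hs, ys = rnn_forward(xs, W_xh, W_hh, W_hy)
print(hs.shape, ys.shape)
```

The loop makes the weight sharing explicit: nothing inside it depends on the step index except the state h that is carried forward.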


1.4 RNN Training (1)

1. In the forward computation, the same matrix W is multiplied many times

2. Inputs x from many steps earlier affect the current output

3. In the backward computation, the same matrix is also multiplied many times

1.4.1 BPTT algorithm: backpropagation through time

1. Run the RNN forward computation

2. To compute the partial derivative with respect to W, sum the contributions from all time steps; the loss function has the same form at every step

3. Apply the chain rule
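The three steps can be sketched as follows for the recurrent matrix only. This is a minimal illustration under assumptions the notes do not state: a tanh recurrence, a squared-error loss at each step, and the hypothetical names `bptt_grad_W_hh`, `W_xh`, `W_hh`, `W_hy`.

```python
import numpy as np

def bptt_grad_W_hh(xs, targets, W_xh, W_hh, W_hy):
    """Return the total loss and dL/dW_hh for a tanh RNN with squared-error loss."""
    T, H = len(xs), W_hh.shape[0]
    hs = np.zeros((T + 1, H))              # hs[0] is the zero initial state
    loss = 0.0
    # Step 1: forward pass, storing every hidden state
    for t in range(T):
        hs[t + 1] = np.tanh(W_xh @ xs[t] + W_hh @ hs[t])
        err = W_hy @ hs[t + 1] - targets[t]
        loss += 0.5 * err @ err            # same loss form at every step
    # Steps 2 and 3: chain rule backwards in time, summing over all steps
    dW_hh = np.zeros_like(W_hh)
    dh_next = np.zeros(H)                  # gradient arriving from step t+1
    for t in reversed(range(T)):
        err = W_hy @ hs[t + 1] - targets[t]
        dh = W_hy.T @ err + dh_next        # from this step's loss and later steps
        da = (1.0 - hs[t + 1] ** 2) * dh   # back through tanh
        dW_hh += np.outer(da, hs[t])       # this step's contribution
        dh_next = W_hh.T @ da              # W transposed is applied once per step
    return loss, dW_hh

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.5, size=(4, 3))
W_hh = rng.normal(scale=0.5, size=(4, 4))
W_hy = rng.normal(scale=0.5, size=(2, 4))
xs = rng.normal(size=(6, 3))
targets = rng.normal(size=(6, 2))
loss, dW_hh = bptt_grad_W_hh(xs, targets, W_xh, W_hh, W_hy)
print(loss, dW_hh.shape)
```

A finite-difference comparison on a few entries of `W_hh` is a quick way to check a backward pass like this one.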


1.4.2 BPTT algorithm: Computational implementation

Apply the chain rule to the objective, using derivatives of vectors.

The objective being computed is a sum over time steps.

If the sequence has length 16, the transpose of W is multiplied 16 times, which can make the gradient explode. This happens easily in an RNN because the same W links consecutive time steps. In an ordinary network each layer has its own W, some large and some small, so the vanishing gradient problem is less severe.
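The effect of multiplying by the same transposed matrix 16 times can be seen in a small experiment. This sketch ignores the activation function's derivative and uses scaled identity matrices purely for illustration; `backprop_norms` is a hypothetical helper.

```python
import numpy as np

def backprop_norms(W, steps=16):
    """Norm of a gradient vector after each multiplication by W transposed."""
    n = W.shape[0]
    g = np.ones(n) / np.sqrt(n)        # unit-norm starting gradient
    norms = []
    for _ in range(steps):
        g = W.T @ g                    # the same matrix at every time step
        norms.append(np.linalg.norm(g))
    return norms

eye = np.eye(4)
print(backprop_norms(1.5 * eye)[-1])   # grows like 1.5**16 (explosion)
print(backprop_norms(0.5 * eye)[-1])   # shrinks like 0.5**16 (vanishing)
```

With a different matrix at each layer, as in an ordinary deep network, large and small factors partly offset each other, which is why the problem is milder there.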


Analysis of gradient vanishing/exploding phenomenon of BPTT algorithm


1.4.3 BPTT algorithm: solutions

1. Gradient clipping

2. Initialize W to the identity matrix and replace the tanh activation function with ReLU
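Both remedies are short to write down. The sketch below assumes clipping by L2 norm (clipping each element to a range is another common variant) and reads "W initialized to 1" as the identity matrix; `clip_by_norm`, `identity_init`, and `relu` are illustrative names.

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

def identity_init(hidden_size):
    """Recurrent weight matrix initialized to the identity (assumed reading)."""
    return np.eye(hidden_size)

def relu(a):
    """ReLU activation, used in place of tanh."""
    return np.maximum(a, 0.0)

g = np.array([3.0, 4.0])        # L2 norm is 5
print(clip_by_norm(g, 1.0))     # direction kept, norm reduced to 1
print(clip_by_norm(g, 10.0))    # already within the limit, unchanged
```

Clipping limits the size of an exploding gradient without changing its direction; the identity initialization with ReLU lets the hidden state pass through unchanged at the start of training.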


2 LSTM (long short-term memory) cell: long-term memory capability

It addresses vanishing and exploding gradients through its structure, avoids multiplying by the same W from start to finish, and has a degree of long-term memory.

The most widely used and successful RNN


2.1 Cell State (unit status)


1. The cell state can hold a value for a long time; the forget gate (the multiplication in the diagram) controls how much of the "old" state is preserved

2. The layer maps the input dimension x to the output dimension h


2.2 Forget/input gates

The gate values lie in [0, 1]; b is the bias.

2.3 Update Cell

2.4 Output

In summary, there are four weight matrices: W_f, W_i, W_C, W_o.
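One step of an LSTM cell using these four matrices might look like the sketch below. The equations in the notes were lost with the images, so this assumes the widely used formulation in which each gate acts on the concatenation of the previous output and the current input; `lstm_step` and the bias names are illustrative.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, C_prev, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o):
    """One LSTM time step (assumed standard formulation)."""
    z = np.concatenate([h_prev, x])     # previous output and current input
    f = sigmoid(W_f @ z + b_f)          # forget gate, values in [0, 1]
    i = sigmoid(W_i @ z + b_i)          # input gate, values in [0, 1]
    C_tilde = np.tanh(W_C @ z + b_C)    # candidate cell state
    C = f * C_prev + i * C_tilde        # update cell: keep old, add new
    o = sigmoid(W_o @ z + b_o)          # output gate
    h = o * np.tanh(C)                  # output
    return h, C

rng = np.random.default_rng(0)
n_x, n_h = 3, 4                         # input and output dimensions
W_f, W_i, W_C, W_o = (rng.normal(scale=0.1, size=(n_h, n_h + n_x))
                      for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros(n_h)
h, C = np.zeros(n_h), np.zeros(n_h)
for x in rng.normal(size=(5, n_x)):     # run over a sequence of 5 inputs
    h, C = lstm_step(x, h, C, W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o)
print(h.shape, C.shape)
```

The cell update is additive: when the forget gate is near 1 and the input gate near 0, the old cell state passes through almost unchanged, which is how the cell keeps a value over many steps.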
