Stanford Machine Learning Open Course Notes (6) - Neural Network Learning


Course address: https://class.coursera.org/ml-003/class/index

Instructor: Andrew Ng

1. Cost Function

The last lecture introduced the multiclass classification problem. It differs from the binary classification problem in that the network has multiple output units, one per class, which is summarized as follows:
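Concretely, for a K-class problem the label of each example is recoded as a one-hot vector, so the target and the network output are both K-dimensional. This is the standard course convention, restated here because the original slide is not reproduced:

$$ y^{(i)} \in \{0,1\}^{K}, \qquad h_\Theta(x^{(i)}) \in \mathbb{R}^{K}, \qquad \text{e.g. } y^{(i)} = (0,0,1,0)^{T} \text{ for class 3 of } K = 4 $$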

Recall also the cost function of regularized logistic regression:
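For reference (the slide with the formula is not reproduced here), the regularized logistic regression cost from the earlier lectures is:

$$ J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big) \Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2} $$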

The first term measures the difference between the true values and the hypothesis; the second term is the regularization term that penalizes the coefficients. In the same way, we can define the cost function of a neural network:
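Restated here since the slide image is missing:

$$ J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[\, y_k^{(i)}\log\big(h_\Theta(x^{(i)})\big)_k + \big(1-y_k^{(i)}\big)\log\big(1-\big(h_\Theta(x^{(i)})\big)_k\big) \Big] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{ji}^{(l)}\big)^{2} $$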

The error part is the sum, over all training examples and all output classes, of the distance between the actual and predicted values; it is followed by the regularization term over the weights.

2. Backpropagation Algorithm

Now that the form of the cost function is given, the familiar goal is to minimize it with respect to the parameters Θ:

To perform gradient descent, we need to compute the inputs and outputs (activations) of each layer of the neural network via forward propagation, using the same notation as before:
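With the sigmoid activation g, forward propagation computes the activations layer by layer (standard course notation, restated here since the slide is not shown; a bias unit a_0^{(l)} = 1 is added to each layer):

$$ a^{(1)} = x, \qquad z^{(l+1)} = \Theta^{(l)} a^{(l)}, \qquad a^{(l+1)} = g\big(z^{(l+1)}\big), \qquad h_\Theta(x) = a^{(L)} $$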

Here we define an error term δ for each node, which measures how much that node contributes to the error in the final output. For the last layer the error can be computed directly; for the hidden layers in front of it, the error can only be obtained by propagating backwards from the layer after it, which is where the name backpropagation comes from.
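For the sigmoid activation used in the course, the error terms are as follows (⊙ denotes the elementwise product; restated here because the slide is missing):

$$ \delta^{(L)} = a^{(L)} - y, \qquad \delta^{(l)} = \big(\Theta^{(l)}\big)^{T}\delta^{(l+1)} \odot g'\big(z^{(l)}\big), \qquad g'\big(z^{(l)}\big) = a^{(l)} \odot \big(1 - a^{(l)}\big) $$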

For the detailed derivation, see Wikipedia:

http://en.wikipedia.org/wiki/Backpropagation

The full algorithm for computing the gradients used in gradient descent can be described as follows:

Here capital Δ accumulates the errors over all training examples, and each layer l has its own Δ^(l). From these accumulators we obtain D, the partial derivatives of the cost function with respect to the parameters. Whether the subscript j equals 0 (i.e., whether the weight multiplies the bias unit) determines whether the regularization term is added.
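As a concrete illustration, here is a minimal NumPy sketch of one pass of forward and backward propagation for a network with a single hidden layer, accumulating the errors and turning them into the gradients D. The function name, the vectorized batch form, and the network shapes are my own choices for the sketch, not the course's Octave code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_gradients(Theta1, Theta2, X, Y, lam):
    """Gradients D1, D2 of the regularized cost for a 3-layer network.
    Theta1: (hidden, n+1), Theta2: (K, hidden+1), X: (m, n), Y: (m, K) one-hot."""
    m = X.shape[0]

    # Forward propagation (a bias unit is prepended at each layer)
    a1 = np.hstack([np.ones((m, 1)), X])
    z2 = a1 @ Theta1.T
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])
    a3 = sigmoid(a2 @ Theta2.T)                      # hypothesis h_Theta(x)

    # Backward propagation of the error terms delta
    d3 = a3 - Y                                      # output-layer error
    d2 = (d3 @ Theta2)[:, 1:] * sigmoid(z2) * (1 - sigmoid(z2))  # hidden-layer error

    # Accumulate Delta over all examples and convert to gradients D
    D1 = d2.T @ a1 / m
    D2 = d3.T @ a2 / m

    # Regularize every weight except the ones multiplying the bias unit (j = 0)
    D1[:, 1:] += (lam / m) * Theta1[:, 1:]
    D2[:, 1:] += (lam / m) * Theta2[:, 1:]
    return D1, D2
```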

3. Backpropagation Intuition

This section gives an example of applying the BP algorithm to a neural network. First, we define the network structure and some notation for the inputs and outputs of each unit:

To simplify, the cost function here omits the final regularization term (i.e., λ = 0):

For the i-th example, cost(i) is defined as follows; if you are familiar with the derivation of δ, you can see that:
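For reference (the slide is not reproduced), the per-example cost and the informal meaning of δ are:

$$ \text{cost}(i) = y^{(i)}\log h_\Theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\Theta(x^{(i)})\big), \qquad \delta_j^{(l)} = \frac{\partial}{\partial z_j^{(l)}}\,\text{cost}(i) $$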

Meanwhile, for each layer, a node's δ equals the weighted sum of the δ values of the next layer, where the weights are the corresponding Θ parameters on the connecting edges:
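In symbols, ignoring the derivative factor as the lecture's intuition does, a hidden node's error is the Θ-weighted sum of the errors of the nodes it feeds into:

$$ \delta_j^{(l)} = \sum_{k} \Theta_{kj}^{(l)}\,\delta_k^{(l+1)} $$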

4. Gradient Checking

While solving, we can check the gradient to make sure there is no problem with our code. As the figure shows, take two points on either side of Θ, namely (Θ + ε) and (Θ - ε); the derivative (gradient) at Θ is then approximately (J(Θ + ε) - J(Θ - ε)) / (2ε):

The same approximation is applied to each parameter in turn:
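A minimal NumPy sketch of this two-sided check; the function name is illustrative, and ε = 10⁻⁴ is a typical choice rather than a value fixed by the lecture:

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Two-sided finite-difference approximation of dJ/dtheta for a 1-D
    parameter vector theta (unroll weight matrices into a vector first)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (J(theta + step) - J(theta - step)) / (2 * eps)
    return grad
```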

Since the BP algorithm already gives us the derivative D of J, we can compare this approximation with D: if the two results are close, the code is correct; otherwise there is an error.

Note in particular: once you have verified that the two gradients agree, turn gradient checking off before training, since the numerical approximation is far slower than backpropagation.

5. Random Initialization

For initializing the Θ parameters, the simplest idea is to set them all to 0 at first:

However, with this assignment the hidden nodes are all identical at the start of the computation: a1 and a2 are computed in exactly the same way and give the same result, which is equivalent to having a single node and wastes the network's capacity. To break this symmetry, initialize Θ randomly:
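A minimal sketch of such symmetry-breaking initialization; drawing from a uniform interval [-ε_init, ε_init] with ε_init = 0.12 is a common convention and is assumed here, not dictated by the lecture:

```python
import numpy as np

def rand_initialize_weights(l_in, l_out, epsilon_init=0.12):
    """Weight matrix for a layer with l_in inputs (plus a bias unit) and l_out
    outputs, with entries drawn uniformly from [-epsilon_init, epsilon_init]."""
    return np.random.uniform(-epsilon_init, epsilon_init, size=(l_out, l_in + 1))
```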

6. Putting It Together

What do we need to do to train a neural network?

First, select the network structure:

Then train the weights, which involves random initialization and the BP algorithm:
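Putting the pieces together, a rough training loop might look like the sketch below. It reuses the helper functions sketched earlier in these notes (rand_initialize_weights, backprop_gradients), and the layer sizes, placeholder data, learning rate, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

# Illustrative network shape and hyperparameters (assumptions, not course values)
n_features, n_hidden, n_classes = 400, 25, 10
lam, alpha, n_iters = 1.0, 0.5, 400

# 1. Random initialization to break symmetry
Theta1 = rand_initialize_weights(n_features, n_hidden)
Theta2 = rand_initialize_weights(n_hidden, n_classes)

# Placeholder training data: X (m, n_features), Y (m, n_classes) one-hot
X = np.random.rand(100, n_features)
Y = np.eye(n_classes)[np.random.randint(0, n_classes, 100)]

# 2. Gradient descent using the gradients from backpropagation
for _ in range(n_iters):
    D1, D2 = backprop_gradients(Theta1, Theta2, X, Y, lam)
    Theta1 -= alpha * D1
    Theta2 -= alpha * D2
```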

Finally, use the gradient checking method described above to verify that the computed gradients (and hence the training code) are correct:

This completes the training process of a neural network.

-------------------------------------------------- Weak split line ----------------------------------------------

The focus of this lecture is the BP algorithm. However, since the video does not go into much detail on the derivation, the conclusions are hard to retain and you really have to work through the derivation yourself. With this algorithm in place, a neural network gains a self-learning process: you only need to define the network's structure and initial values.
