Deep learning from scratch: building a neural network (I)


Last Update:2018-08-22 Source: Internet

Author: User


Artificial intelligence is not mysterious; knowing a little subtraction is enough.

When a biological neuron is stimulated, it releases neurotransmitters to the next neuron, and the amount released varies with the strength of the stimulus. A neural network is built by mimicking this process:

A data item x is fed in, simulating an outside stimulus. Each neuron processes its input and outputs f(x), which is passed on to the next neuron. Proceeding step by step, the network finally outputs a value z, which is compared with the given value (supervised learning). Based on the comparison, the parameters of each neuron's internal function are adjusted; this is called backpropagation. The process then repeats in the next round until the specified number of iterations is reached, and by constantly adjusting the parameters the network approaches the point of minimum loss.

The output value z therefore needs a defined range. For example, when deciding whether each picture in a set shows a cat, every label in the training sample is 0 or 1, so the output must also correspond to 0 or 1 and has to be mapped to 0/1. If the number of categories increases, the range should be expanded accordingly.

Let's learn through an example:

The input is x, for example a picture of 34*34 pixels. Under the RGB color model that gives 34*34*3 values in total, so x is mapped to a (34*34*3, 1) matrix, arranged as R (34*34), then G (34*34), then B (34*34). (The weight matrix w is generally used in transposed form, written w^T.)
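As a rough illustration of this flattening step, here is a minimal NumPy sketch. The random image, the helper name `flatten_rgb`, and the division by 255 to scale pixel values are all assumptions made for the example; none of them come from the original code.

```python
import numpy as np

# Hypothetical 34x34 RGB image with random pixel values,
# stored as (height, width, channel).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(34, 34, 3), dtype=np.uint8)

def flatten_rgb(img):
    """Return a (H*W*3, 1) column: all R values, then all G, then all B."""
    # Put the channel axis first so each color plane is unrolled as one run.
    planes = np.transpose(img, (2, 0, 1))
    # Scaling to [0, 1] is an assumption, not something the article states.
    return planes.reshape(-1, 1).astype(np.float64) / 255.0

x = flatten_rgb(image)
print(x.shape)  # (3468, 1), since 34 * 34 * 3 = 3468
```

The first 34*34 entries of `x` are the red plane, the next 34*34 the green plane, and the last 34*34 the blue plane.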

The next step is to choose the function f(x). Start with a simple linear (first-order) neural network. The familiar first-order equation in one variable has the form f(x) = ax + b, so we use a similar form:

f (x) = w*x + b

Here w is the weight and b is the bias. These are the parameters adjusted during learning; in other words, learning means adjusting the values of w and b.

Note:

Matrix multiplication places requirements on the matrix sizes. Here x has shape (34*34*3, 1), so w must have 34*34*3 columns. If we have only one layer of neurons, f(x) is a single number, that is, a (1, 1) matrix, so w should have shape (1, 34*34*3), and b is just a constant.
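The shape bookkeeping can be checked directly. The sketch below assumes NumPy, random input values, and zero-initialized weights (the initialization is an assumption for illustration only). It also shows that the same w works when several images are stacked as columns.

```python
import numpy as np

n = 34 * 34 * 3            # number of input values per image
rng = np.random.default_rng(0)

x = rng.random((n, 1))     # one flattened image, shape (n, 1)
w = np.zeros((1, n))       # weights, shape (1, n); zero start is an assumption
b = 0.0                    # bias, a plain scalar

z = w @ x + b              # (1, n) @ (n, 1) -> (1, 1)
print(z.shape)             # (1, 1)

# m images stacked as columns: the same w handles all of them at once.
m = 5
X = rng.random((n, m))
Z = w @ X + b              # (1, n) @ (n, m) -> (1, m)
print(Z.shape)             # (1, 5)
```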

The range of a linear equation is (-∞, +∞), so we need to map it into the range from 0 to 1, which requires another equation:

sigmoid(z) = 1 / (1 + e^(-z))

This maps the variable into the range between 0 and 1. The reason for choosing this equation is that its derivative is easy to compute: s'(x) = s(x) * (1 - s(x)).
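A short sketch of the sigmoid and its derivative, assuming NumPy. The derivative is written in the product form s(z) * (1 - s(z)) and compared against a central finite difference as a sanity check; the function names are chosen here for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))  # all values lie between 0 and 1; sigmoid(0) is 0.5

# Central finite difference as an independent check of the derivative.
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(numeric, sigmoid_grad(z)))  # True
```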

Judging whether the parameters are reasonable also requires an equation: the loss function. It measures the deviation between the predicted value and the standard (supervised) value, and minimizing this deviation gives better prediction (classification). A commonly used loss function is:

By definition, a is the predicted value and y is the true value.

L(a, y) = -y * log(a) - (1 - y) * log(1 - a)
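To get a feel for this loss, the sketch below evaluates it for a few hypothetical predictions when the true label is 1. It uses the natural logarithm, which is an assumption since the formula does not name a base. The further the prediction is from the label, the larger the loss.

```python
import math

def loss(a, y):
    """Loss for one prediction a in (0, 1) and one label y in {0, 1}."""
    return -y * math.log(a) - (1 - y) * math.log(1 - a)

# True label is 1; predictions range from confident-right to confident-wrong.
for a in (0.9, 0.5, 0.1):
    print(f"a={a}: loss={loss(a, 1):.3f}")
# a=0.9: loss=0.105
# a=0.5: loss=0.693
# a=0.1: loss=2.303
```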

If there are m training examples, the cost J is the mean loss: add up the loss of every example and divide by m, that is

J = SUM(L_i) / m
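In vectorized form the mean over m examples is a single expression. The sketch below assumes NumPy, with predictions A and labels Y both stored as (1, m) rows; the sample values are made up for illustration.

```python
import numpy as np

def cost(A, Y):
    """Mean of the per-example losses; A and Y both have shape (1, m)."""
    m = Y.shape[1]
    losses = -Y * np.log(A) - (1 - Y) * np.log(1 - A)
    return float(np.sum(losses) / m)

A = np.array([[0.9, 0.2, 0.6]])   # hypothetical predictions
Y = np.array([[1, 0, 1]])         # hypothetical labels
print(cost(A, Y))                 # roughly 0.28
```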

After calculating the cost, we need to adjust the values of w and b. Because the goal is to reduce the cost, we differentiate the cost function to get the gradient and move step by step toward the minimum. Here 'dw' and 'db' are shorthand for easy representation in Python code; their real meaning is the derivative on the right-hand side of each equation:

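'dw' = dJ/dw = (dJ/dz) * (dz/dw) = x * (a - y)^T / m

'db' = dJ/db = SUM(a - y) / m

With m examples stacked as the columns of x, x has shape (34*34*3, m) and a and y have shape (1, m), so x * (a - y)^T has shape (34*34*3, 1), which is the shape of w^T.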
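These two gradients can be sketched in NumPy as follows. The product x * (a - y)^T / m comes out as a column, so the sketch transposes it to match a w stored as a (1, n) row; the tiny random data set and the function names are assumptions for illustration. A finite-difference check on db is included as a sanity test.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, Y):
    """Mean loss for w of shape (1, n), X of shape (n, m), Y of shape (1, m)."""
    A = sigmoid(w @ X + b)
    return float(np.sum(-Y * np.log(A) - (1 - Y) * np.log(1 - A)) / X.shape[1])

def gradients(w, b, X, Y):
    """Return (dw, db), with dw in the same (1, n) shape as w."""
    m = X.shape[1]
    A = sigmoid(w @ X + b)          # predictions, shape (1, m)
    dw = (X @ (A - Y).T / m).T      # x * (a - y)^T / m, transposed to match w
    db = float(np.sum(A - Y) / m)
    return dw, db

# Tiny hypothetical data set: n = 4 features, m = 6 examples.
rng = np.random.default_rng(0)
X = rng.random((4, 6))
Y = rng.integers(0, 2, size=(1, 6))
w = rng.normal(size=(1, 4)) * 0.1
b = 0.0
dw, db = gradients(w, b, X, Y)

# Finite-difference check of db against the cost function.
eps = 1e-6
numeric_db = (cost(w, b + eps, X, Y) - cost(w, b - eps, X, Y)) / (2 * eps)
print(dw.shape, abs(db - numeric_db) < 1e-6)
```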

So the new values are:

w = w - α * dw

b = b - α * db

where α is the learning rate. The new w and b are used in the next iteration.

Set the number of iterations. When the iterations finish, you have the final parameters w and b. Use test cases to verify the recognition accuracy, which should generally be above 70%.
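Putting the steps together, the following is a minimal sketch of the whole loop, not the code from the repository. It assumes NumPy and a synthetic two-feature data set (label 1 when the two features sum to more than 1) in place of real images; the iteration count, learning rate, zero initialization, and 0.5 decision threshold are likewise illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, iterations=2000, alpha=0.5):
    """Gradient descent for one sigmoid neuron; X is (n, m), Y is (1, m)."""
    n, m = X.shape
    w = np.zeros((1, n))   # zero initialization is an assumption
    b = 0.0
    costs = []
    for _ in range(iterations):
        A = sigmoid(w @ X + b)                                  # forward pass
        costs.append(float(np.sum(-Y * np.log(A) - (1 - Y) * np.log(1 - A)) / m))
        dw = (A - Y) @ X.T / m     # row form of x * (a - y)^T / m
        db = float(np.sum(A - Y) / m)
        w = w - alpha * dw         # update step
        b = b - alpha * db
    return w, b, costs

def predict(w, b, X):
    """Map the sigmoid output to 0/1 with a 0.5 threshold."""
    return (sigmoid(w @ X + b) > 0.5).astype(int)

def make_data(rng, m):
    """Synthetic stand-in for image data: label is 1 when x0 + x1 > 1."""
    X = rng.random((2, m))
    Y = (X[0] + X[1] > 1.0).astype(int).reshape(1, m)
    return X, Y

rng = np.random.default_rng(0)
X_train, Y_train = make_data(rng, 200)
X_test, Y_test = make_data(rng, 100)

w, b, costs = train(X_train, Y_train)
accuracy = float(np.mean(predict(w, b, X_test) == Y_test))
print(f"cost: {costs[0]:.3f} -> {costs[-1]:.3f}, test accuracy: {accuracy:.2f}")
```

On real images, `X_train` would instead hold the flattened pictures as columns, and the accuracy would depend heavily on the data.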

My blog post "Know the crawler" covers writing the code that crawls the training data; here you only need to understand how the neural network works.

The specific code is in base_func.py at https://github.com/Lee-Jiazheng/My_neural_network.git. It implements a simple neural network; follow-up posts will increase the number of neurons and improve the speed.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
