Notes organized from Week 4 of Andrew Ng's Machine Learning course.
Directory:
- Why use neural networks?
- Model representation of neural networks 1
- Model representation of neural networks 2
- Example 1
- Example 2
- Multi-class classification problems
1. Why use neural networks
When we have many features, say $x_1, x_2, x_3, \dots, x_{100}$:
Suppose we use a non-linear model that includes polynomial terms up to degree 2. Then for a non-linear classification problem, logistic regression looks like:
$g(\theta_0+\theta_1x_1+\theta_2x_2+\theta_3x_1x_2+\theta_4x_1^2x_2+\dots)$
With $n$ original features there are approximately $\frac{n^2}{2}$ quadratic terms, i.e. $O(n^2)$ features; with degree-3 terms the count grows to $O(n^3)$.
Such a large number of features has two consequences: 1. the likelihood of overfitting increases; 2. the computation is expensive.
As a more extreme example, consider images, where each pixel is a feature. Even for a 50×50 image (already a very small picture), a grayscale image has 2,500 features and an RGB image has 7,500, with each feature taking values from 0 to 255.
For such an image, if we include all quadratic (pairwise) terms, we end up with about 3 million features; running logistic regression on that is quite expensive.
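As a quick sanity check on these counts, here is a minimal Python sketch (the helper name `num_quadratic_terms` is just for illustration); it counts the distinct quadratic terms $x_ix_j$ with $i \le j$:

```python
from math import comb

def num_quadratic_terms(n):
    # Distinct products x_i * x_j with i <= j: C(n, 2) cross terms plus n squares
    return comb(n, 2) + n

print(num_quadratic_terms(100))    # 5050, roughly 100^2 / 2
print(num_quadratic_terms(2500))   # 3126250, roughly 3 million for a 50x50 grayscale image
```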
This is where neural networks come in.
2. Model representation of neural networks 1
The basic structure of the neural network is as follows:
$x_0, x_1, x_2, x_3$ are the input units; $x_0$ is also known as the bias unit and can be set to 1;
$\theta$ are the weights (or simply the parameters) that connect one layer to the next;
$h_\theta(x)$ is the output;
For the following network structure, we have the following definitions and calculation formulas:
$a_i^{(j)}$: the activation (i.e. the value) of unit $i$ in layer $j$; the middle layers are called hidden layers
$s_j$: the number of units in layer $j$
$\theta^{(j)}$: the weight matrix controlling the mapping from layer $j$ to layer $j+1$; $\theta^{(j)}$ has dimension $s_{j+1} \times (s_j+1)$ (e.g. with $s_1=3$ and $s_2=3$, $\theta^{(1)}$ is $3 \times 4$)
The formula for $a^{(2)}$ is:
$a_1^{(2)}=g(\theta_{10}^{(1)}x_0+\theta_{11}^{(1)}x_1+\theta_{12}^{(1)}x_2+\theta_{13}^{(1)}x_3)$
$a_2^{(2)}=g(\theta_{20}^{(1)}x_0+\theta_{21}^{(1)}x_1+\theta_{22}^{(1)}x_2+\theta_{23}^{(1)}x_3)$
$a_3^{(2)}=g(\theta_{30}^{(1)}x_0+\theta_{31}^{(1)}x_1+\theta_{32}^{(1)}x_2+\theta_{33}^{(1)}x_3)$
So in the same vein,
$h_\theta(x)=a_1^{(3)}=g(\theta_{10}^{(2)}a_0^{(2)}+\theta_{11}^{(2)}a_1^{(2)}+\theta_{12}^{(2)}a_2^{(2)}+\theta_{13}^{(2)}a_3^{(2)})$
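A minimal numpy sketch of these unit-by-unit formulas (the layer sizes match the network above; the random weights are placeholders, not trained values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -1.2, 2.0])   # x_0 = 1 is the bias unit
theta1 = np.random.randn(3, 4)        # theta^(1): s_2 x (s_1 + 1) = 3 x 4
theta2 = np.random.randn(1, 4)        # theta^(2): 1 x (s_2 + 1) = 1 x 4

# a_i^(2) = g(theta_i0 x_0 + theta_i1 x_1 + theta_i2 x_2 + theta_i3 x_3)
a2 = np.array([sigmoid(theta1[i] @ x) for i in range(3)])

# h_theta(x) = a_1^(3), computed from a^(2) with the bias unit a_0^(2) = 1
a2 = np.concatenate(([1.0], a2))
h = sigmoid(theta2[0] @ a2)
```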
3. Model representation of neural networks 2
Forward propagation: vectorized implementation
To vectorize the formulas above, define:
$z_1^{(2)}=\theta_{10}^{(1)}x_0+\theta_{11}^{(1)}x_1+\theta_{12}^{(1)}x_2+\theta_{13}^{(1)}x_3$
$a_1^{(2)}=g(z_1^{(2)})$
In vector form:
$a^{(1)}=x=\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}$, $z^{(2)}=\begin{bmatrix} z_1^{(2)} \\ z_2^{(2)} \\ z_3^{(2)} \end{bmatrix}$, $\theta^{(1)}=\begin{bmatrix} \theta_{10}^{(1)} & \theta_{11}^{(1)} & \theta_{12}^{(1)} & \theta_{13}^{(1)} \\ \theta_{20}^{(1)} & \theta_{21}^{(1)} & \theta_{22}^{(1)} & \theta_{23}^{(1)} \\ \theta_{30}^{(1)} & \theta_{31}^{(1)} & \theta_{32}^{(1)} & \theta_{33}^{(1)} \end{bmatrix}$
So:
$z^{(2)}=\theta^{(1)}a^{(1)}$
$a^{(2)}=g(z^{(2)})$
Then, adding the bias unit $a^{(2)}_0=1$:
$z^{(3)}=\theta^{(2)}a^{(2)}$
$a^{(3)}=h_\theta(x)=g(z^{(3)})$
The above is the vectorized form of forward propagation.
Each layer's activations $a^{(j)}$ learn different features; a numpy sketch of the whole computation follows below.
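Putting the vectorized steps together, here is a minimal sketch of forward propagation (random placeholder weights; the function name `forward` is illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, thetas):
    """a^(1) = x with bias; then z^(j+1) = theta^(j) a^(j), a^(j+1) = g(z^(j+1))."""
    a = np.concatenate(([1.0], x))           # a^(1), bias unit prepended
    for j, theta in enumerate(thetas):
        z = theta @ a                        # z^(j+1) = theta^(j) a^(j)
        a = sigmoid(z)                       # a^(j+1) = g(z^(j+1))
        if j < len(thetas) - 1:
            a = np.concatenate(([1.0], a))   # add bias unit for the next layer
    return a                                 # h_theta(x) = a^(L)

theta1 = np.random.randn(3, 4)   # layer 1 -> 2: s_2 x (s_1 + 1)
theta2 = np.random.randn(1, 4)   # layer 2 -> 3: s_3 x (s_2 + 1)
h = forward(np.array([0.5, -1.2, 2.0]), [theta1, theta2])
```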
4. Example 1
First consider a classification problem, XOR/XNOR: for $x_1, x_2 \in \{0,1\}$, XOR gives $y=1$ when $x_1$ and $x_2$ differ ((0,1) or (1,0)) and $y=0$ when they are the same; XNOR is its negation, $y = x_1\ \text{XNOR}\ x_2$.
Start with the simple classification problem AND:
The following neural network structure can be used to obtain the correct classification results.
Similarly, for OR, we can design an analogous network and get the right result; a sketch of both units follows below.
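Both gates can be realized by a single sigmoid unit. A minimal sketch using the weights from the course slides ($-30, 20, 20$ for AND and $-10, 20, 20$ for OR):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit(theta, x1, x2):
    # Single sigmoid unit: g(theta_0 + theta_1 x_1 + theta_2 x_2)
    return sigmoid(theta @ np.array([1.0, x1, x2]))

AND = np.array([-30.0, 20.0, 20.0])   # ~1 only when x1 = x2 = 1
OR  = np.array([-10.0, 20.0, 20.0])   # ~1 when either input is 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(unit(AND, x1, x2)), round(unit(OR, x1, x2)))
```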
5. Example 2
Continuing from the examples above, for NOT, the following network structure classifies correctly (sketch below):
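NOT needs only one input. A sketch with the course's weights ($10, -20$), so the unit fires exactly when $x = 0$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

NOT = np.array([10.0, -20.0])   # g(10 - 20x): ~1 for x = 0, ~0 for x = 1
for x in (0, 1):
    print(x, round(sigmoid(NOT @ np.array([1.0, x]))))
```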
Now let's return to the problem mentioned at the beginning: XNOR.
Combining these simple networks (AND, OR, NOT), we get a network structure that solves the XNOR problem; a sketch follows below:
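A minimal sketch of the combined network: the hidden layer computes $a_1 = x_1\,\text{AND}\,x_2$ and $a_2 = (\text{NOT}\,x_1)\,\text{AND}\,(\text{NOT}\,x_2)$, and the output unit ORs them, using the weights from the course slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta1 = np.array([[-30.0,  20.0,  20.0],    # a_1 = x1 AND x2
                   [ 10.0, -20.0, -20.0]])   # a_2 = (NOT x1) AND (NOT x2)
theta2 = np.array([-10.0, 20.0, 20.0])       # output = a_1 OR a_2

def xnor(x1, x2):
    a = sigmoid(theta1 @ np.array([1.0, x1, x2]))        # hidden activations
    return sigmoid(theta2 @ np.concatenate(([1.0], a)))  # output unit

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xnor(x1, x2)))   # 1, 0, 0, 1
```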
6. Multi-class classification problems
To solve a multi-class classification problem with a neural network, we again use the one-vs-all idea. In binary classification the output is either 0 or 1; in a multi-class problem the output is a one-hot vector, $h_\theta(x) \in \mathbb{R}^K$, where $K$ is the number of classes.
For example, for a 4-class problem, the output might be:
Category 1: $\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$, category 2: $\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$, category 3: $\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$, etc.
That is, rather than having $h_\theta(x)$ output the class labels 1, 2, 3, 4 directly. A sketch of this encoding follows below:
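A minimal sketch of the one-hot targets and of reading off the predicted class (the helper name `one_hot` is illustrative):

```python
import numpy as np

def one_hot(label, num_classes):
    # Encode a 0-indexed class label as a one-hot vector in R^K
    y = np.zeros(num_classes)
    y[label] = 1.0
    return y

h = np.array([0.1, 0.05, 0.7, 0.15])   # example network output for K = 4
predicted = np.argmax(h)               # predicted class index: 2
target = one_hot(predicted, 4)         # [0., 0., 1., 0.]
```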