Understanding the role of activation functions in neural network models

What is an activation function?

When biologists studied how neurons in the brain work, they found that a neuron that starts firing is said to be in an activated state. That is probably why the function applied inside each unit of a neural network model is called an activation function.

So what exactly is an activation function? We can begin to understand it through the logistic regression model. The figure below shows a logistic regression classifier:

In the figure above, the logistic regression classifier first forms a weighted sum of all inputs (the net input function, z = w·x) and then passes the result through an activation function. For logistic regression this is the sigmoid function, so the output is φ(z) = 1 / (1 + e^(−z)).

If we remove the error (loss) computation and the unit step function from the logistic regression classifier, what remains is a single neuron.
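Such a neuron can be sketched in a few lines of NumPy (a minimal sketch; the weights, bias, and input below are made-up illustration values, not learned parameters):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single neuron: weighted sum of inputs followed by the activation."""
    z = np.dot(w, x) + b      # net input
    return sigmoid(z)         # activation

# Hypothetical input, weights, and bias, for illustration only
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
b = 0.1
print(neuron(x, w, b))        # a value strictly between 0 and 1
```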
A neural network is simply many such neurons combined in width and depth. Put plainly, the activation function is the function applied to the output of each neuron in the network. For example, in the following image:

All of the hidden-layer neurons (a) and the output-layer neurons (y) pass through an activation function. So why not the input layer (x)? Although the input layer, hidden layer, and output layer are all drawn as circles in diagrams like the one above, the units of the input layer are not neurons; they simply pass the raw inputs forward.
So which kinds of functions are generally chosen as activation functions in a neural network? Classically, smooth nonlinear functions such as the sigmoid function shown above, or the closely related hyperbolic tangent (tanh):
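The two classic choices can be compared directly (a small sketch; the evaluation grid is arbitrary):

```python
import numpy as np

def sigmoid(z):
    # Squashes z into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes z into (-1, 1); tanh is a shifted, rescaled sigmoid
    return np.tanh(z)

z = np.linspace(-5, 5, 11)
print(np.round(sigmoid(z), 3))
print(np.round(tanh(z), 3))
```

Both saturate for large |z|, which is what later motivates ReLU in deep networks.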

In addition, in deep neural networks the more common choice is the ReLU (rectified linear unit) function, which we introduce in the last section.

The role of the activation function

We continue with the 3-layer neural network model above. Let W(1) denote the first-layer weights (for example, the weight connecting input x1 to hidden neuron a1) and W(2) the second-layer weights (for example, the weight connecting hidden neuron a1 to output y1). If the activation function is removed from the model, the output of each neuron is just a weighted sum of its inputs:

a = W(1) x

Substituting this in gives the relationship between y and x:

y = W(2) a = W(2) (W(1) x)

so the final output is:

y = (W(2) W(1)) x

As can be seen, without an activation function, no matter how we train the parameters of the neural network, the result is a linear model: a line in two-dimensional space, a plane in three-dimensional space. Linear models are severely limited; consider, for example, the following problem:
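This collapse can be checked numerically (a quick sketch with random, made-up weight matrices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a 2-layer network with no activation function
W1 = rng.standard_normal((3, 2))   # input (2) -> hidden (3)
W2 = rng.standard_normal((1, 3))   # hidden (3) -> output (1)

x = rng.standard_normal(2)

# Layer-by-layer forward pass, no activations
y_deep = W2 @ (W1 @ x)

# The same map collapsed into a single linear layer
W = W2 @ W1
y_single = W @ x

# Matrix multiplication is associative, so the two outputs coincide
print(np.allclose(y_deep, y_single))
```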

A linear model can never separate the orange points from the blue points. Once we add an activation function, the same kind of network structure can solve this linearly non-separable problem. (Note that the network in the figure below is a different network from the one in the equations above.)
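The textbook linearly non-separable problem is XOR. With a nonlinear activation, even a tiny one-hidden-layer network can represent it, while no linear model can. The weights below are hand-picked for illustration, not learned:

```python
import numpy as np

def relu(z):
    # ReLU: 0 for negative inputs, identity for positive inputs
    return np.maximum(0, z)

def xor_net(x1, x2):
    """A tiny network with two ReLU hidden neurons and a linear output
    neuron, with hand-picked weights, that computes XOR exactly."""
    h1 = relu(x1 + x2)        # hidden neuron 1
    h2 = relu(x1 + x2 - 1)    # hidden neuron 2
    return h1 - 2 * h2        # linear output neuron

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))   # prints the XOR truth table
```

Removing the ReLUs here reduces the whole network to a linear function of (x1, x2), which cannot match XOR on all four points.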

So, to summarize: the role of the activation function in a neural network is to produce a nonlinear decision boundary by forming nonlinear combinations of the weighted inputs.

Activation functions in deep neural networks

In this last part we look at activation functions in deep neural networks. Their role is the same as in shallow networks, namely adding nonlinearity, but the ReLU (rectified linear unit) function is used instead, mainly to avoid the vanishing-gradient problem caused by the sigmoid function (this is not the focus of this article, so we do not elaborate on it). The following diagram shows the ReLU function:

You can see that it is a piecewise-linear function: f(x) = 0 for all x ≤ 0, and f(x) = x for all x > 0. The reason this function can serve as a neural network activation function is that, in a multidimensional space, any curved surface can be approximated by many flat pieces. The decision surface is such a surface, and a deep neural network relies on its complex structure and depth to fit the decision surface out of many planar pieces, finally achieving satisfactory results.
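A minimal sketch of ReLU, plus a hint of why piecewise-linear pieces are so expressive: sums of shifted ReLUs can build arbitrary piecewise-linear shapes (the "tent" function below is a made-up illustration of this idea):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = 0 for x <= 0, f(x) = x for x > 0."""
    return np.maximum(0.0, x)

def tent(x):
    # A sum of shifted ReLUs forms a piecewise-linear "tent":
    # it rises on [0, 1], falls on [1, 2], and is 0 elsewhere.
    return relu(x) - 2 * relu(x - 1) + relu(x - 2)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(relu(x))   # [0. 0. 0. 1. 2.]
print(tent(np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0])))
```

Stacking many such pieces across many dimensions is exactly the "multiple planar pieces fitting a decision surface" picture described above.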

