Understanding the role of activation functions in neural network models

What is an activation function?

When biologists studied how neurons in the brain work, they found that a neuron that starts firing is said to be in an activated state. That is probably why the function applied inside each unit of a neural network model is called an activation function.

So what exactly is an activation function? We can begin to understand it through the logistic regression model. The figure below shows a logistic regression classifier:

In the figure above, the logistic regression classifier first forms a weighted sum of all inputs (the net input function, z = w·x) and then passes the result through an activation function. For logistic regression this is the sigmoid function, so the output is φ(z) = 1 / (1 + e^(−z)).

If we remove the error (loss) computation and the unit step function from the logistic regression classifier, what remains is a single neuron.
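Such a neuron can be sketched in a few lines of NumPy (a minimal sketch; the weights, bias, and input below are made-up illustration values, not learned parameters):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single neuron: weighted sum of inputs followed by the activation."""
    z = np.dot(w, x) + b      # net input
    return sigmoid(z)         # activation

# Hypothetical input, weights, and bias, for illustration only
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
b = 0.1
print(neuron(x, w, b))        # a value strictly between 0 and 1
```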
A neural network is simply many such neurons combined in width and depth. Put plainly, the activation function is the function applied to the output of each neuron in the network. For example, in the following image:

All of the hidden-layer neurons (a) and the output-layer neurons (y) pass through an activation function. So why not the input layer (x)? Although the input layer, hidden layer, and output layer are all drawn as circles in diagrams like the one above, the units of the input layer are not neurons; they simply pass the raw inputs forward.
So which kinds of functions are generally chosen as activation functions in a neural network? Classically, smooth nonlinear functions such as the sigmoid function shown above, or the closely related hyperbolic tangent (tanh):
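The two classic choices can be compared directly (a small sketch; the evaluation grid is arbitrary):

```python
import numpy as np

def sigmoid(z):
    # Squashes z into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes z into (-1, 1); tanh is a shifted, rescaled sigmoid
    return np.tanh(z)

z = np.linspace(-5, 5, 11)
print(np.round(sigmoid(z), 3))
print(np.round(tanh(z), 3))
```

Both saturate for large |z|, which is what later motivates ReLU in deep networks.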

In addition, in deep neural networks the more common choice is the ReLU (rectified linear unit) function, which we introduce in the last section.

The role of the activation function

We continue with the 3-layer neural network model above. Let W(1) denote the first-layer weights (for example, the weight connecting input x1 to hidden neuron a1) and W(2) the second-layer weights (for example, the weight connecting hidden neuron a1 to output y1). If the activation function is removed from the model, the output of each neuron is just a weighted sum of its inputs:

a = W(1) x

Substituting this in gives the relationship between y and x:

y = W(2) a = W(2) (W(1) x)

so the final output is:

y = (W(2) W(1)) x

As can be seen, without an activation function, no matter how we train the parameters of the neural network, the result is a linear model: a line in two-dimensional space, a plane in three-dimensional space. Linear models are severely limited; consider, for example, the following problem:
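This collapse can be checked numerically (a quick sketch with random, made-up weight matrices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a 2-layer network with no activation function
W1 = rng.standard_normal((3, 2))   # input (2) -> hidden (3)
W2 = rng.standard_normal((1, 3))   # hidden (3) -> output (1)

x = rng.standard_normal(2)

# Layer-by-layer forward pass, no activations
y_deep = W2 @ (W1 @ x)

# The same map collapsed into a single linear layer
W = W2 @ W1
y_single = W @ x

# Matrix multiplication is associative, so the two outputs coincide
print(np.allclose(y_deep, y_single))
```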

A linear model can never separate the orange points from the blue points. Once we add an activation function, the same kind of network structure can solve this linearly non-separable problem. (Note that the network in the figure below is a different network from the one in the equations above.)
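The textbook linearly non-separable problem is XOR. With a nonlinear activation, even a tiny one-hidden-layer network can represent it, while no linear model can. The weights below are hand-picked for illustration, not learned:

```python
import numpy as np

def relu(z):
    # ReLU: 0 for negative inputs, identity for positive inputs
    return np.maximum(0, z)

def xor_net(x1, x2):
    """A tiny network with two ReLU hidden neurons and a linear output
    neuron, with hand-picked weights, that computes XOR exactly."""
    h1 = relu(x1 + x2)        # hidden neuron 1
    h2 = relu(x1 + x2 - 1)    # hidden neuron 2
    return h1 - 2 * h2        # linear output neuron

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))   # prints the XOR truth table
```

Removing the ReLUs here reduces the whole network to a linear function of (x1, x2), which cannot match XOR on all four points.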

So, to summarize: the role of the activation function in a neural network is to produce a nonlinear decision boundary by forming nonlinear combinations of the weighted inputs.

Activation functions in deep neural networks

In this last part we look at activation functions in deep neural networks. Their role is the same as in shallow networks, namely adding nonlinearity, but the ReLU (rectified linear unit) function is used instead, mainly to avoid the vanishing-gradient problem caused by the sigmoid function (this is not the focus of this article, so we do not elaborate on it). The following diagram shows the ReLU function:

You can see that it is a piecewise-linear function: f(x) = 0 for all x ≤ 0, and f(x) = x for all x > 0. The reason this function can serve as a neural network activation function is that, in a multidimensional space, any curved surface can be approximated by many flat pieces. The decision surface is such a surface, and a deep neural network relies on its complex structure and depth to fit the decision surface out of many planar pieces, finally achieving satisfactory results.
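A minimal sketch of ReLU, plus a hint of why piecewise-linear pieces are so expressive: sums of shifted ReLUs can build arbitrary piecewise-linear shapes (the "tent" function below is a made-up illustration of this idea):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = 0 for x <= 0, f(x) = x for x > 0."""
    return np.maximum(0.0, x)

def tent(x):
    # A sum of shifted ReLUs forms a piecewise-linear "tent":
    # it rises on [0, 1], falls on [1, 2], and is 0 elsewhere.
    return relu(x) - 2 * relu(x - 1) + relu(x - 2)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(relu(x))   # [0. 0. 0. 1. 2.]
print(tent(np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0])))
```

Stacking many such pieces across many dimensions is exactly the "multiple planar pieces fitting a decision surface" picture described above.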

