Activation Functions in Machine Learning

Source: Internet
Author: User

This article covers activation functions in machine learning. We hope it is helpful to readers who are learning the subject.

An activation function takes the output of a layer of a neural network as its input and transforms it. It is applied to the last layer of the network and also between successive layers.

So why are activation functions needed in a neural network?

For example, in logistic regression the output is converted to 0 or 1 for classification. In a neural network, an activation function is used in the same way to decide whether the output is yes or no, or to map the output onto a set of classes: in handwritten digit recognition, for instance, the output is mapped to one of the digits 0 to 9.

Activation functions fall into two general classes: linear and nonlinear.

Linear or identity activation functions

The output of a linear (identity) function, f(x) = x, is not confined to any range, so it does not serve the purpose described above.

Nonlinear activation functions

There are several properties of activation functions to understand:

Differentiability: when the optimization method is gradient-based, the derivative of the function is needed, so the function must be differentiable.

Monotonicity: when the activation function is monotonic, the error surface of a single-layer network is guaranteed to be convex.

Output value range: when the output of the activation function is bounded, gradient-based optimization is more stable, because feature representations significantly affect only a limited number of weights; when the output is unbounded, training is more efficient, but a smaller learning rate is generally required.
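The three properties above can be checked numerically. The following is a minimal sketch, assuming a plain-Python setting; the helper names (`numerical_derivative`, `is_monotonic_on_grid`) and the sample grid are illustrative choices, not part of the original article.

```python
import math

def identity(x):
    return x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def numerical_derivative(f, x, h=1e-5):
    # Central-difference approximation of the slope of f at x.
    return (f(x + h) - f(x - h)) / (2 * h)

def is_monotonic_on_grid(f, xs):
    # True if f never decreases along the sorted sample points xs.
    ys = [f(x) for x in xs]
    return all(b >= a for a, b in zip(ys, ys[1:]))

# Hypothetical sample grid from -10.0 to 10.0 in steps of 0.1.
xs = [i / 10 for i in range(-100, 101)]

for name, f in [("identity", identity), ("sigmoid", sigmoid)]:
    ys = [f(x) for x in xs]
    print(name,
          "min:", min(ys),
          "max:", max(ys),
          "monotonic:", is_monotonic_on_grid(f, xs),
          "slope at 0:", numerical_derivative(f, 0.0))
```

On this grid the identity output grows with the input, while the sigmoid output stays strictly inside (0, 1). A grid check like this only samples the function, so it suggests these properties rather than proving them.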

Here are a few common activation functions:

Sigmoid function

The output of the sigmoid function is always between 0 and 1, and it changes more slowly as it approaches 0 or 1. This makes it useful when a model predicts a probability. The function is differentiable, so its slope can be computed at any point. The function is monotonic, but its derivative is not. This activation function can cause a neural network to get stuck during training. Some of its disadvantages are as follows:

1. When the input is very large or very small, the gradient is close to 0. Therefore, when the initial weights are very large, the neuron's gradient vanishes and training becomes more difficult.

2. The mean of this function's output is not 0. Therefore, the neurons in the next layer receive the non-zero-mean output of the previous layer as their input signal. Because those inputs are always positive, the gradients with respect to a neuron's weights all share the same sign.
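The saturation described in the first point can be seen by printing the sigmoid and its derivative at a few inputs. This is a minimal sketch using the standard definition sigmoid(x) = 1 / (1 + e^(-x)) and its derivative sigmoid(x) * (1 - sigmoid(x)); the function names and sample inputs are illustrative.

```python
import math

def sigmoid(x):
    # Two branches so that math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s), largest at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.5f}  gradient={sigmoid_grad(x):.5f}")
```

The gradient peaks at 0.25 for an input of 0 and shrinks toward 0 as the input moves away from 0 in either direction, which is why large initial weights slow training. Every output is positive, which is the non-zero-mean behavior described in the second point.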

Tanh (hyperbolic tangent) activation function

Tanh is similar to sigmoid but generally works better. Its output is between -1 and 1 and, unlike sigmoid, is centered on 0. It is commonly used in binary classification problems.
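The zero-centered output can be illustrated by averaging both functions over a symmetric range of inputs. This is a rough sketch; the sample grid is an arbitrary choice, and the identity tanh(x) = 2 * sigmoid(2x) - 1 is included only to show how closely the two functions are related.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2, equal to 1 at x = 0.
    return 1.0 - math.tanh(x) ** 2

# Hypothetical symmetric sample grid from -5.0 to 5.0 in steps of 0.1.
xs = [i / 10 for i in range(-50, 51)]

mean_tanh = sum(math.tanh(x) for x in xs) / len(xs)
mean_sigmoid = sum(sigmoid(x) for x in xs) / len(xs)

print("mean tanh output:   ", mean_tanh)
print("mean sigmoid output:", mean_sigmoid)

# tanh is a rescaled and shifted sigmoid.
x = 0.7
print(math.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0)
```

Over inputs that are symmetric around 0, the tanh outputs average out to roughly 0, while the sigmoid outputs average to roughly 0.5.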

ReLU (rectified linear unit) activation function

ReLU is currently the most widely used activation function in neural networks, mainly in convolutional neural networks and deep neural networks. Its range is from 0 to infinity. Both the function and its derivative are monotonic. Some of its advantages are:

1. Convergence is much faster than with sigmoid and tanh.

2. Compared with sigmoid and tanh, ReLU is cheap to compute: only a threshold is needed to obtain the activation value.

ReLU also has drawbacks. For example, when a very large gradient flows through a ReLU neuron, the resulting parameter update can be so large that the neuron rarely or never activates on subsequent data.
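The threshold behavior and the drawback above can be sketched with a single neuron. The weights, bias values, and inputs below are made up for illustration, and the derivative at exactly 0 is set to 0 by convention (an assumption; the function is not differentiable there).

```python
def relu(x):
    # Only a threshold is needed: negative inputs become 0.
    return max(0.0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (0 at x == 0 by convention).
    return 1.0 if x > 0 else 0.0

# Hypothetical single neuron computing relu(w * x + b).
inputs = [0.5, 1.0, 1.5, 2.0]
w, b = 1.0, 0.0

# Before the update the neuron is active on every input.
active_before = [relu_grad(w * x + b) for x in inputs]

# Suppose one oversized update pushed the bias far into the negative range.
b_after = -10.0
active_after = [relu_grad(w * x + b_after) for x in inputs]

print("gradients before:", active_before)
print("gradients after: ", active_after)
```

After the oversized update, the pre-activation is negative for every input, so both the output and the gradient are 0 and later gradient steps cannot move the neuron back.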

Softmax activation function

Softmax is used for multi-class classification. It maps the outputs of multiple neurons into the interval (0, 1) so that they sum to 1 and can be interpreted as probabilities, which are then used to choose a class.
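A minimal sketch of softmax for a ten-class problem such as digit recognition follows. The scores in `logits` are invented for illustration, and subtracting the maximum before exponentiating is a common numerical-stability step that the article does not mention.

```python
import math

def softmax(logits):
    # Subtract the maximum so math.exp never overflows; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from ten output neurons, one per digit 0 to 9.
logits = [2.0, 1.0, 0.1, -1.0, 0.5, 0.0, 3.0, -2.0, 1.5, 0.2]

probs = softmax(logits)
predicted_digit = max(range(len(probs)), key=probs.__getitem__)

print("probabilities:", [round(p, 4) for p in probs])
print("sum:", sum(probs))
print("predicted digit:", predicted_digit)
```

The class with the largest raw score also gets the largest probability, so the prediction is the index of the maximum.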

Why does the activation function need to be differentiable, as mentioned above? When parameters are updated in gradient descent, the slope of the curve must be known, because the negative of the gradient is the direction of steepest descent. Computing that slope requires the derivative of the activation function.
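To make the role of the derivative concrete, here is a rough sketch of gradient descent on a single sigmoid neuron. The squared-error loss, the learning rate, the step count, and the single training example are all assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(x, y, w=0.0, b=0.0, lr=0.5, steps=200):
    # One neuron: y_hat = sigmoid(w * x + b), loss = 0.5 * (y_hat - y)^2.
    losses = []
    for _ in range(steps):
        z = w * x + b
        y_hat = sigmoid(z)
        losses.append(0.5 * (y_hat - y) ** 2)
        # Chain rule: dL/dz = (y_hat - y) * sigmoid'(z),
        # where sigmoid'(z) = y_hat * (1 - y_hat).
        dz = (y_hat - y) * y_hat * (1.0 - y_hat)
        # Step against the gradient to reduce the loss.
        w -= lr * dz * x
        b -= lr * dz
    return w, b, losses

# Hypothetical training example: input 1.0 with target 1.0.
w, b, losses = train(x=1.0, y=1.0)
print("first loss:", losses[0])
print("last loss: ", losses[-1])
```

The factor `y_hat * (1.0 - y_hat)` is the derivative of the activation function. Without it, the chain rule cannot carry the error back to `w` and `b`.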
