UFLDL Softmax Regression

Last Update:2016-05-06 Source: Internet

Author: User


Softmax regression is a generalization of logistic regression.
Logistic regression separates two classes, while Softmax regression can separate multiple classes.

1 Logistic regression

Before studying Softmax regression, let us first review logistic regression.
(See HTTP://BLOG.CSDN.NET/BEA_TREE/ARTICLE/DETAILS/50432411#T6)
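The hypothesis function of logistic regression is

$$h_\theta(x) = \frac{1}{1 + \exp(-\theta^T x)}$$

It is called regression, but it is used for classification: the curve fitted to the data is a logistic (sigmoid) curve, whose values are close to either 1 or 0.
Its objective is the likelihood, obtained by multiplying the probabilities of the individual samples:

$$L(\theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}} \left(1 - h_\theta(x^{(i)})\right)^{1 - y^{(i)}}$$

To simplify the computation we usually take the logarithm; the best fit is where the log-likelihood $\ell(\theta) = \log L(\theta)$ reaches its maximum.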
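The update rule obtained with gradient ascent on the log-likelihood, $\theta_j := \theta_j + \alpha \sum_{i=1}^{m} (y^{(i)} - h_\theta(x^{(i)})) x_j^{(i)}$, has the same form as the update rule of linear regression, which is a neat coincidence. Other algorithms, such as the perceptron learning algorithm, also have updates of this form.
The UFLDL tutorial instead negates the log-likelihood and scales it by $1/m$ to obtain a cost function, $J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$, and minimizes it directly.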
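The review above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the tutorial's own code: the function names, the toy data, the learning rate, and the number of iterations are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    """Average negative log-likelihood for labels y in {0, 1}."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def logistic_gradient_step(theta, X, y, alpha=0.1):
    """One batch gradient-descent step on the cost above."""
    h = sigmoid(X @ theta)
    grad = X.T @ (h - y) / len(y)  # same form as the linear-regression gradient
    return theta - alpha * grad

# Toy data (made up for illustration): a bias column plus one feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta = np.zeros(2)
initial_cost = logistic_cost(theta, X, y)
for _ in range(200):
    theta = logistic_gradient_step(theta, X, y)
final_cost = logistic_cost(theta, X, y)
```

Minimizing this cost by gradient descent is the same as maximizing the log-likelihood by gradient ascent; only the sign and the $1/m$ scaling differ.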

2 Softmax Regression

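As shown above, the main idea of logistic regression is to model probabilities: $h_\theta(x)$ is the probability that $y = 1$, and $1 - h_\theta(x)$ is the probability that $y = 0$.
The idea of Softmax is to model the probability of each of the $k$ classes separately. The hypothesis is as follows:

$$h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 \mid x^{(i)}; \theta) \\ p(y^{(i)} = 2 \mid x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k \mid x^{(i)}; \theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}} \begin{bmatrix} e^{\theta_1^T x^{(i)}} \\ e^{\theta_2^T x^{(i)}} \\ \vdots \\ e^{\theta_k^T x^{(i)}} \end{bmatrix}$$

It is easier to understand in the per-class form:

$$p(y^{(i)} = j \mid x^{(i)}; \theta) = \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}$$

The hypothesis is simply these $k$ formulas stacked into a vector: for an input $x$, it gives the probability of each of the $k$ classes.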
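The stacked form of the hypothesis maps naturally onto a matrix product. The sketch below assumes the $k$ parameter vectors are stored as the rows of a matrix `Theta` and the inputs as the rows of `X`; these names, the zero-based class indices, and the example numbers are choices made for illustration.

```python
import numpy as np

def softmax_probs(Theta, X):
    """Class probabilities for each input.

    Theta: (k, n) array, one parameter vector per class.
    X:     (m, n) array, one input per row.
    Returns an (m, k) array whose rows sum to 1.
    """
    scores = X @ Theta.T
    # Subtracting the row maximum avoids overflow in exp and
    # does not change the resulting probabilities.
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Illustrative parameters for k = 3 classes and n = 2 features.
Theta = np.array([[0.5, -1.0], [0.0, 0.0], [-0.5, 1.0]])
X = np.array([[1.0, 2.0], [1.0, -2.0]])
P = softmax_probs(Theta, X)
predicted = P.argmax(axis=1)  # most probable class for each input
```

Each row of `P` is one evaluation of the hypothesis, so taking the largest entry per row gives the predicted class.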
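Now that we have a model for the probability of each class, we can follow the maximum-likelihood approach to obtain the following cost function:

$$J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right]$$

Here $1\{\cdot\}$ is the indicator function, equal to 1 when its argument is true and 0 otherwise.
If the input $x^{(i)}$ belongs to class $j$, then the corresponding probability is $p(y^{(i)} = j \mid x^{(i)}; \theta) = e^{\theta_j^T x^{(i)}} / \sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}$.
Multiplying these probabilities over all $m$ samples gives the likelihood function. Taking the log turns the product into a sum, and minimizing the negated result is equivalent to maximizing the likelihood.
That is what the cost function above expresses.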
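The gradient used in the iterative update is as follows:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ x^{(i)} \left( 1\{y^{(i)} = j\} - p(y^{(i)} = j \mid x^{(i)}; \theta) \right) \right]$$

Each gradient-descent step then updates $\theta_j := \theta_j - \alpha \nabla_{\theta_j} J(\theta)$ for $j = 1, \ldots, k$.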
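As a rough sketch of how the cost and its gradient could be computed together, the following NumPy code evaluates both and runs plain gradient descent on a small made-up data set. The function names, the data, the learning rate, and the iteration count are assumptions for illustration, and class labels are zero-based here instead of running from 1 to $k$.

```python
import numpy as np

def softmax_probs(Theta, X):
    """(m, k) class probabilities; rows of Theta are the class parameters."""
    scores = X @ Theta.T
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def softmax_cost_and_grad(Theta, X, y):
    """Negative mean log-likelihood and its gradient with respect to Theta."""
    m = X.shape[0]
    P = softmax_probs(Theta, X)
    indicator = np.zeros_like(P)
    indicator[np.arange(m), y] = 1.0            # 1{y_i = j}
    cost = -np.sum(indicator * np.log(P)) / m
    grad = -(indicator - P).T @ X / m           # row j is the gradient for theta_j
    return cost, grad

# Toy data (made up): a bias column plus one feature, three classes.
X = np.array([[1.0, -2.0], [1.0, -1.5], [1.0, 0.0],
              [1.0, 0.2], [1.0, 1.5], [1.0, 2.0]])
y = np.array([0, 0, 1, 1, 2, 2])

Theta = np.zeros((3, 2))
initial_cost, _ = softmax_cost_and_grad(Theta, X, y)
for _ in range(100):
    _, grad = softmax_cost_and_grad(Theta, X, y)
    Theta = Theta - 0.5 * grad
final_cost, _ = softmax_cost_and_grad(Theta, X, y)
```

With all parameters at zero every class has probability $1/k$, so the initial cost equals $\log k$; that makes a convenient sanity check before training.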

3 Parameter Properties of Softmax

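Now that we know the principle, a question arises: if we know the probabilities of the first $k-1$ classes, do we still need to know the probability of class $k$? Obviously not, because the probabilities sum to 1. This can be taken as an intuitive explanation of why the Softmax parameters are redundant, or overparameterized (this is the blogger's own understanding; treat it with caution).
The explanation in the UFLDL tutorial is more rigorous:
Subtracting a fixed vector $\psi$ from every parameter vector $\theta_j$ leaves the probability unchanged:

$$\frac{e^{(\theta_j - \psi)^T x^{(i)}}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^T x^{(i)}}} = \frac{e^{\theta_j^T x^{(i)}} e^{-\psi^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}} e^{-\psi^T x^{(i)}}} = \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}$$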

In other words, if a set of parameters minimizes the cost function, then the set obtained by subtracting the same vector from every $\theta_j$ minimizes it as well, so there are infinitely many optimal solutions. As a result the Hessian matrix is singular (non-invertible), and Newton's method does not work well.
Based on this, we could set one of the parameter vectors (for example $\theta_k$) entirely to 0 to remove the redundancy. In practice we do not do this; instead we add a regularization term, which in this context is called weight decay.
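The redundancy can be checked numerically. The sketch below uses random parameters and inputs (all names and values are made up for illustration) to confirm that subtracting a vector from every parameter vector leaves the probabilities unchanged, and that choosing that vector equal to the last parameter vector zeroes it out.

```python
import numpy as np

def softmax_probs(Theta, X):
    """(m, k) class probabilities; rows of Theta are the class parameters."""
    scores = X @ Theta.T
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
Theta = rng.normal(size=(3, 4))   # k = 3 classes, n = 4 features
X = rng.normal(size=(5, 4))       # m = 5 inputs
psi = rng.normal(size=4)          # arbitrary shift vector

P_original = softmax_probs(Theta, X)
# Broadcasting subtracts psi from every row, i.e. from every theta_j.
P_shifted = softmax_probs(Theta - psi, X)

# Choosing psi equal to the last parameter vector fixes that vector at zero
# while describing exactly the same model.
Theta_reduced = Theta - Theta[-1]
P_reduced = softmax_probs(Theta_reduced, X)
```

`P_original`, `P_shifted`, and `P_reduced` agree up to floating-point error, which is why the unregularized cost has no unique minimizer.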

4 Weight Decay

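With the penalty term added, the cost function becomes:

$$J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2$$

For $\lambda > 0$ the second term is strictly convex, so the Hessian matrix is no longer singular and the cost function becomes strictly convex; standard optimization methods can then be used. Intuitively, every parameter now has an additional constraint pulling it toward zero, so there is only one optimal solution. The gradient used in the iteration becomes:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ x^{(i)} \left( 1\{y^{(i)} = j\} - p(y^{(i)} = j \mid x^{(i)}; \theta) \right) \right] + \lambda \theta_j$$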
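To illustrate the effect of the penalty, the sketch below adds a weight-decay term to a Softmax cost and compares two parameter sets that differ only by a constant shift. Without the penalty their costs are identical; with it they differ, so the shifted copies are no longer equally good. The function name, the random data, and the value of `lam` are assumptions for the example, and class labels are zero-based.

```python
import numpy as np

def softmax_probs(Theta, X):
    """(m, k) class probabilities; rows of Theta are the class parameters."""
    scores = X @ Theta.T
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def decayed_cost_and_grad(Theta, X, y, lam):
    """Softmax cost plus (lam / 2) * sum of squared parameters, and its gradient."""
    m = X.shape[0]
    P = softmax_probs(Theta, X)
    indicator = np.zeros_like(P)
    indicator[np.arange(m), y] = 1.0
    cost = -np.sum(indicator * np.log(P)) / m + 0.5 * lam * np.sum(Theta ** 2)
    grad = -(indicator - P).T @ X / m + lam * Theta
    return cost, grad

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
y = np.array([0, 1, 2, 0, 1, 2])
Theta = rng.normal(size=(3, 3))
psi = np.ones(3)  # arbitrary shift applied to every parameter vector

plain_a, _ = decayed_cost_and_grad(Theta, X, y, lam=0.0)
plain_b, _ = decayed_cost_and_grad(Theta - psi, X, y, lam=0.0)
decay_a, _ = decayed_cost_and_grad(Theta, X, y, lam=0.1)
decay_b, _ = decayed_cost_and_grad(Theta - psi, X, y, lam=0.1)
```

Here `plain_a` and `plain_b` match, while `decay_a` and `decay_b` do not, which is the numerical counterpart of the uniqueness argument.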

5 Softmax VS. Binary classifiers

When we have k classes, should we choose Softmax or k binary classifiers?
The answer: if the k classes are mutually exclusive, choose Softmax. If the classes overlap, as with men, women, children, and little girls, Softmax cannot be used and k binary classifiers should be used instead.
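The difference shows up directly in the outputs. In the sketch below, the same three scores (made-up numbers standing in for $\theta_j^T x$) are passed through Softmax and through three independent logistic functions.

```python
import numpy as np

# Hypothetical scores for three classes on a single input.
scores = np.array([2.0, 1.5, -1.0])

# Softmax: the classes compete, so the probabilities sum to 1.
exp_scores = np.exp(scores - scores.max())
softmax_out = exp_scores / exp_scores.sum()

# k independent binary classifiers: each probability is computed on its own,
# so several classes can be likely at the same time.
sigmoid_out = 1.0 / (1.0 + np.exp(-scores))

softmax_total = softmax_out.sum()
labels_assigned = int((sigmoid_out > 0.5).sum())
```

Softmax distributes a total probability of 1 across the classes, which suits mutually exclusive labels. The independent classifiers can mark more than one class as present, which suits overlapping labels.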


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
