Andrew Ng's machine learning course has a chapter devoted to logistic regression; the detailed course notes are in another article.
Here is a brief summary of logistic regression:
Given a sample x to be classified, using the logistic regression model to determine its class takes the following two steps:
① Compute the value hθ(x) of the logistic regression hypothesis function, where n is the feature dimension of the sample.
② If hθ(x) >= 0.5, then x belongs to the positive class; otherwise x belongs to the negative class.
Alternatively, use the decision boundary directly: if θ'x >= 0, then x belongs to the positive class; otherwise x belongs to the negative class.
Therefore, logistic regression by itself only solves binary classification problems.
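The two-step rule above can be sketched in Python/NumPy (the course itself uses MATLAB/Octave, so this is only an illustrative translation; the parameter values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    # Step 1: hypothesis h_theta(x) = sigmoid(theta' * x); x includes the intercept term.
    h = sigmoid(theta @ x)
    # Step 2: positive class iff h >= 0.5, which is equivalent to theta' * x >= 0.
    return 1 if h >= 0.5 else 0

theta = np.array([-1.0, 2.0])                 # hypothetical learned parameters
print(predict(theta, np.array([1.0, 1.0])))   # theta'x = 1 >= 0, so positive class
print(predict(theta, np.array([1.0, 0.0])))   # theta'x = -1 < 0, so negative class
```

Note that the two decision rules agree because sigmoid(z) >= 0.5 exactly when z >= 0.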
************************
A very important point here is that before classifying a new sample, the training samples must be used to solve for the parameter vector θ = [θ1, θ2, ..., θn] of the logistic regression model.
The optimal model parameters are those that minimize the cost function, so an optimization method is used to find the parameters at which the cost function attains its minimum:
(1) Give an initial value for the parameter θ.
(2) Use the function fminunc to optimize the cost function and obtain the optimal θ value.
This function requires the initial value of θ, a function that computes the cost and its gradient, and other settings.
(3) With the optimized θ in hand, for a sample to be classified, compute its hypothesis function value or evaluate the decision boundary to determine its class.
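fminunc is a MATLAB function; as an illustration of steps (1)-(3), here is a Python/NumPy sketch that substitutes plain gradient descent for fminunc on a toy, synthetically generated dataset (all data and hyperparameters here are assumptions for demonstration only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(theta, X, y):
    # Logistic regression cost J(theta) and its gradient
    # (the two quantities an optimizer like fminunc needs).
    m = X.shape[0]
    h = sigmoid(X @ theta)
    J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    grad = X.T @ (h - y) / m
    return J, grad

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=100)]   # intercept column + one feature
y = (X[:, 1] > 0).astype(float)                 # toy labels, separable at 0

theta = np.zeros(2)                             # (1) initial value of theta
for _ in range(500):                            # (2) minimize the cost (gradient descent)
    _, g = cost_and_grad(theta, X, y)
    theta -= 0.5 * g

pred = (sigmoid(X @ theta) >= 0.5).astype(float)  # (3) classify by h >= 0.5
print(np.mean(pred == y))
```

On this separable toy data the learned boundary sits near feature value 0, so training accuracy is close to 1.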
************************
The discussion now turns to the Softmax regression problem.
Step 0: Initialize parameters
① feature dimension of a sample, n (inputSize)
② number of sample classes, k (numClasses)
③ weight of the weight decay term (lambda)
Step 1: Load the MNIST dataset
① Load the MNIST images.
② Load the labels of the MNIST dataset.
In the label set, label 0 represents the digit 0; for convenience in later processing, the label of the digit 0 is changed to 10.
Note 1: In the experiment, for debugging convenience, synthetic data can be substituted while debugging.
Note 2: In the program folder, for tidy organization, the MNIST dataset and the functions that operate on it are stored in the mnist folder; before use, add the statement `addpath mnist/`.
Then randomly generate the initial value of the parameter θ.
Its dimension is k*n, where k is the number of classes (numClasses) and n is the feature dimension of the input sample (inputSize).
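The label remapping and parameter initialization above can be sketched in Python/NumPy (the original uses MATLAB; the 0.005 scaling follows the UFLDL starter code, and the sample labels here are made up for illustration):

```python
import numpy as np

input_size = 28 * 28        # n: MNIST images are 28x28 pixels
num_classes = 10            # k: digits 0..9
rng = np.random.default_rng(0)

# MNIST labels are 0..9; the post remaps digit 0 to label 10
# (convenient in MATLAB, where indexing is 1-based).
labels = np.array([0, 3, 7, 0, 9])
labels[labels == 0] = 10
print(labels)               # digit-0 entries become 10

# Random initial theta of dimension k*n (small Gaussian values).
theta = 0.005 * rng.normal(size=num_classes * input_size)
print(theta.shape)
```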
Step 2: Write the softmaxCost function
This function computes the cost function and the gradient of the cost function.
(1) Computing the cost function
① The cost function is computed as follows:
② The vectorized computation in the program is as follows:
(2) Computing the gradient
① The gradient is computed as follows:
② The vectorized computation in the program is as follows:
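As a concrete illustration of the vectorized computation, here is a Python/NumPy sketch of a softmaxCost-style function following the standard UFLDL definitions (indicator matrix 1{y(i)=j}, probability matrix, weight decay term λ); the original exercise implements the same quantities in MATLAB:

```python
import numpy as np

def softmax_cost(theta, num_classes, input_size, lam, X, y):
    # theta: flat (k*n,) parameters; X: (n, m) samples stored as columns;
    # y: (m,) labels in 0..k-1; lam: weight of the weight decay term.
    k, n, m = num_classes, input_size, X.shape[1]
    theta = theta.reshape(k, n)
    M = theta @ X                          # M[j, i] = theta_j' x(i)
    M -= M.max(axis=0, keepdims=True)      # subtract the column max to avoid overflow
    P = np.exp(M)
    P /= P.sum(axis=0, keepdims=True)      # P[j, i] = p(y(i) = j | x(i))
    G = np.zeros((k, m))
    G[y, np.arange(m)] = 1.0               # ground-truth indicator 1{y(i) = j}
    cost = -np.sum(G * np.log(P)) / m + 0.5 * lam * np.sum(theta ** 2)
    grad = -(G - P) @ X.T / m + lam * theta
    return cost, grad.ravel()

# Sanity check: with theta = 0 every class gets probability 1/k, so cost = log(k).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))
y = np.array([0, 1, 2, 0, 1])
cost, grad = softmax_cost(np.zeros(3 * 4), 3, 4, 0.0, X, y)
print(round(cost, 4))                      # log(3) is about 1.0986
```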
Step 3: Gradient checking
After softmaxCost is written for the first time, use the gradient checking algorithm to verify the correctness of the written softmaxCost function, by directly calling the checkNumericalGradient function.
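The check compares the analytic gradient with a central-difference approximation. A Python sketch of the same idea, using the simple quadratic J(θ) = θ0² + 3θ0θ1 at the point [4, 10] as in the UFLDL checkNumericalGradient example:

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    # Central differences: (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

J = lambda t: t[0] ** 2 + 3 * t[0] * t[1]
theta = np.array([4.0, 10.0])
analytic = np.array([2 * theta[0] + 3 * theta[1], 3 * theta[0]])  # hand-derived gradient
numeric = numerical_gradient(J, theta)
rel = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
print(rel)   # should be tiny (around 1e-9 or less) if the gradient code is correct
```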
Step 4: Learn the Softmax regression parameters using the training sample set.
Step 5: For a sample to be classified, use the trained Softmax regression model to classify it.
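For the classification step, only the largest θ_j'x matters: the softmax normalizer is shared by all classes, so the exponential need not be computed at all. A minimal Python/NumPy sketch with hypothetical learned parameters:

```python
import numpy as np

def softmax_predict(theta, X):
    # theta: (k, n) parameters; X: (n, m) samples as columns.
    # The predicted class is argmax_j theta_j' x; softmax itself is monotone
    # in theta_j' x, so taking exp and normalizing would not change the argmax.
    return np.argmax(theta @ X, axis=0)

theta = np.array([[1.0, 0.0],
                  [0.0, 1.0]])    # hypothetical learned parameters, k = 2, n = 2
X = np.array([[2.0, 0.1],
              [0.5, 3.0]])        # two samples stored as columns
print(softmax_predict(theta, X))  # first sample -> class 0, second -> class 1
```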
On the derivation of the vectorized formulas ******
(1) The form of the known quantities
① the form of the parameter θ
② the form of the input data x
③ the form of the product θx, denoted as matrix M
(2) Cost function of a single sample
① the cost function of a single sample
that is,
② the hypothesis function of a single sample
However, in practice, to avoid numerical overflow in the computation, the following adjustment is made to each hypothesis function:
where
Then:
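The adjustment amounts to subtracting a constant α = max_j θ_j'x from every exponent; multiplying numerator and denominator by e^(-α) leaves the hypothesis value unchanged but keeps the exponentials from overflowing. A minimal Python/NumPy demonstration (the original post works in MATLAB):

```python
import numpy as np

def softmax_stable(z):
    # softmax(z) == softmax(z - c) for any constant c, so subtract the max
    # before exponentiating; the largest exponent is then exactly 0.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

z = np.array([1000.0, 1001.0, 1002.0])   # naive np.exp(z) would overflow to inf
p = softmax_stable(z)
print(np.round(p, 4))                    # a valid probability vector summing to 1
```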
(3) Cost function of m samples
① the cost function of m samples
② the hypothesis function of m samples
where
(4) Computing the gradient
① gradient computation for a single sample
Note 1:
Note 2:
② gradient computation for m samples
Then:
Reference: Softmax Regression, UFLDL tutorial