Softmax Regression of UFLDL tutorial

Source: Internet
Author: User

Andrew Ng's machine learning course devotes a chapter to logistic regression; the detailed course notes are in another article.

Here is a simple summary of logistic regression:

Given a sample x to be classified, using the logistic regression model to determine its class requires the following two steps:

① Compute the value hθ(x) = 1/(1 + e^(-θ'x)) of the logistic regression hypothesis function, where n is the feature dimension of the sample

② If hθ(x) >= 0.5, then x belongs to the positive class; otherwise x belongs to the negative class

Alternatively, judge directly from the decision boundary: if θ'x >= 0, then x belongs to the positive class; otherwise x belongs to the negative class

Therefore, logistic regression by itself can only solve binary classification problems.
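The two prediction steps above can be sketched as follows (the tutorial's own code is MATLAB; this is a minimal NumPy equivalent with hypothetical names):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_predict(theta, x):
    """Classify one sample x with logistic regression parameters theta.

    Returns 1 (positive class) if h_theta(x) >= 0.5, else 0.
    Since sigmoid(z) >= 0.5 exactly when z >= 0, checking theta @ x >= 0
    (the decision boundary) gives the same answer.
    """
    h = sigmoid(theta @ x)          # hypothesis value h_theta(x)
    return 1 if h >= 0.5 else 0

theta = np.array([1.0, -2.0])
print(logistic_predict(theta, np.array([3.0, 1.0])))   # theta'x = 1 >= 0  -> 1
print(logistic_predict(theta, np.array([1.0, 2.0])))   # theta'x = -3 < 0 -> 0
```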


************************

A very important point here is that before classifying a new sample, the training samples must be used to solve for the parameter vector θ = [θ1, θ2, ..., θn] of the logistic regression model.

The optimal model parameters are those that minimize the cost function, so an optimization method is used to find the parameters at which the cost function attains its minimum:

(1) Give an initial value for the parameter θ

(2) Use the function fminunc to optimize the cost function and obtain the optimal θ

This function requires the initial value of θ, a function that computes the cost and its gradient, and other settings

(3) With the optimized θ, a sample to be classified can be assigned a class by computing its hypothesis function value or evaluating the decision boundary
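Steps (1)-(3) can be sketched in Python, with SciPy's minimize playing the role of fminunc (a minimal sketch on toy data; a small regularization term is added to keep the optimization bounded on separable data):

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam=0.1):
    """Logistic regression cost and gradient -- the two inputs fminunc needs."""
    m = X.shape[0]
    h = sigmoid(X @ theta)
    cost = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)) \
           + 0.5 * lam * theta @ theta
    grad = X.T @ (h - y) / m + lam * theta
    return cost, grad

# Toy 1-D data, bias feature in column 0.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta0 = np.zeros(2)                                 # (1) initial value of theta
res = minimize(logistic_cost, theta0, args=(X, y),   # (2) optimize the cost
               jac=True, method="L-BFGS-B")
preds = (sigmoid(X @ res.x) >= 0.5).astype(int)      # (3) classify via h >= 0.5
print(preds)  # [0 0 1 1]
```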

************************

The following turns to the Softmax regression problem.

Step 0: Initialize parameters

① Feature dimension of the samples, n (inputSize)

② Number of sample classes, k (numClasses)

③ Weight of the weight-decay term (lambda)

Step 1: Load the MNIST dataset

① Load the MNIST images

② Load the MNIST labels

In the label set, label 0 represents the digit 0; for convenience in later processing, the label of digit 0 is changed to 10

Note 1: In the experiment, for debugging convenience, synthetic data can sometimes be used during debugging;

Note 2: In the program folder, for tidy organization, the MNIST dataset and the functions that operate on it are stored in the mnist folder; before use, add the statement addpath mnist/

Randomly generate the initial value of the parameter θ

Its dimension is k*n, where k is the number of classes (numClasses) and n is the feature dimension of the input samples (inputSize)
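For instance, in NumPy (a sketch; the 0.005 scaling follows the small-random-values convention of the UFLDL starter code, and the MNIST sizes are k = 10 digits, n = 28*28 pixels):

```python
import numpy as np

num_classes, input_size = 10, 28 * 28    # k = 10 digit classes, n = 784 pixels
rng = np.random.default_rng(0)

# Small random initial theta, stored flat for the optimizer and reshaped to
# k x n when needed.
theta = 0.005 * rng.standard_normal(num_classes * input_size)
print(theta.reshape(num_classes, input_size).shape)  # (10, 784)
```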

Step 2: Write the softmaxCost function

This function computes the cost function and the gradient of the cost function.

(1) Calculating the cost function

① The cost function is calculated as follows:
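The equation image from the original post does not survive here; for reference, the standard UFLDL softmax cost with the weight-decay term is:

```latex
J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k}
    1\{y^{(i)}=j\}\,
    \log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}\right]
  + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=1}^{n}\theta_{ij}^2
```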

② The vectorized computation in the program is as follows:

(2) Calculate gradient

① The gradient is calculated as follows:
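The gradient formula image is likewise missing; the standard UFLDL expression, per class vector θ_j, is:

```latex
\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}
    \left[x^{(i)}\left(1\{y^{(i)}=j\}
      - p\!\left(y^{(i)}=j \mid x^{(i)};\theta\right)\right)\right]
  + \lambda\,\theta_j
```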

② The vectorized computation in the program is as follows:
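A sketch of the vectorized cost and gradient together in NumPy (function and argument names are hypothetical; the tutorial's own implementation is a MATLAB softmaxCost.m):

```python
import numpy as np

def softmax_cost(theta, num_classes, input_size, lam, data, labels):
    """Vectorized softmax cost and gradient.

    theta  : flat (num_classes * input_size,) parameter vector
    data   : (input_size, m) matrix, one sample per column
    labels : (m,) integer class labels in 1..num_classes
    """
    m = data.shape[1]
    theta = theta.reshape(num_classes, input_size)
    M = theta @ data                               # k x m score matrix
    M -= M.max(axis=0, keepdims=True)              # overflow protection
    P = np.exp(M) / np.exp(M).sum(axis=0, keepdims=True)  # k x m probabilities

    # Ground-truth indicator matrix: G[j, i] = 1{labels[i] == j+1}
    G = np.zeros((num_classes, m))
    G[labels - 1, np.arange(m)] = 1.0

    cost = -np.sum(G * np.log(P)) / m + 0.5 * lam * np.sum(theta ** 2)
    grad = -(G - P) @ data.T / m + lam * theta
    return cost, grad.ravel()

# Tiny check: with theta = 0 the predicted distribution is uniform over the
# k classes, so the unregularized cost is log(k).
data = np.array([[1.0, 2.0], [3.0, 4.0]])   # n = 2 features, m = 2 samples
labels = np.array([1, 2])
c, g = softmax_cost(np.zeros(4), 2, 2, 0.0, data, labels)
print(c)   # log(2) ~ 0.6931
```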

Step 3: Gradient checking

After the softmaxCost function is first written, a gradient-checking algorithm is used to verify its correctness: directly call the checkNumericalGradient function.
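The check compares the analytic gradient against central differences; a minimal NumPy sketch (the simple quadratic used here is the kind of known-gradient test function checkNumericalGradient exercises):

```python
import numpy as np

def compute_numerical_gradient(J, theta, eps=1e-4):
    """Central-difference approximation of the gradient of J at theta."""
    numgrad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        numgrad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return numgrad

# Known-gradient test: J(t) = t0^2 + 3*t0*t1, grad = [2*t0 + 3*t1, 3*t0].
J = lambda t: t[0] ** 2 + 3 * t[0] * t[1]
theta = np.array([4.0, 10.0])
numgrad = compute_numerical_gradient(J, theta)
grad = np.array([2 * theta[0] + 3 * theta[1], 3 * theta[0]])
print(np.linalg.norm(numgrad - grad))  # should be tiny (near machine precision)
```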

Step 4: Learn the Softmax regression parameters from the training sample set

Step 5: For a sample to be classified, use the trained Softmax regression model to classify it.
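Classification reduces to an argmax over the score matrix, since softmax is monotone in the scores; a minimal NumPy sketch:

```python
import numpy as np

def softmax_predict(theta, data):
    """Predict class labels (1..k) for samples given as columns of data."""
    scores = theta @ data             # k x m scores; the softmax probability is
    return scores.argmax(axis=0) + 1  # monotone in the score, so argmax suffices

theta = np.array([[1.0, 0.0],
                  [0.0, 1.0]])        # 2 classes, 2 features
data = np.array([[3.0, 0.0],
                 [1.0, 5.0]])         # two samples, one per column
print(softmax_predict(theta, data))   # [1 2]
```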

****** On the derivation of the vectorization formulas ******

(1) Forms of the known quantities

① The form of the parameter θ

② The form of the input data x

③ The form of the product θx, denoted as the matrix M
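The original figures for these forms are missing; reconstructed from the UFLDL conventions (θ stored row-per-class, samples stored column-per-sample):

```latex
\theta =
  \begin{bmatrix} \theta_1^T \\ \vdots \\ \theta_k^T \end{bmatrix}
  \in \mathbb{R}^{k \times n},
\qquad
x = \left[\, x^{(1)}\ x^{(2)}\ \cdots\ x^{(m)} \,\right] \in \mathbb{R}^{n \times m},
\qquad
M = \theta x \in \mathbb{R}^{k \times m},
\quad
M_{ji} = \theta_j^T x^{(i)}
```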

(2) Cost function for a single sample

① Cost function for a single sample

That is:

② Hypothesis function for a single sample

However, in practice, to avoid numerical overflow in the computation, each hypothesis function value needs the following adjustment.

where

Then:
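The equation images for the hypothesis and the overflow adjustment are missing; the standard forms are the softmax hypothesis and the max-subtraction trick, which leaves the probabilities unchanged:

```latex
h_\theta(x^{(i)}) =
  \frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}}
  \begin{bmatrix} e^{\theta_1^T x^{(i)}} \\ \vdots \\ e^{\theta_k^T x^{(i)}} \end{bmatrix},
\qquad
\alpha = \max_{j} \theta_j^T x^{(i)},
\qquad
\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}
  = \frac{e^{\theta_j^T x^{(i)} - \alpha}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)} - \alpha}}
```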

(3) Cost function for m samples

① Cost function for m samples

② Hypothesis function for m samples

where

(4) Calculating the gradient

① Gradient calculation for a single sample

Note 1:

Note 2:

② Gradient calculation for m samples

Then:
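The final vectorized result (reconstructed; the original image is missing) stacks the per-class gradients into one matrix equation, where G is the ground-truth indicator matrix and P the matrix of predicted probabilities:

```latex
\nabla_\theta J(\theta) = -\frac{1}{m}\,(G - P)\,x^T + \lambda\,\theta,
\qquad
G_{ji} = 1\{y^{(i)} = j\},
\qquad
P_{ji} = p\!\left(y^{(i)} = j \mid x^{(i)}; \theta\right)
```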
