Analysis of the Logistic Regression Model


This article has two parts: it first introduces the simplest linear regression model, and then analyzes logistic regression.

1. Linear regression (least squares)

Linear regression problems can be divided, according to the number of independent variables, into one-dimensional and multivariate linear regression. This part first introduces the one-dimensional model in detail and then generalizes it to the multivariate case.

1) One-dimensional linear model

When the input has only one independent variable, we call it a one-dimensional linear model; this is the simplest case.

Let the sample set be (x_i, y_i), i = 1, 2, ..., m.

The goal is to find a line in the plane such that the sample points lie as close to it as possible.

The one-dimensional linear model is h(x) = ax + b, and the error on a single sample is s_i = y_i − h(x_i).

The overall loss over the sample is:

L(a, b) = Σ_{i=1}^{m} s_i² = Σ_{i=1}^{m} (y_i − a·x_i − b)²

To minimize this overall loss we use the least squares method. Since the loss function is convex, any local minimum is also the global minimum.

First, take the partial derivatives with respect to a and b and set them to zero:

∂L/∂a = −2 Σ_{i=1}^{m} x_i (y_i − a·x_i − b) = 0
∂L/∂b = −2 Σ_{i=1}^{m} (y_i − a·x_i − b) = 0

Rearranging gives:

a = (m Σ x_i y_i − Σ x_i Σ y_i) / (m Σ x_i² − (Σ x_i)²)
b = ȳ − a·x̄

Two equations in two unknowns, so a and b can be solved for directly.

The detailed calculation of parameters a and b is given in reference [2].
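The closed-form solution above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the sample data below is made up, chosen to lie exactly on y = 2x + 1 so the fit is easy to check.

```python
# Closed-form least squares for the one-dimensional model h(x) = a*x + b.

def fit_line(xs, ys):
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # a = sum((x_i - mean_x)(y_i - mean_y)) / sum((x_i - mean_x)^2)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    a = num / den
    # b = mean_y - a * mean_x
    b = mean_y - a * mean_x
    return a, b

# Illustrative points lying exactly on y = 2x + 1, so the fit recovers a=2, b=1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```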

2) Multivariate linear model

When the input has multiple independent variables, we call it a multivariate linear model.

Let the multivariate linear model be h(x) = a0 + a1·x1 + a2·x2 + ... + an·xn.

The error on a single sample is s_i = y_i − h(x_i).

The overall error is:

L(a0, a1, ..., an) = Σ_{i=1}^{m} s_i² = Σ_{i=1}^{m} (y_i − h(x_i))²

Take the partial derivative with respect to each parameter and set it to zero:

∂L/∂a_j = −2 Σ_{i=1}^{m} x_{ij} (y_i − h(x_i)) = 0, for j = 0, 1, ..., n (with x_{i0} = 1)

This gives n + 1 equations in the n + 1 unknowns a0, a1, ..., an, from which each coefficient can be solved.

The computational process for the coefficients a_j is described in reference [2].
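The system of equations above can be written in matrix form and solved directly, which is how it is usually done in practice. The following sketch uses NumPy's linear solver; the data is synthetic, generated from y = 1 + 2·x1 + 3·x2 with no noise so the recovered coefficients are exact.

```python
import numpy as np

# Normal-equation solution for h(x) = a0 + a1*x1 + ... + an*xn.

def fit_multivariate(X, y):
    # Prepend a column of ones so a0 acts as the intercept.
    Xb = np.column_stack([np.ones(len(X)), X])
    # Solve (Xb^T Xb) a = Xb^T y, the system obtained by setting
    # every partial derivative of the squared error to zero.
    return np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Synthetic data from y = 1 + 2*x1 + 3*x2 (no noise), so the fit is exact.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]
print(fit_multivariate(X, y))  # [1. 2. 3.]
```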

2. Logistic regression

Logistic regression can be divided into binary classification and multiclass classification; both are discussed below.

1) Binary logistic regression (two-class classification)

Suppose we need to classify items into two classes. As with the multivariate linear model discussed above, we start from a linear combination of the item's features. The difference is that we no longer want to fit the points in the plane (or space); instead, we want to separate the different categories of points in the plane (or space).

The multivariate linear model is h(x) = a0 + a1·x1 + a2·x2 + ... + an·xn.

We could classify items directly with the multivariate linear model by choosing a threshold, assigning every sample with h(x) greater than the threshold to one class and the rest to the other. The problem is that h(x) can take values of arbitrary size, which makes choosing the threshold difficult. If we normalize h(x) first, choosing the threshold becomes much simpler.

Let the threshold be t; then a sample is assigned to class 1 if h(x) ≥ t, and to class 0 otherwise.

For ease of notation, write the parameters and features as vectors, a = (a0, a1, ..., an) and x = (1, x1, ..., xn), so that h(x) = a·x.

Here we use the sigmoid function to normalize it:

π(x) = g(a·x), where g(z) = 1 / (1 + e^(−z))
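A quick sketch of how the sigmoid squashes an unbounded linear score into (0, 1), which is what makes the fixed threshold workable:

```python
import math

# The sigmoid maps any real score z = a.x into the interval (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5, the natural threshold point
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0
```

However large or small the raw score h(x), the normalized value stays in (0, 1), so a threshold such as 0.5 works for any data scale.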

If we estimated the parameters with the squared-error loss at this point, the loss would be non-convex after this normalization, so gradient descent would not be guaranteed to find its global minimum. Instead, we estimate the model parameters by maximum likelihood.

Because this is binary classification, we can set:

P(y = 1 | x) = π(x), P(y = 0 | x) = 1 − π(x)

So the likelihood function is:

∏_{i=1}^{m} π(x_i)^{y_i} · (1 − π(x_i))^{1 − y_i}

and the log-likelihood function is:

L(a) = Σ_{i=1}^{m} [ y_i·log π(x_i) + (1 − y_i)·log(1 − π(x_i)) ]

Maximizing L(a) yields an estimate of a. To be able to use the gradient descent algorithm, we put a minus sign in front of the log-likelihood and minimize instead:

J(a) = −L(a)

At each step, the parameter a moves a small step in the negative gradient direction of J(a).

The derivation is simple and instructive; see reference [2].

Finally, the value of a is updated by:

a_j := a_j − α Σ_{i=1}^{m} (π(x_i) − y_i)·x_{ij}

where α is the learning rate.
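The whole procedure, sigmoid, negative log-likelihood gradient, and the update rule, can be sketched as follows. The learning rate, iteration count, and toy one-dimensional data are my own illustrative choices, not values from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient descent on J(a), the negative log-likelihood.
def fit_logistic(X, y, lr=0.1, steps=5000):
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    a = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = sigmoid(Xb @ a)      # pi(x_i) for every sample
        grad = Xb.T @ (p - y)    # gradient of J(a): sum of (pi(x_i) - y_i) * x_ij
        a -= lr * grad           # step in the negative gradient direction
    return a

# Toy 1-D data: class 0 below x = 2.5, class 1 above.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
a = fit_logistic(X, y)
probs = sigmoid(np.column_stack([np.ones(len(X)), X]) @ a)
preds = (probs >= 0.5).astype(int)
print(preds)  # [0 0 0 1 1 1]
```

Thresholding the normalized output at 0.5 corresponds to thresholding the raw score a·x at 0.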

2) Multiclass logistic regression

Binary logistic regression was described above; the multiclass case is similar. If there are K classes, the probabilities of the first K − 1 classes are computed with the following formula:

P(y = k | x) = e^(a_k·x) / (1 + Σ_{j=1}^{K−1} e^(a_j·x)), k = 1, 2, ..., K − 1

For class K:

P(y = K | x) = 1 / (1 + Σ_{j=1}^{K−1} e^(a_j·x))
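The two formulas above can be evaluated together as one probability vector. In this sketch the weight matrix holds one row a_k per non-reference class; the weight values and feature vector are arbitrary illustrative numbers.

```python
import numpy as np

# Class probabilities for K-class logistic regression:
# classes 1..K-1 use exp(a_k . x) / (1 + sum_j exp(a_j . x)),
# and class K gets the remaining mass 1 / (1 + sum_j exp(a_j . x)).
def multiclass_probs(A, x):
    """A: (K-1, d) weight matrix, x: (d,) features -> (K,) probabilities."""
    scores = np.exp(A @ x)        # exp(a_k . x) for k = 1..K-1
    denom = 1.0 + scores.sum()
    return np.append(scores / denom, 1.0 / denom)

A = np.array([[0.5, -1.0],        # K = 3 classes, 2 features; values are
              [1.0,  2.0]])       # arbitrary, chosen only for illustration
p = multiclass_probs(A, np.array([1.0, 1.0]))
print(p, p.sum())  # three probabilities summing to 1
```

Because class K is the common reference, the K probabilities always sum to exactly 1.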

Summary

Logistic regression can handle not only regression problems but also binary and multiclass classification. The main procedure is to take a linear weighting of the inputs and then normalize the result into the interval (0, 1), using the sigmoid function for the normalization. For a regression problem, the normalized value can be output directly; for a classification problem, the binary or multiclass methods described above apply.

References:

[1] peghoty, http://blog.csdn.net/itplus/article/details/10857843

[2] Li Hang, Statistical Learning Methods.
