A Summary of Logistic Regression and Generalized Linear Models

Source: Internet
Author: User

In the classic linear model, the linear predictor of the independent variables is itself the estimate of the dependent variable. In a generalized linear model, a function of the linear predictor is the estimate of the dependent variable. Common generalized linear models include the probit model, the Poisson model, and log-linear models. Log-linear models include logistic regression and the maximum entropy model. This article summarizes logistic regression and shows how it is derived from the generalized linear model. The next article will summarize the maximum entropy model. The outline of this article is as follows:

1. Logistic Distribution and Logit Transformation

2. Why use logistic regression instead of the traditional linear regression model for binary classification?

3. How does one solve the logistic regression model?

4. Experience in data preprocessing for logistic regression in practice. My experience is limited; if anyone has more experience in this area, I would be grateful if you shared it.

5. Why the objective function of the classic linear model is least squares, while that of logistic regression is the likelihood function

6. Deriving the classic linear model and logistic regression from the more fundamental generalized linear model

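1. Logistic Distribution and Logit Transformation

Let X be a continuous random variable. X follows the logistic distribution if its distribution function has the following form, where μ is the location parameter and γ > 0 is the scale parameter:

$$F(x) = P(X \le x) = \frac{1}{1 + e^{-(x - \mu)/\gamma}}$$

The value of F(x) lies between 0 and 1, and its graph is an S-shaped curve.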
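As a quick numerical check, the sketch below evaluates the standard logistic distribution function F(x) = 1 / (1 + exp(-(x - mu) / gamma)) at a few points. The names `logistic_cdf`, `mu`, and `gamma` are illustrative choices and do not come from the original article.

```python
import math

def logistic_cdf(x, mu=0.0, gamma=1.0):
    """Distribution function of the logistic distribution.

    mu is the location parameter, gamma > 0 the scale parameter
    (both names are assumptions made for this sketch).
    """
    return 1.0 / (1.0 + math.exp(-(x - mu) / gamma))

# The values rise monotonically from near 0 to near 1 (an S-curve),
# and the curve is symmetric around x = mu, where F equals 0.5.
for x in (-6, -2, 0, 2, 6):
    print(x, round(logistic_cdf(x), 4))
```

A smaller `gamma` makes the curve steeper around `mu`; a larger one flattens it.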

2. Why use logistic regression instead of the traditional linear regression model for binary classification?

When linear regression is applied to binary classification, the first form that comes to mind is the following, where p is the probability that a sample belongs to the positive class:

$$p = \theta^T x$$

However, this form has the following problems:

1) The two sides of the equation have different ranges. The right-hand side ranges from negative infinity to positive infinity, while the left-hand side is confined to [0, 1], so the model is ill-posed as a classifier.

2) In many real problems, a change in x has little effect on p when x is very small or very large, but a large effect once x reaches some threshold in the middle. In other words, the relationship between the probability p and the independent variables is often not a straight line.

Therefore, the model above needs to be corrected. One method statisticians found is to transform the dependent variable with the logit function:

$$\mathrm{logit}(p) = \ln\frac{p}{1 - p} = \theta^T x$$

Thus,

$$p = \frac{1}{1 + e^{-\theta^T x}}$$

Here p resolves both problems above: it always lies in (0, 1), and it changes slowly when θ^T x is very small or very large and quickly in the middle.
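The relationship between the logit function and its inverse can be sketched in a few lines. This is only an illustration: `logit` and `sigmoid` are names chosen here, and `z` stands for the value of the linear predictor.

```python
import math

def logit(p):
    """Log-odds of a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# z may be any real number, yet p always stays inside (0, 1),
# and applying logit to p recovers z.
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    p = sigmoid(z)
    print(z, round(p, 5), round(logit(p), 5))
```

Note how p barely moves between z = 10 and larger values, but changes quickly around z = 0, which matches the threshold behaviour described in problem 2).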

3. How does one solve the logistic regression model?

1) Solution

In logistic regression, Y follows a binomial (Bernoulli) distribution, so the error term is not Gaussian. Least squares is therefore not appropriate for estimating the model parameters; maximum likelihood estimation is used instead.

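2) Likelihood function and objective function

More precisely, the model is:

$$P(y = 1 \mid x; \theta) = h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}, \qquad P(y = 0 \mid x; \theta) = 1 - h_\theta(x)$$

For m independent training samples (x^{(i)}, y^{(i)}), the likelihood function is:

$$L(\theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}} \left(1 - h_\theta(x^{(i)})\right)^{1 - y^{(i)}}$$

The log-likelihood function, which is the objective function to optimize, is:

$$\ell(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

The entire logistic regression problem thus becomes an optimization problem: find the θ that maximizes the log-likelihood. Gradient methods (gradient ascent on ℓ(θ), or equivalently gradient descent on -ℓ(θ)) and quasi-Newton methods can be used.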
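As a rough sketch of the optimization step, the code below maximizes the log-likelihood with plain batch gradient ascent, using the gradient ∂ℓ/∂θ_j = Σ_i (y_i - h(x_i)) x_ij. The function names, the learning rate `lr`, the iteration count, and the toy data are all assumptions made for illustration; the article itself does not prescribe a particular implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(theta, X, y):
    """Sum over samples of y*log(h) + (1 - y)*log(1 - h)."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total

def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Batch gradient ascent on the log-likelihood, starting from theta = 0."""
    theta = [0.0] * len(X[0])
    for _ in range(n_iters):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(t * v for t, v in zip(theta, xi)))
            for j, v in enumerate(xi):
                grad[j] += err * v
        # Ascent step: move along the gradient to increase the log-likelihood.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

# Made-up toy data; the first column is a constant 1 for the intercept.
# The classes overlap, so the maximum likelihood estimate is finite.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

theta = fit_logistic(X, y)
print(theta, log_likelihood(theta, X, y))
```

On perfectly separable data the likelihood has no finite maximizer and the parameters keep growing, so a real implementation would normally add regularization or a stopping rule.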

4. Experience in data preprocessing for logistic regression in practice. My experience is limited; if anyone has more experience in this area, I would be grateful if you shared it.

1) Categorical (enumerated) features: convert directly to binary indicators.

2) Numeric features: standardize, or binarize based on their distribution.

3) Pairwise feature combinations.
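The first two preprocessing steps can be sketched with the standard library alone. This is one possible reading of the list above, not the author's own procedure: the helper names, the made-up columns, and the choice of the median as the binarization threshold are assumptions.

```python
import statistics

def one_hot(values):
    """Binarize a categorical column: one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories

def standardize(values):
    """Rescale a numeric column to zero mean and unit (population) std deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # would be 0 for a constant column
    return [(v - mean) / std for v in values]

def binarize(values, threshold):
    """Turn a numeric column into a 0/1 feature by thresholding."""
    return [1 if v > threshold else 0 for v in values]

colors = ["red", "green", "red", "blue"]  # made-up categorical feature
ages = [20.0, 30.0, 40.0, 50.0]           # made-up numeric feature

encoded, categories = one_hot(colors)
print(categories, encoded)
print(standardize(ages))
print(binarize(ages, statistics.median(ages)))
```

Standardizing puts numeric features on a comparable scale, which tends to help gradient-based solvers converge.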

5. Why the objective function of the classic linear model is least squares, while that of logistic regression is the likelihood function

The formulas below are taken directly from Andrew Ng's lecture notes [1].

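1) Classic linear model. The following equation holds:

$$y^{(i)} = \theta^T x^{(i)} + \epsilon^{(i)}$$

The assumption here is that the error terms are independent and identically distributed according to a normal distribution with mean 0, that is:

$$\epsilon^{(i)} \sim \mathcal{N}(0, \sigma^2), \qquad p(\epsilon^{(i)}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(\epsilon^{(i)})^2}{2\sigma^2}\right)$$

Thus:

$$p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y^{(i)} - \theta^T x^{(i)})^2}{2\sigma^2}\right)$$

Under this assumption, the likelihood function is:

$$L(\theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y^{(i)} - \theta^T x^{(i)})^2}{2\sigma^2}\right)$$

The linear regression problem can therefore be solved by maximizing the log-likelihood below. Because the first term is a constant, this is equivalent to minimizing the sum in the last term, that is, least squares (the LMS objective):

$$\ell(\theta) = m \log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{\sigma^2} \cdot \frac{1}{2} \sum_{i=1}^{m} \left(y^{(i)} - \theta^T x^{(i)}\right)^2$$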
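The equivalence between maximizing the Gaussian log-likelihood and minimizing the sum of squared errors can be checked numerically. The sketch below assumes the model y = θ^T x + ε with ε ~ N(0, σ²); the function names, the toy data, and the two candidate parameter vectors are invented for illustration.

```python
import math

def sse(theta, X, y):
    """Sum of squared errors between y and the linear prediction theta^T x."""
    return sum((yi - sum(t * v for t, v in zip(theta, xi))) ** 2
               for xi, yi in zip(X, y))

def gaussian_log_likelihood(theta, X, y, sigma=1.0):
    """m * log(1 / (sqrt(2*pi) * sigma)) - SSE / (2 * sigma^2)."""
    m = len(y)
    const = m * math.log(1.0 / (math.sqrt(2.0 * math.pi) * sigma))
    return const - sse(theta, X, y) / (2.0 * sigma ** 2)

# Made-up data; the first column is a constant 1 for the intercept.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.1, 1.9, 3.2]

theta_a = [0.0, 1.0]  # close fit
theta_b = [1.0, 0.5]  # worse fit

for theta in (theta_a, theta_b):
    print(theta, sse(theta, X, y), gaussian_log_likelihood(theta, X, y))
```

The candidate with the smaller squared error always has the larger log-likelihood, because the two quantities differ only by a constant and a negative scale factor.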

2) Logistic regression. Here the dependent variable y is no longer continuous but binary, taking values in {0, 1}. The logit transformation maps the continuous value θ^T x into the interval (0, 1), which lets a generalized linear regression be used for classification. This model also rests on an assumption (corresponding to the i.i.d. normal error terms in the regression problem above), where h_θ(x) is the logistic function:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}, \qquad P(y = 1 \mid x; \theta) = h_\theta(x), \qquad P(y = 0 \mid x; \theta) = 1 - h_\theta(x)$$

That is, given x and the parameters, y follows a Bernoulli distribution, whereas in the regression problem above y follows a normal distribution. Thus:

$$p(y \mid x; \theta) = h_\theta(x)^{y} \left(1 - h_\theta(x)\right)^{1 - y}$$

$$\ell(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

Different assumptions correspond to different problems (one is classification, the other regression), and this is why the objective function of logistic regression is the log-likelihood above rather than the least mean squares (LMS) error of the regression problem.

6. Deriving the classic linear model and logistic regression from the more fundamental generalized linear model

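1) Exponential family

A distribution belongs to the exponential family if it can be written in the following form, where η is the natural parameter, T(y) is the sufficient statistic, and a(η) is the log-partition function:

$$p(y; \eta) = b(y) \exp\left(\eta^T T(y) - a(\eta)\right)$$

With T fixed, the functions a and b determine which member of the exponential family is obtained. The Bernoulli distribution, which corresponds to the logistic regression problem, can be written as:

$$p(y; \phi) = \phi^{y} (1 - \phi)^{1 - y} = \exp\left(y \log\frac{\phi}{1 - \phi} + \log(1 - \phi)\right)$$

so that η = log(φ / (1 - φ)), T(y) = y, a(η) = -log(1 - φ) = log(1 + e^η), and b(y) = 1.

Note: inverting the expression for η gives φ = 1 / (1 + e^{-η}). This is the sigmoid function used in logistic regression, and it is used below when deriving logistic regression from the GLM.

The Gaussian distribution, which corresponds to the classic linear regression problem, can be written as follows (taking σ² = 1, since σ² does not affect the choice of θ):

$$p(y; \mu) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(y - \mu)^2\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} y^2\right) \exp\left(\mu y - \frac{1}{2}\mu^2\right)$$

so that η = μ, T(y) = y, a(η) = η²/2, and b(y) = (1/√(2π)) exp(-y²/2).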
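Both exponential family forms can be verified numerically against the general shape p(y; η) = b(y) exp(η T(y) - a(η)) with T(y) = y. The sketch below follows the parameterization in the cited lecture notes [1] (unit variance for the Gaussian case); the function names are illustrative assumptions.

```python
import math

def bernoulli_pmf(y, phi):
    return phi ** y * (1.0 - phi) ** (1 - y)

def bernoulli_exp_family(y, phi):
    eta = math.log(phi / (1.0 - phi))  # natural parameter
    a = math.log(1.0 + math.exp(eta))  # log-partition, equals -log(1 - phi)
    b = 1.0                            # base measure b(y)
    return b * math.exp(eta * y - a)

def gaussian_pdf(y, mu):
    """Normal density with mean mu and variance 1."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)

def gaussian_exp_family(y, mu):
    eta = mu                           # natural parameter
    a = 0.5 * eta ** 2                 # log-partition
    b = math.exp(-0.5 * y ** 2) / math.sqrt(2.0 * math.pi)
    return b * math.exp(eta * y - a)

# Each pair of numbers printed on a line should agree up to rounding error.
for phi in (0.1, 0.5, 0.9):
    for y in (0, 1):
        print(phi, y, bernoulli_pmf(y, phi), bernoulli_exp_family(y, phi))
for mu, y in ((0.0, 0.3), (1.5, -0.7)):
    print(mu, y, gaussian_pdf(y, mu), gaussian_exp_family(y, mu))
```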

2) GLM (Generalized Linear Model)

Problems involving exponential family distributions can be modeled with a generalized linear model. How is a GLM built? Given x and the parameters, the conditional distribution P(y | x; θ) must satisfy the following three assumptions:

Assumption 1) y | x; θ ∼ ExponentialFamily(η).

Assumption 2) h(x) = E[T(y) | x]. Given x, the goal is to predict the expectation of T(y). In most cases T(y) = y, so h(x) = E[y | x].

Assumption 3) η = θ^T x, that is, η and x are linearly related.

3) Classic linear regression and logistic regression

Classic linear regression: the predicted value y is continuous. Given x and the parameters, y follows a Gaussian distribution (the first GLM assumption). From the relationship between the Gaussian distribution and the exponential family above, the model can be expressed as:

$$h_\theta(x) = E[y \mid x; \theta] = \mu = \eta = \theta^T x$$

Logistic regression: taking binary classification as an example, the predicted value y is binary, taking values in {0, 1}. Given x and the parameters, y follows a Bernoulli distribution (the first GLM assumption). From the relationship between the Bernoulli distribution and the exponential family above, the model can be expressed as:

$$h_\theta(x) = E[y \mid x; \theta] = \phi = \frac{1}{1 + e^{-\eta}} = \frac{1}{1 + e^{-\theta^T x}}$$

From the GLM perspective, we can see why the logistic regression formula takes this form.
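To make the GLM view concrete, here is a minimal sketch in which the same linear predictor η = θ^T x is passed through a family-specific mean function: the identity for the Gaussian case and the sigmoid for the Bernoulli case. The names `RESPONSE` and `glm_predict`, and the example numbers, are hypothetical and only serve to illustrate the idea.

```python
import math

def linear_predictor(theta, x):
    """eta = theta^T x, the third GLM assumption."""
    return sum(t * v for t, v in zip(theta, x))

# Mean of y as a function of the natural parameter eta for each family.
RESPONSE = {
    "gaussian": lambda eta: eta,                            # E[y] = mu = eta
    "bernoulli": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # E[y] = phi
}

def glm_predict(theta, x, family):
    """h(x) = E[y | x], the second GLM assumption with T(y) = y."""
    return RESPONSE[family](linear_predictor(theta, x))

theta = [1.0, 2.0]
x = [1.0, 0.5]  # first entry is the constant 1 for the intercept

print(glm_predict(theta, x, "gaussian"))   # linear regression prediction
print(glm_predict(theta, x, "bernoulli"))  # logistic regression probability
```

Only the mean function changes between the two models; the linear part is shared, which is the sense in which both are special cases of one framework.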


References:

[1] Andrew Ng, CS229 lecture notes, http://cs229.stanford.edu/notes/cs229-notes1.pdf

[2] Other online resources
