Machine Learning Algorithm Summary (8): Generalized Linear Models (Linear Regression, Logistic Regression)

Source: Internet
Author: User

Logistic regression and linear regression are both special cases of the generalized linear model (GLM). This article explains why.

1. Exponential family distribution

The exponential family of distributions is not the same thing as the exponential distribution. Many common probability distributions can be written in exponential-family form, including the Gaussian, Bernoulli, multinomial, and Poisson distributions. An exponential-family distribution has the form

$$p(y; \eta) = b(y) \exp\left(\eta^T T(y) - a(\eta)\right)$$

Here η is the natural parameter, T(y) is the sufficient statistic, and the factor exp(−a(η)) normalizes the distribution. Once T, a, and b are fixed, they define a family of distributions parameterized by η.

Many familiar probability distributions in statistics are specific instances of the exponential family. Here we work through the Bernoulli and Gaussian distributions, and use them to derive the expressions for logistic regression and linear regression.

  1) Bernoulli distribution

We first write the Bernoulli distribution with mean φ in exponential-family form:

$$p(y; \phi) = \phi^{y} (1-\phi)^{1-y} = \exp\left(y \log\phi + (1-y)\log(1-\phi)\right) = \exp\left(y \log\frac{\phi}{1-\phi} + \log(1-\phi)\right)$$

Matching each term against the exponential-family form gives

$$\eta = \log\frac{\phi}{1-\phi}, \quad T(y) = y, \quad a(\eta) = -\log(1-\phi) = \log\left(1 + e^{\eta}\right), \quad b(y) = 1$$

Solving η = log(φ / (1 − φ)) for φ gives an expression that is exactly the sigmoid function:

$$\phi = \frac{1}{1 + e^{-\eta}}$$
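
As a quick numerical check of the Bernoulli derivation, the following Python sketch evaluates the probability both directly and in exponential-family form, and confirms that the sigmoid inverts η = log(φ / (1 − φ)). The function names and the value φ = 0.3 are illustrative choices, not part of the original article.

```python
import math

def bernoulli_pmf(y, phi):
    """Bernoulli probability of y in {0, 1} with mean phi."""
    return phi ** y * (1.0 - phi) ** (1 - y)

def bernoulli_exp_family(y, eta):
    """Same probability written as b(y) * exp(eta * T(y) - a(eta)),
    with T(y) = y, a(eta) = log(1 + e^eta), and b(y) = 1."""
    a = math.log(1.0 + math.exp(eta))
    return math.exp(eta * y - a)

def logit(phi):
    """Natural parameter eta = log(phi / (1 - phi))."""
    return math.log(phi / (1.0 - phi))

def sigmoid(eta):
    """Inverse of logit: phi = 1 / (1 + e^-eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

phi = 0.3            # arbitrary example value
eta = logit(phi)
for y in (0, 1):
    # Both forms should agree up to floating-point error.
    assert math.isclose(bernoulli_pmf(y, phi), bernoulli_exp_family(y, eta))
assert math.isclose(sigmoid(eta), phi)
```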

  2) Gaussian distribution

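We now write the Gaussian distribution in exponential-family form. To simplify the derivation we assume the variance is 1 (σ² = 1):

$$p(y; \mu) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2}\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{y^2}{2}\right) \exp\left(\mu y - \frac{\mu^2}{2}\right)$$

Matching each term against the exponential-family form gives

$$\eta = \mu, \quad T(y) = y, \quad a(\eta) = \frac{\mu^2}{2} = \frac{\eta^2}{2}, \quad b(y) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{y^2}{2}\right)$$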
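
The same kind of numerical check works for the Gaussian case. This Python sketch compares the unit-variance Gaussian density with its exponential-family form, using T(y) = y, a(η) = η²/2, and b(y) = exp(−y²/2) / √(2π). The function names and test points are assumptions made for illustration.

```python
import math

def gaussian_pdf(y, mu):
    """Density of N(mu, 1) at y."""
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2.0 * math.pi)

def gaussian_exp_family(y, eta):
    """Same density written as b(y) * exp(eta * T(y) - a(eta))."""
    b = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    a = 0.5 * eta * eta
    return b * math.exp(eta * y - a)

# Arbitrary (y, mu) pairs; for this distribution eta equals mu.
for y, mu in [(0.0, 0.0), (0.7, 1.5), (-2.0, 0.3)]:
    assert math.isclose(gaussian_pdf(y, mu), gaussian_exp_family(y, mu))
```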

2. Generalized linear model

Whether the task is classification or regression, we are modeling how a random variable y depends on an input x. Before deriving a generalized linear model, we make three assumptions:

1) Given x and θ, the distribution p(y | x; θ) belongs to the exponential family with natural parameter η

2) Given x, our goal is to predict the expected value of T(y). In most cases T(y) = y, so we want the hypothesis h(x) = E[y | x]

3) The natural parameter η is a linear function of the input x: η = θ^T x

With these three assumptions we can derive a model, and any model derived this way is called a generalized linear model.
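
The three assumptions translate into very little code. The sketch below is a minimal Python illustration, not a full GLM implementation: `linear_predictor`, `glm_predict`, and the `response` argument are hypothetical names, and the parameter and input values are made up. The response function maps η to the mean of the chosen distribution: the identity for a unit-variance Gaussian (where η = μ) and the sigmoid for a Bernoulli.

```python
import math

def linear_predictor(theta, x):
    """Assumption 3: eta = theta^T x."""
    return sum(t * xi for t, xi in zip(theta, x))

def glm_predict(theta, x, response):
    """Assumption 2: h(x) = E[y | x], obtained by mapping eta to the mean
    of the assumed distribution with a distribution-specific function."""
    return response(linear_predictor(theta, x))

def identity(eta):
    """Gaussian case: the mean equals eta."""
    return eta

def sigmoid(eta):
    """Bernoulli case: the mean is 1 / (1 + e^-eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

theta = [0.5, -1.0, 2.0]   # hypothetical parameters; theta[0] acts as an intercept
x = [1.0, 2.0, 0.5]        # hypothetical input with a leading 1 for the intercept

print(glm_predict(theta, x, identity))  # linear regression prediction
print(glm_predict(theta, x, sigmoid))   # logistic regression probability
```

Only the response function changes between the two models; the linear predictor is shared.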

  Least squares (linear regression)

Assume y | x; θ ~ N(μ, σ²), where μ may depend on x. Then

$$h_\theta(x) = E[y \mid x; \theta] = \mu = \eta = \theta^T x$$

Because the response follows a Gaussian distribution, its expected value is μ. The Gaussian derivation gives μ = η, and the third assumption gives η = θ^T x, so combining the three assumptions yields the linear regression hypothesis. In other words, linear regression is the generalized linear model whose response variable follows a Gaussian (normal) distribution.
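
The name least squares fits this model because, under the Gaussian assumption, maximizing the likelihood is the same as minimizing the sum of squared errors: with σ² = 1, the negative log-likelihood equals half the squared error plus a constant that does not depend on θ. The Python sketch below checks this on made-up data with a single input feature. The function names and the data are assumptions for illustration only.

```python
import math

def predict(theta, x):
    """Linear regression hypothesis h(x) = theta[0] + theta[1] * x."""
    return theta[0] + theta[1] * x

def sum_squared_error(theta, xs, ys):
    return sum((y - predict(theta, x)) ** 2 for x, y in zip(xs, ys))

def gaussian_neg_log_likelihood(theta, xs, ys):
    """Negative log-likelihood of the data under y ~ N(h(x), 1)."""
    const = 0.5 * math.log(2.0 * math.pi)
    return sum(const + 0.5 * (y - predict(theta, x)) ** 2 for x, y in zip(xs, ys))

def fit_least_squares(xs, ys):
    """Closed-form least-squares fit for one input feature."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return [mean_y - slope * mean_x, slope]

xs = [0.0, 1.0, 2.0, 3.0]      # made-up inputs
ys = [1.1, 2.9, 5.2, 6.8]      # made-up responses
theta = fit_least_squares(xs, ys)

# The two objectives differ only by a constant, so they share a minimizer.
offset = len(xs) * 0.5 * math.log(2.0 * math.pi)
for candidate in (theta, [0.0, 0.0], [1.0, 2.0]):
    nll = gaussian_neg_log_likelihood(candidate, xs, ys)
    sse = sum_squared_error(candidate, xs, ys)
    assert math.isclose(nll, 0.5 * sse + offset)
print(theta)
```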

  Logistic regression (LR)

Logistic regression addresses binary classification, y ∈ {0, 1}. For a binary response we assume y | x; θ ~ Bernoulli(φ), that is, the response variable follows a Bernoulli distribution. Then

$$h_\theta(x) = E[y \mid x; \theta] = \phi = \frac{1}{1 + e^{-\eta}} = \frac{1}{1 + e^{-\theta^T x}}$$

This shows where the logistic regression hypothesis comes from and why the sigmoid function appears in it: the sigmoid is the function that maps the natural parameter η of the Bernoulli distribution back to its mean φ.

3. Logistic regression

Logistic regression builds on linear regression, but it is a model for binary classification with output y ∈ {0, 1}. To produce such an output we apply the sigmoid function, which maps any real number into the range (0, 1). The sigmoid function and the resulting hypothesis are

$$g(z) = \frac{1}{1 + e^{-z}}$$

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
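
When the sigmoid is implemented directly from its formula, the exponential can overflow for large negative inputs. The Python sketch below is one common way to avoid that by branching on the sign of z. It is an implementation detail added here for illustration, not something the original article prescribes.

```python
import math

def sigmoid(z):
    """Sigmoid 1 / (1 + e^-z), arranged so that math.exp never receives
    a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)          # z < 0, so exp(z) is at most 1
    return ez / (1.0 + ez)

for z in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    value = sigmoid(z)
    assert 0.0 <= value <= 1.0
    print(z, value)
```

In floating point the extreme inputs round to exactly 0.0 and 1.0, even though the mathematical function never reaches either bound.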

Because logistic regression is a binary classification model whose response follows a Bernoulli distribution, the output is interpreted as a probability:

$$P(y = 1 \mid x; \theta) = h_\theta(x)$$

$$P(y = 0 \mid x; \theta) = 1 - h_\theta(x)$$

To simplify the analysis that follows, we combine the two cases into a single expression:

$$p(y \mid x; \theta) = h_\theta(x)^{y} \left(1 - h_\theta(x)\right)^{1-y}$$

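The training samples are events that have already occurred, and the maximum likelihood principle says to choose the parameters under which the observed data is most probable. We can therefore use maximum likelihood estimation to solve for the model parameters. Assuming the m training samples are independent, their joint probability is the likelihood function

$$L(\theta) = \prod_{i=1}^{m} p\left(y^{(i)} \mid x^{(i)}; \theta\right) = \prod_{i=1}^{m} h_\theta\left(x^{(i)}\right)^{y^{(i)}} \left(1 - h_\theta\left(x^{(i)}\right)\right)^{1 - y^{(i)}}$$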

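To simplify the calculation, we take the logarithm of the likelihood function:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta\left(x^{(i)}\right)\right) \right]$$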
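
To make the likelihood and its logarithm concrete, the following Python sketch computes both for a tiny made-up dataset and checks that they agree. The names `hypothesis`, `likelihood`, and `log_likelihood`, the data, and the parameter values are all assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h(x) = sigmoid(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def likelihood(theta, X, y):
    """Product over samples of h^y * (1 - h)^(1 - y)."""
    result = 1.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        result *= h ** yi * (1.0 - h) ** (1 - yi)
    return result

def log_likelihood(theta, X, y):
    """Sum over samples of y * log(h) + (1 - y) * log(1 - h)."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total

# Made-up data: each row is [1, feature], so theta[0] is the intercept.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]]
y = [1, 0, 1]
theta = [0.1, 0.8]

assert math.isclose(math.log(likelihood(theta, X, y)), log_likelihood(theta, X, y))
print(log_likelihood(theta, X, y))
```

With many samples the product of probabilities shrinks toward zero and can underflow, which is a practical reason, beyond algebraic convenience, to work with the sum of logarithms.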

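The log-likelihood ℓ(θ) is to be maximized, whereas a loss function is conventionally minimized, so we negate it and average over the m samples (scaling by a positive constant does not change the minimizer) to obtain the loss

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta\left(x^{(i)}\right)\right) \right]$$

Unlike linear regression, where the normal equation gives the parameters directly, setting the gradient of J(θ) to zero has no closed-form solution. We therefore use gradient descent: the negative gradient is the direction in which the loss decreases fastest, and we follow it iteratively to minimize J(θ).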
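
A minimal gradient descent loop for this loss might look like the Python sketch below. It relies on the standard result that the partial derivative of J with respect to θ_j is (1/m) Σ (h(x_i) − y_i) x_ij, which the article does not derive. The learning rate `alpha`, the step count, the function names, and the data are assumptions chosen for illustration, and the loop has no convergence test.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(theta, x):
    """h(x) = sigmoid(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def loss(theta, X, y):
    """J(theta): negative log-likelihood averaged over the samples."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for xi, yi in zip(X, y):
        h = min(max(predict_proba(theta, xi), eps), 1.0 - eps)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / len(X)

def gradient(theta, X, y):
    """dJ/dtheta_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij."""
    m = len(X)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = predict_proba(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += err * xij / m
    return grad

def gradient_descent(X, y, alpha=0.1, steps=2000):
    """Repeatedly step against the gradient, starting from all zeros."""
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = gradient(theta, X, y)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Made-up, non-separable data: each row is [1, feature].
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

theta = gradient_descent(X, y)
print(theta, loss(theta, X, y))
```

The data is deliberately not linearly separable. On separable data the likelihood has no finite maximizer, so the parameters would keep growing as the number of steps increases.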
