Generalized Linear Models (GLM)

In the linear regression problem we assume that y given x follows a Gaussian distribution, and in the classification problem we assume that it follows a Bernoulli distribution; both are examples of generalized linear models. A generalized linear model relates the dependent variable to a linear predictor function of the independent variables. Many models are based on generalized linear models, such as the traditional linear regression model, the maximum entropy model, logistic regression, and Softmax regression.

The Exponential Family

Before studying generalized linear models, let us first look at the exponential family (the exponential family).

The general form of the exponential family is:

p(y; η) = b(y) exp(ηᵀ T(y) − a(η))

If a distribution can be expressed in the form above, then it belongs to the exponential family. First, let us define the symbols appearing in this form:

η: the natural parameter of the distribution (natural parameter), also called the canonical parameter.

T(y): the sufficient statistic; usually T(y) = y.

a(η): the log partition function.

e^(−a(η)): essentially plays the role of a normalization constant, ensuring that the distribution sums (or integrates) to 1.

Once T, a, and b are fixed, they define a family of distributions parameterized by η; as we vary η, we obtain the different distributions within that exponential family.

Proving that the Bernoulli and Gaussian distributions belong to the exponential family

Consider the Bernoulli distribution with mean φ, written Bernoulli(φ), where y ∈ {0, 1}, so that p(y = 1; φ) = φ and p(y = 0; φ) = 1 − φ.

To compare it with the exponential family form, the Bernoulli distribution can be rewritten as:

p(y; φ) = φ^y (1 − φ)^(1 − y) = exp( y·log(φ/(1 − φ)) + log(1 − φ) )

Comparing with the general form gives η = log(φ/(1 − φ)). Expressing φ in terms of η, we get φ = 1/(1 + e^(−η)), which is exactly the sigmoid function.
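
For completeness, a sketch of the remaining pieces of the correspondence implied by the rewritten density above (T, a, and b here refer to the exponential family form defined earlier):

```latex
T(y) = y, \qquad
a(\eta) = -\log(1-\phi) = \log\bigl(1 + e^{\eta}\bigr), \qquad
b(y) = 1 .
```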

This shows that, with an appropriate choice of T, a, and b, the Bernoulli distribution can be written in the exponential family form; in other words, the Bernoulli distribution is a member of the exponential family.
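
As a quick, illustrative numeric check of this correspondence (the mapping values follow the sketch above; the particular value of φ is an arbitrary assumption):

```python
import numpy as np

# Assumed mapping from the sketch above:
# eta = log(phi / (1 - phi)), T(y) = y, a(eta) = log(1 + e^eta), b(y) = 1.
phi = 0.3
eta = np.log(phi / (1 - phi))
a = np.log(1 + np.exp(eta))

for y in (0, 1):
    bernoulli_pmf = phi**y * (1 - phi)**(1 - y)            # p(y; phi)
    exp_family_form = 1 * np.exp(eta * y - a)              # b(y) * exp(eta * T(y) - a(eta))
    print(y, np.isclose(bernoulli_pmf, exp_family_form))   # True for both values of y
```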

In the same vein, for the Gaussian distribution N(μ, σ²) we have:

p(y; μ, σ²) = (1/(σ√(2π))) · exp( −(y − μ)²/(2σ²) )

Because the variance of the Gaussian distribution does not affect our choice of hypothesis function, we set σ² = 1 for ease of calculation. Comparing the resulting density with the exponential family form, we get the correspondence sketched below:
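
A sketch of that correspondence (with σ² = 1, following the exponential family form defined earlier):

```latex
p(y;\mu) = \frac{1}{\sqrt{2\pi}}\exp\!\Bigl(-\tfrac{1}{2}(y-\mu)^2\Bigr)
         = \underbrace{\frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}}_{b(y)}
           \exp\!\Bigl(\mu y - \tfrac{1}{2}\mu^2\Bigr),
\qquad\text{so}\qquad
\eta = \mu,\quad T(y) = y,\quad a(\eta) = \tfrac{\mu^2}{2} = \tfrac{\eta^2}{2}.
```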

This shows that the Gaussian distribution is also a member of the exponential family.

Constructing Generalized Linear Models (Constructing GLMs)

How do we construct a generalized linear model from the exponential family? To build a generalized linear model, we rely on the following three assumptions:

  1. Given the features x and parameters θ, the conditional distribution of y obeys an exponential family distribution, i.e. y | x; θ ~ ExponentialFamily(η).
  2. The prediction is the expectation of T(y), that is, we compute h(x) = E[T(y) | x]; since usually T(y) = y, this means h(x) = E[y | x].
  3. The natural parameter η and the inputs x are related linearly, that is, η = θᵀx.

Constructing the Least Squares Model

Recall that in linear regression the cost function is obtained by the least squares method. Let us now construct the least squares model from the generalized linear model.

In linear regression, we assume that y | x; θ obeys the Gaussian distribution N(μ, σ²). According to our previous derivation we know that μ = η, so, following the three assumptions, we have:

h_θ(x) = E[y | x; θ] = μ = η = θᵀx

Explanation of each equality:

The first equality follows from assumption 2: h(x) = E[y | x; θ] (since T(y) = y).

The second equality follows because the expectation of the Gaussian distribution N(μ, σ²) is μ.

The third equality follows from our earlier derivation, in which we assumed σ² = 1 and obtained μ = η.

The fourth equality follows from assumption 3: η = θᵀx.

At this point the least squares model is built; this is where the linear model used in linear regression comes from. The remaining task is to solve for θ using gradient descent or Newton's method.
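
As an illustration of that remaining task, here is a minimal sketch of fitting θ by batch gradient descent on the least-squares cost; the function name, synthetic data, learning rate, and iteration count are assumptions for the example, not part of the original derivation:

```python
import numpy as np

def fit_least_squares_gd(X, y, lr=0.1, n_iters=1000):
    """Batch gradient descent on the least-squares cost J(theta) = 1/2 * sum_i (theta^T x_i - y_i)^2."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        preds = X @ theta                # h_theta(x) = theta^T x  (the GLM mean, mu = eta)
        grad = X.T @ (preds - y) / m     # averaged gradient of J(theta)
        theta -= lr * grad
    return theta

# Illustrative usage with synthetic data (assumed, not from the text)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])    # intercept column + one feature
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.1, size=100)
print(fit_least_squares_gd(X, y))        # close to [1.0, 2.0]
```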

Building Logistic Regression

Logistic regression can be used to solve binary classification problems. The target variable of a binary classification problem takes discrete values, and from statistics we know that the Bernoulli distribution is the natural choice for constructing the logistic regression model.

In the previous derivation we obtained η = log(φ/(1 − φ)); expressing φ in terms of η gives φ = 1/(1 + e^(−η)). Then, according to the three assumptions, we have:
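
A sketch of the chain implied by the three assumptions (the same pattern as in the least squares case):

```latex
h_\theta(x) = \mathbb{E}[\,y \mid x;\theta\,]
            = \phi
            = \frac{1}{1+e^{-\eta}}
            = \frac{1}{1+e^{-\theta^{T}x}} .
```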

The construction is complete; this is exactly the model used in logistic regression.

Building Softmax Regression

Now we consider a multi-class classification problem, in which the response variable y takes k values, i.e. y ∈ {1, 2, ..., k}. First of all, we must show that the multinomial distribution is also a member of the exponential family.

The output of the multi-class model is the probability that a sample belongs to each of the k categories, and we can use φ1, ..., φk to represent these k output probabilities. They satisfy φ1 + ... + φk = 1, so this parameterization is somewhat redundant; we therefore use only φ1, ..., φ(k−1) as parameters, with φk = 1 − (φ1 + ... + φ(k−1)).

Define T(y) ∈ R^(k−1) as follows: T(1) = (1, 0, ..., 0)ᵀ, T(2) = (0, 1, 0, ..., 0)ᵀ, ..., T(k−1) = (0, ..., 0, 1)ᵀ, and T(k) = (0, 0, ..., 0)ᵀ.

Note:

here T(y) is not equal to y; it is a (k−1)-dimensional vector, not a real number. We write (T(y))_i for the i-th element of the vector T(y).

In addition, we introduce the indicator notation 1{·}: if the statement inside the braces is true, the expression equals 1, and otherwise it equals 0. For example, 1{2 = 3} = 0 and 1{3 = 5 − 2} = 1.

So the relationship between T(y) and y can be expressed as (T(y))_i = 1{y = i},

and the relationship between the expectation of T(y) and the class probabilities can be expressed as E[(T(y))_i] = P(y = i) = φ_i.

So we have:
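
A sketch of the chain of rewrites, using the indicator notation and the definitions of T(y), η, a, and b given above:

```latex
p(y;\phi) = \phi_1^{\,1\{y=1\}}\,\phi_2^{\,1\{y=2\}}\cdots\phi_k^{\,1\{y=k\}}
          = \phi_1^{\,(T(y))_1}\cdots\phi_{k-1}^{\,(T(y))_{k-1}}\,
            \phi_k^{\,1-\sum_{i=1}^{k-1}(T(y))_i}
          = b(y)\exp\!\bigl(\eta^{T}T(y) - a(\eta)\bigr),
\quad\text{where}\quad
\eta = \Bigl(\log\tfrac{\phi_1}{\phi_k},\,\dots,\,\log\tfrac{\phi_{k-1}}{\phi_k}\Bigr)^{T},\;
a(\eta) = -\log\phi_k,\; b(y) = 1 .
```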

So the multinomial distribution can also be written in the exponential family form; in other words, the multinomial distribution is also a member of the exponential family, and we can therefore use a generalized linear model to fit it.

From the expression for η we get η_i = log(φ_i/φ_k). This expresses η_i in terms of φ_i; converting it to express φ_i in terms of η_i, and for convenience defining η_k = log(φ_k/φ_k) = 0, we have e^(η_i) = φ_i/φ_k, so φ_k · Σ_{j=1}^{k} e^(η_j) = Σ_{j=1}^{k} φ_j = 1 and therefore φ_k = 1/Σ_{j=1}^{k} e^(η_j).

So we find φ_i = e^(η_i) / Σ_{j=1}^{k} e^(η_j). This function, which maps the η_i to the φ_i, is called the Softmax function.
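
A minimal Python sketch of the Softmax function (subtracting the maximum is only a standard numerical-stability trick; it does not change the result):

```python
import numpy as np

def softmax(eta):
    """Map natural parameters eta_1..eta_k to probabilities phi_i = exp(eta_i) / sum_j exp(eta_j)."""
    eta = np.asarray(eta, dtype=float)
    eta = eta - eta.max()                # shift by the max for numerical stability (result unchanged)
    exp_eta = np.exp(eta)
    return exp_eta / exp_eta.sum()

print(softmax([2.0, 1.0, 0.0]))          # e.g. with eta_k = 0 as in the text; the outputs sum to 1
```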

Now let us construct the model using the generalized linear model assumptions.

According to assumption 3, we have η_i = θ_iᵀx (for i = 1, ..., k−1), where θ_1, ..., θ_(k−1) ∈ R^(n+1). Here we also define θ_k = 0, so that η_k = θ_kᵀx = 0.

So, in this model, the distribution of y given x is:

p(y = i | x; θ) = φ_i = e^(θ_iᵀx) / Σ_{j=1}^{k} e^(θ_jᵀx)

This model, applied to multi-class classification, is called Softmax regression; it is a generalization of logistic regression.

For the hypothesis function, from assumption 2 we obtain:
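
A sketch of the resulting hypothesis function (it outputs the k−1 estimated class probabilities; the probability of class k follows from the fact that the probabilities sum to 1):

```latex
h_\theta(x) = \mathbb{E}[\,T(y)\mid x;\theta\,]
            = \begin{pmatrix}\phi_1\\ \vdots\\ \phi_{k-1}\end{pmatrix}
            = \begin{pmatrix}
                \dfrac{e^{\theta_1^{T}x}}{\sum_{j=1}^{k} e^{\theta_j^{T}x}}\\[2ex]
                \vdots\\[1ex]
                \dfrac{e^{\theta_{k-1}^{T}x}}{\sum_{j=1}^{k} e^{\theta_j^{T}x}}
              \end{pmatrix}.
```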

So the final step is to fit the parameters θ. Maximum likelihood estimation gives the log-likelihood

ℓ(θ) = Σ_{i=1}^{m} log p(y^(i) | x^(i); θ) = Σ_{i=1}^{m} log Π_{l=1}^{k} ( e^(θ_lᵀx^(i)) / Σ_{j=1}^{k} e^(θ_jᵀx^(i)) )^(1{y^(i) = l})

Maximizing this likelihood to obtain the optimal parameters θ can be done, as described earlier, using gradient ascent or Newton's method.
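
A minimal sketch of fitting Softmax regression by gradient ascent on that log-likelihood, with θ_k fixed to 0 as above; the function names, synthetic data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

def softmax_probs(Theta, X):
    """p(y = i | x; theta) = exp(theta_i^T x) / sum_j exp(theta_j^T x), with theta_k fixed to 0."""
    logits = X @ np.vstack([Theta, np.zeros(Theta.shape[1])]).T   # shape (m, k); last column uses theta_k = 0
    logits -= logits.max(axis=1, keepdims=True)                   # numerical stability
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=1, keepdims=True)

def fit_softmax_regression(X, y, k, lr=0.5, n_iters=2000):
    """Gradient ascent on l(theta) = sum_i log p(y_i | x_i; theta)."""
    m, n = X.shape
    Theta = np.zeros((k - 1, n))                 # rows are theta_1 .. theta_{k-1}
    Y = np.eye(k)[y]                             # one-hot encoding, i.e. the indicator 1{y = i}
    for _ in range(n_iters):
        P = softmax_probs(Theta, X)              # shape (m, k)
        grad = (Y - P)[:, : k - 1].T @ X / m     # gradient of the averaged log-likelihood
        Theta += lr * grad                       # ascent step
    return Theta

# Illustrative usage with synthetic data (assumed, not from the text)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(300), rng.normal(size=(300, 2))])    # intercept + two features
y = rng.integers(0, 3, size=300)                                  # class labels in {0, 1, 2}
Theta = fit_softmax_regression(X, y, k=3)
print(softmax_probs(Theta, X[:3]))                                # each row sums to 1
```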
