Machine Learning Study Notes (4)--Regression problem Summary: Generalized linear model

Last Update:2015-05-16 Source: Internet

Author: User


This series of articles may be reprinted; when reproducing, please keep the full text.

Please read first: "Description & General Catalogue" http://www.cnblogs.com/tbcaaa8/p/4415055.html

1. Introduction to the exponential family

In the previous articles, we introduced the regression models for a dependent variable that follows a Gaussian, Bernoulli, Poisson, or multinomial distribution. This article explains what these models have in common and generalizes them.

First, an informal definition of the exponential family:

Definition: if the distribution of a variable $y$ can be written as $p(y;\eta) = b(y)\exp\left(\eta^T T(y) - a(\eta)\right)$, where $\eta$ is the parameter of the distribution, then $y$ is said to follow a distribution in the exponential family.

The author's knowledge here is limited, so this article does not explain the exponential family in much depth. If you are interested in its properties and in the meaning of $a(\eta)$, $b(y)$, and $T(y)$, please refer to Wikipedia.

Wikipedia reference for the exponential family: http://en.wikipedia.org/wiki/Exponential_family
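To make the definition a little more concrete, here is a minimal sketch for scalar $\eta$ and a finite discrete support. It illustrates one standard property: $a(\eta)$ acts as a log-normalizer, so the probabilities sum to 1. The helper names (`exp_family_prob`, `log_normalizer`) and the toy support {0, 1} are assumptions made for illustration, not part of the original article.

```python
import math

def exp_family_prob(y, eta, b, T, a):
    """Evaluate p(y; eta) = b(y) * exp(eta * T(y) - a(eta)) for scalar eta."""
    return b(y) * math.exp(eta * T(y) - a(eta))

def log_normalizer(eta, support, b, T):
    """a(eta) = log of the sum over the support of b(y) * exp(eta * T(y)).

    Only valid for a finite discrete support (an assumption of this sketch).
    """
    return math.log(sum(b(y) * math.exp(eta * T(y)) for y in support))

# Hypothetical toy example: support {0, 1}, b(y) = 1, T(y) = y.
support = [0, 1]
b = lambda y: 1.0
T = lambda y: float(y)
a = lambda eta: log_normalizer(eta, support, b, T)

eta = 0.7
total = sum(exp_family_prob(y, eta, b, T, a) for y in support)
print(total)  # the probabilities over the support should sum to 1 (up to rounding)
```

With this particular choice of support, $b$, and $T$, the computed $a(\eta)$ coincides with $\ln(1+e^{\eta})$, which is the Bernoulli case treated in section 2.2.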

2. From special to general: verifying that the four distributions above belong to the exponential family

2.1 Gaussian distribution

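Take the univariate Gaussian distribution as an example. Its probability density function is:

$$p(y;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{y^2}{2\sigma^2}\right)\exp\left(\frac{\mu}{\sigma^2}\,y - \frac{\mu^2}{2\sigma^2}\right)$$

Comparing with the definition of the exponential family (and treating $\sigma^2$ as a known constant), we find:

$$\eta = \frac{\mu}{\sigma^2},\qquad T(y) = y,\qquad a(\eta) = \frac{\mu^2}{2\sigma^2} = \frac{\sigma^2\eta^2}{2},\qquad b(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{y^2}{2\sigma^2}\right)$$

So the Gaussian distribution belongs to the exponential family.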
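As a quick numerical sanity check, the sketch below compares the usual Gaussian density with an exponential-family form. It assumes $\sigma^2$ is treated as a known constant and uses the decomposition $\eta = \mu/\sigma^2$, $T(y) = y$, $a(\eta) = \sigma^2\eta^2/2$, $b(y) = \exp(-y^2/(2\sigma^2))/(\sqrt{2\pi}\sigma)$. The function names are illustrative only.

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Standard univariate Gaussian density."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def gaussian_exp_family(y, mu, sigma):
    """Same density written as b(y) * exp(eta * T(y) - a(eta)), sigma assumed known."""
    eta = mu / sigma ** 2                      # natural parameter
    t = y                                      # sufficient statistic T(y)
    a = sigma ** 2 * eta ** 2 / 2              # equals mu^2 / (2 sigma^2)
    b = math.exp(-y ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return b * math.exp(eta * t - a)

# Compare the two forms on a few arbitrary points.
for y, mu, sigma in [(0.3, 1.0, 2.0), (-1.5, 0.0, 1.0), (4.0, 2.5, 0.7)]:
    print(gaussian_pdf(y, mu, sigma), gaussian_exp_family(y, mu, sigma))
```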

2.2 Bernoulli distribution

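The probability mass function of the Bernoulli distribution is:

$$p(y;\phi) = \phi^{y}(1-\phi)^{1-y} = \exp\left(y\ln\frac{\phi}{1-\phi} + \ln(1-\phi)\right),\qquad y\in\{0,1\}$$

Comparing with the definition of the exponential family, we find:

$$\eta = \ln\frac{\phi}{1-\phi},\qquad T(y) = y,\qquad a(\eta) = -\ln(1-\phi) = \ln\left(1+e^{\eta}\right),\qquad b(y) = 1$$

This shows that the Bernoulli distribution belongs to the exponential family.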
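The Bernoulli case can be checked numerically in the same way. This sketch assumes the decomposition $\eta = \ln(\phi/(1-\phi))$, $T(y) = y$, $a(\eta) = \ln(1+e^{\eta})$, $b(y) = 1$, and also shows that inverting $\eta(\phi)$ gives the sigmoid function used in logistic regression. Function names are hypothetical.

```python
import math

def bernoulli_pmf(y, phi):
    """Standard Bernoulli probability mass function, y in {0, 1}."""
    return phi ** y * (1 - phi) ** (1 - y)

def bernoulli_exp_family(y, phi):
    """Same pmf written as exp(eta * y - a(eta)) with b(y) = 1."""
    eta = math.log(phi / (1 - phi))            # natural parameter (log-odds)
    a = math.log(1 + math.exp(eta))            # equals -ln(1 - phi)
    return math.exp(eta * y - a)

def sigmoid(eta):
    """Inverse of eta = ln(phi / (1 - phi)): recovers phi from eta."""
    return 1.0 / (1.0 + math.exp(-eta))

phi = 0.3
for y in (0, 1):
    print(bernoulli_pmf(y, phi), bernoulli_exp_family(y, phi))
print(sigmoid(math.log(phi / (1 - phi))))      # should recover phi up to rounding
```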

2.3 Poisson Distribution

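The probability mass function of the Poisson distribution is:

$$p(y;\lambda) = \frac{\lambda^{y}e^{-\lambda}}{y!} = \frac{1}{y!}\exp\left(y\ln\lambda - \lambda\right),\qquad y = 0, 1, 2, \dots$$

Comparing with the definition of the exponential family, we find:

$$\eta = \ln\lambda,\qquad T(y) = y,\qquad a(\eta) = \lambda = e^{\eta},\qquad b(y) = \frac{1}{y!}$$

This verifies that the Poisson distribution belongs to the exponential family.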
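A similar check for the Poisson distribution, assuming the decomposition $\eta = \ln\lambda$, $T(y) = y$, $a(\eta) = e^{\eta}$, $b(y) = 1/y!$. Again, this is only an illustrative sketch with hypothetical function names.

```python
import math

def poisson_pmf(y, lam):
    """Standard Poisson probability mass function, y = 0, 1, 2, ..."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def poisson_exp_family(y, lam):
    """Same pmf written as b(y) * exp(eta * y - a(eta))."""
    eta = math.log(lam)                        # natural parameter
    a = math.exp(eta)                          # equals lambda
    b = 1.0 / math.factorial(y)
    return b * math.exp(eta * y - a)

lam = 2.5
for y in range(5):
    print(y, poisson_pmf(y, lam), poisson_exp_family(y, lam))
```

Inverting $\eta = \ln\lambda$ gives $\lambda = e^{\eta}$, which is why Poisson regression uses an exponential hypothesis function.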

2.4 Multinomial distribution

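Note: in the context of Softmax regression, the arguments $n_1, \dots, n_k$ of the multinomial distribution are restricted: exactly one of $n_1, \dots, n_k$ equals 1 and the rest are 0. This is the only way to meet the requirement of k-class classification. Under this restriction, the argument of the multinomial distribution can be converted from a k-dimensional vector to a scalar $y$ taking values in $\{1, \dots, k\}$, which simplifies the computation. In this case, the probability mass function of the multinomial distribution is:

$$p(y;\phi) = \prod_{i=1}^{k}\phi_i^{1\{y=i\}},\qquad \sum_{i=1}^{k}\phi_i = 1$$

At this point there is still a small gap between this expression and the exponential family form. Using $1\{y=k\} = 1 - \sum_{i=1}^{k-1}1\{y=i\}$, continue by converting the probability mass function to vector form:

$$p(y;\phi) = \exp\left(\sum_{i=1}^{k-1}1\{y=i\}\ln\frac{\phi_i}{\phi_k} + \ln\phi_k\right) = \exp\left(\begin{pmatrix}\ln(\phi_1/\phi_k)\\ \vdots\\ \ln(\phi_{k-1}/\phi_k)\end{pmatrix}^{T}\begin{pmatrix}1\{y=1\}\\ \vdots\\ 1\{y=k-1\}\end{pmatrix} + \ln\phi_k\right)$$

Does this look familiar? If $k=2$, the multinomial distribution degenerates into the Bernoulli distribution: the Bernoulli distribution is a special case of the multinomial distribution, and the multinomial distribution is a generalization of the Bernoulli distribution. Comparing with the definition of the exponential family, we find:

$$\eta_i = \ln\frac{\phi_i}{\phi_k},\qquad T(y) = \left(1\{y=1\}, \dots, 1\{y=k-1\}\right)^{T},\qquad a(\eta) = -\ln\phi_k = \ln\left(1+\sum_{i=1}^{k-1}e^{\eta_i}\right),\qquad b(y) = 1$$

This shows that the multinomial distribution belongs to the exponential family.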

For the indicator function 1{·}, refer to the previous article in this series; it is not explained again here. See also: http://www.cnblogs.com/tbcaaa8/p/4486297.html
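The restricted multinomial case can be checked numerically as well. The sketch below assumes class labels $y \in \{1, \dots, k\}$, takes class $k$ as the reference class, and uses $\eta_i = \ln(\phi_i/\phi_k)$, $T(y)_i = 1\{y=i\}$ for $i = 1, \dots, k-1$, $a(\eta) = \ln(1+\sum_i e^{\eta_i})$, $b(y) = 1$. It also shows that inverting $\eta$ recovers the softmax function. All function names are hypothetical.

```python
import math

def categorical_pmf(y, phi):
    """p(y) = phi_y for y in {1, ..., k}; phi is a list of k probabilities summing to 1."""
    return phi[y - 1]

def categorical_exp_family(y, phi):
    """Same pmf written as exp(eta^T T(y) - a(eta)) with class k as the reference."""
    k = len(phi)
    eta = [math.log(phi[i] / phi[k - 1]) for i in range(k - 1)]
    t = [1.0 if y == i + 1 else 0.0 for i in range(k - 1)]   # indicator vector T(y)
    a = math.log(1.0 + sum(math.exp(e) for e in eta))        # equals -ln(phi_k)
    return math.exp(sum(e * ti for e, ti in zip(eta, t)) - a)

def softmax_from_eta(eta):
    """Recover (phi_1, ..., phi_k) from the k-1 natural parameters (eta_k = 0 implied)."""
    exps = [math.exp(e) for e in eta] + [1.0]
    z = sum(exps)
    return [v / z for v in exps]

phi = [0.2, 0.5, 0.3]
for y in (1, 2, 3):
    print(y, categorical_pmf(y, phi), categorical_exp_family(y, phi))
eta = [math.log(p / phi[-1]) for p in phi[:-1]]
print(softmax_from_eta(eta))                   # should recover phi up to rounding
```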

3. Generalized linear model

The generalized linear model is based on the following three assumptions:

Assumption 1: the $y^{(i)} \mid x^{(i)}$ are mutually independent and follow the same distribution from the exponential family.

Assumption 2: $E\left(T(y^{(i)}) \mid x^{(i)}\right)$ is the parameter of the distribution that $y^{(i)} \mid x^{(i)}$ follows.

Assumption 3: the model is linear, i.e. $\eta = \theta^T x$.

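Based on the above assumptions, the log-likelihood function of a generalized linear model with $m$ samples can be expressed as follows:

$$\ell(\theta) = \sum_{i=1}^{m}\left[\ln b(y^{(i)}) + \eta^{(i)T}T(y^{(i)}) - a(\eta^{(i)})\right],\qquad \eta^{(i)} = \theta^T x^{(i)}$$

To maximize the likelihood function, take its derivative with respect to $\theta_j$. For scalar $\eta$, using the exponential family identity $a'(\eta) = E\left(T(y)\right)$:

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m}\left(T(y^{(i)}) - a'(\eta^{(i)})\right)x_j^{(i)} = \sum_{i=1}^{m}\left(T(y^{(i)}) - E\left(T(y^{(i)}) \mid x^{(i)}\right)\right)x_j^{(i)}$$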

For linear regression, logistic regression, Poisson regression, and Softmax regression, substituting the relevant parameters into this equation yields the derivative of the corresponding loss function (the loss function itself matters less than its derivative). The problem can then be solved by gradient descent (or ascent), or directly by Newton's method.

Note: in linear regression the derivation differs slightly because of the parameter $\sigma^2$, but the result is consistent. The equation above may differ from the derivative of the loss function by a sign, depending on how the loss function is defined.
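The gradient-based fitting described above can be sketched as follows, under the assumptions that $\eta$ is scalar and $T(y) = y$, so that the gradient of the log-likelihood is $\sum_i (y^{(i)} - h_\theta(x^{(i)}))\,x^{(i)}$. The helper names (`score`, `fit_glm`), the learning rate, the step count, and the tiny dataset are all made up for illustration; logistic regression is used as the example by passing the sigmoid as the mean function.

```python
import math

def dot(theta, x):
    return sum(t * xi for t, xi in zip(theta, x))

def score(theta, xs, ys, mean_fn):
    """Gradient of the log-likelihood: sum_i (y_i - mean_fn(theta^T x_i)) * x_i."""
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        residual = y - mean_fn(dot(theta, x))
        for j, xj in enumerate(x):
            grad[j] += residual * xj
    return grad

def fit_glm(xs, ys, mean_fn, lr=0.1, steps=2000):
    """Plain batch gradient ascent on the log-likelihood (no line search)."""
    theta = [0.0] * len(xs[0])
    for _ in range(steps):
        g = score(theta, xs, ys, mean_fn)
        theta = [t + lr * gj for t, gj in zip(theta, g)]
    return theta

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical toy data: first feature is a constant 1 (intercept term).
xs = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 0, 1, 0, 1, 1]

theta = fit_glm(xs, ys, sigmoid)
print(theta)
print(score(theta, xs, ys, sigmoid))   # close to zero at the maximum
```

Swapping `sigmoid` for `math.exp` would give the same update rule for Poisson regression on count data, although the learning rate would probably need retuning. Newton's method would converge in far fewer iterations but needs the second derivative as well.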

At this point, the generalized linear model problem is basically solved, but some details remain. For example, what is the hypothesis function $h_\theta(x)$ that appears in linear regression, logistic regression, Poisson regression, and Softmax regression?

In the exponential family the unknown parameter is $\eta$, while the parameter we want to estimate is the weight vector $\theta$. The function $h_\theta(x)$ formally connects the two, and is therefore called the link function. In a generalized linear model, the parameter $\eta$ is actually a function of a parameter of the probability distribution (such as the Gaussian parameter $\mu$ or the Bernoulli parameter $\phi$), for example $\eta = \eta(\mu)$, and the link function used here is its inverse, $\mu = \eta^{-1}(\eta)$. (In much of the GLM literature, the name "link function" is given to $\eta(\mu)$ itself, and its inverse is called the mean or response function.) After solving for the inverse function, substitute $\eta = \theta^T x$ into it to obtain $h_\theta(x)$.

So, for a given regression model, is the choice of link function unique?

In general, $\eta^{-1}$ should be chosen as the link function. It is also possible to choose a similar function instead: for example, in logistic regression a hyperbolic tangent (rescaled to the interval (0, 1)) can be used in place of the sigmoid function, but the resulting model is then no longer the standard regression model.
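As a rough illustration of how $h_\theta(x)$ is obtained, the sketch below inverts $\eta(\cdot)$ for each of the four models and substitutes $\eta = \theta^T x$. It assumes $\sigma^2 = 1$ for the Gaussian case (so that $\mu = \eta$) and takes the last class as the reference class in Softmax regression. The function names and the layout of `thetas` as a list of $k-1$ weight vectors are assumptions of this example.

```python
import math

def dot(theta, x):
    return sum(t * xi for t, xi in zip(theta, x))

def h_linear(theta, x):
    """Gaussian with sigma^2 = 1 (assumed): mu = eta, so h is the identity of theta^T x."""
    return dot(theta, x)

def h_logistic(theta, x):
    """Bernoulli: phi = 1 / (1 + exp(-eta)), the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-dot(theta, x)))

def h_poisson(theta, x):
    """Poisson: lambda = exp(eta)."""
    return math.exp(dot(theta, x))

def h_softmax(thetas, x):
    """Multinomial: phi_i = exp(eta_i) / (1 + sum_j exp(eta_j)); eta_k = 0 for the reference class."""
    exps = [math.exp(dot(theta, x)) for theta in thetas] + [1.0]
    z = sum(exps)
    return [v / z for v in exps]

x = [1.0, 2.0]
print(h_linear([0.5, -1.0], x))
print(h_logistic([0.5, -1.0], x))
print(h_poisson([0.5, -1.0], x))
print(h_softmax([[0.5, -1.0], [0.0, 0.3]], x))
```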

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
