Poisson regression model

Source: Internet
Author: User

The Poisson regression model is another method for analyzing contingency tables and categorical data. It is in fact one of the log-linear models; the difference is that the log-linear model assumes the cell frequencies follow a multinomial distribution, whereas the Poisson regression model assumes they follow a Poisson distribution.

First, let's get to know the Poisson distribution:

The concept and practical significance of the Poisson distribution:

We know that the binomial distribution is one of the most important discrete probability distributions. Its limiting form, when p is very small and n is very large, is the Poisson distribution, which is also a very important distribution.

In the real world, many chance phenomena can be described by the Poisson distribution.

The idea behind the Poisson distribution is this: if the probability p that a certain phenomenon occurs is very small and the number of trials n is very large, the binomial distribution approaches the Poisson distribution. The Poisson distribution is therefore derived from the binomial distribution.
The derivation substitutes p = λ/n into the binomial probability and lets n → +∞.
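A standard way to write this limit is sketched below. It assumes the substitution p = λ/n with λ held fixed as n grows; the intermediate steps are a textbook reconstruction and not necessarily the author's exact derivation.

```latex
\begin{aligned}
P(X = k) &= \binom{n}{k} p^{k} (1-p)^{n-k}, \qquad p = \frac{\lambda}{n} \\
&= \frac{n(n-1)\cdots(n-k+1)}{k!} \cdot \frac{\lambda^{k}}{n^{k}} \left(1 - \frac{\lambda}{n}\right)^{n-k} \\
&= \frac{\lambda^{k}}{k!} \cdot \frac{n(n-1)\cdots(n-k+1)}{n^{k}} \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-k} \\
&\xrightarrow[n \to \infty]{} \frac{\lambda^{k}}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1 = \frac{\lambda^{k} e^{-\lambda}}{k!}
\end{aligned}
```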


So the probability function of the Poisson distribution is

P(X = k) = λ^k e^(-λ) / k!,  k = 0, 1, 2, ...

If the probability that a random variable X takes the value k satisfies the formula above, X is said to follow a Poisson distribution with parameter λ.
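The Poisson probability function can be evaluated numerically in a few lines. The sketch below uses only the Python standard library; the function name `poisson_pmf` and the value λ = 2.5 are illustrative choices, not part of the original text.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), i.e. lam**k * exp(-lam) / k!.

    Assumes lam > 0.
    """
    if k < 0:
        return 0.0
    # Work in log space so a large k does not overflow the factorial.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 2.5  # illustrative parameter
probs = [poisson_pmf(k, lam) for k in range(60)]
total = sum(probs)  # the probabilities over all k should add up to about 1

for k in range(5):
    print(k, round(probs[k], 4))
print("sum of probabilities:", total)
```

Summing over k = 0..59 is enough here because the tail beyond that point is negligible for such a small λ.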

Let's explain the derivation starting from the binomial distribution:

If the probability of success in a single trial is p and the trial is repeated independently n times, the number of successes follows a binomial distribution. In n trials, the number of successes

can be 0, 1, 2, ..., n. Each trial succeeds with probability p and fails with probability 1-p, and the k successes can fall on any k of the n trials, which can happen in C(n, k) ways.
Multiplying these together gives the probability of exactly k successes:

P(X = k) = C(n, k) p^k (1-p)^(n-k)
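The binomial probability described here (number of arrangements times the probability of k successes times the probability of n-k failures) can be computed directly. This is a minimal sketch; the name `binomial_pmf` and the values n = 10, p = 0.3 are assumptions made for illustration.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if k < 0 or k > n:
        return 0.0
    ways = math.comb(n, k)  # number of ways to place k successes among n trials
    return ways * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * q for k, q in enumerate(probs))  # should be close to n * p

print("sum of probabilities:", sum(probs))
print("mean number of successes:", mean)
```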

Now consider a fixed period of time during which an event can occur at random at any moment. Divide this period into n very small time slices (n → +∞) and make the following assumptions:

1. Whether the event occurs in one time slice is independent of whether it occurs in any other, so each slice is equivalent to an independent trial.

2. Because n → +∞, each time slice of length 1/n is so short that the event cannot occur two or more times within it.

3. The product of the probability p that the event occurs in a single time slice and the number of time slices n, n*p = λ, is a constant. It indicates how often the event occurs during the period
and is called the overall mean, or the expected number of occurrences. This is where the substitution p = λ/n used above comes from.
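The three assumptions can be tried out with a small simulation: cut one period into many slices, let the event occur in each slice independently with probability λ/n, and count the occurrences. The sketch below is only an illustration; λ = 3, 200 slices, 5000 repetitions, and the seed are arbitrary choices.

```python
import random
import statistics

def count_events(n_slices: int, lam: float, rng: random.Random) -> int:
    """Simulate one period cut into n_slices; each slice holds at most one event."""
    p = lam / n_slices  # assumption 3: n * p = lam
    # Assumptions 1 and 2: slices are independent, each is a single yes/no trial.
    return sum(1 for _ in range(n_slices) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the run is repeatable
lam, n_slices, n_periods = 3.0, 200, 5000
counts = [count_events(n_slices, lam, rng) for _ in range(n_periods)]

sample_mean = statistics.fmean(counts)
sample_var = statistics.variance(counts)
print("sample mean:", sample_mean)
print("sample variance:", sample_var)
```

Both the sample mean and the sample variance should land near λ, which is what a Poisson distribution with parameter λ predicts.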

With these explanations, we can understand how the Poisson distribution is derived from the binomial distribution. In the language of probability: if the expected number of occurrences of an event is λ, the Poisson distribution gives the probability that the event occurs exactly k times in n independent trials, in the limit of large n.

The Poisson distribution can be seen as a limiting case of the binomial distribution. For trials with large n and small p, computing with the binomial distribution is cumbersome, and the Poisson distribution gives a simpler approximation. The Poisson distribution is also well suited to describing the number of random events that occur per unit of time: it ties the originally discrete counts of occurrences to continuous time, whereas the binomial distribution mainly describes the number of successes in n discrete trials.
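How good the approximation is for large n and small p can be checked numerically. The sketch below compares Binomial(n, λ/n) with Poisson(λ) point by point for growing n; the helper names and λ = 2 are illustrative assumptions, and both probability functions are evaluated in log space to avoid overflow.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials (requires 0 < p < 1)."""
    log_ways = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_ways + k * math.log(p) + (n - k) * math.log1p(-p))

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k occurrences (requires lam > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def max_abs_gap(n: int, lam: float) -> float:
    """Largest pointwise difference between Binomial(n, lam/n) and Poisson(lam)."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))

lam = 2.0  # illustrative value
gaps = {n: max_abs_gap(n, lam) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n grows with n*p held fixed, which is the sense in which the Poisson distribution simplifies the binomial calculation.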

II. Conditions of the Poisson distribution

1. n is large and p is small.
2. Events occur independently of each other, and each has the same probability of occurring.
3. The outcome is binary: the event either occurs or does not.

In fact, conditions 2 and 3 are also the conditions for the binomial distribution.

Properties of the Poisson distribution

1. The mean and the variance of the Poisson distribution are both equal to λ.
2. When λ is small, the Poisson distribution is skewed. As λ increases, it approaches a normal distribution and can be approximated by one. This convergence is fairly fast: the skewness is 1/√λ.
3. The Poisson distribution is additive: the sum of independent Poisson variables with parameters λ1 and λ2 follows a Poisson distribution with parameter λ1 + λ2.
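These properties can be verified numerically from the probability function. The sketch below sums the probabilities up to a cutoff to get the mean, variance, and skewness, and checks additivity by convolving two Poisson distributions; all names and parameter values are illustrative.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k occurrences (requires lam > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def moments(lam: float, k_max: int = 400):
    """Return (mean, variance, skewness) by summing the PMF up to k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * q for k, q in enumerate(probs))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    skew = sum((k - mean) ** 3 * q for k, q in enumerate(probs)) / var ** 1.5
    return mean, var, skew

# Property 1: mean and variance coincide.
mean, var, _ = moments(4.0)

# Property 2: skewness shrinks as lam grows.
skews = {lam: moments(lam)[2] for lam in (1.0, 10.0, 100.0)}

# Property 3: convolving Poisson(1.5) with Poisson(2.5) should match Poisson(4.0).
def convolved_pmf(k: int, lam1: float, lam2: float) -> float:
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

max_diff = max(abs(convolved_pmf(k, 1.5, 2.5) - poisson_pmf(k, 4.0)) for k in range(30))

print("mean, variance:", mean, var)
print("skewness by lambda:", skews)
print("largest additivity gap:", max_diff)
```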

========================================

After describing the Poisson distribution, we describe the Poisson regression model.

With one explanatory variable x, the following regression model can be written:
g(μ) = β0 + β1x
Here g is the link function. If it is the logarithm, the model becomes
ln(μ) = β0 + β1x
The structure of this model is very similar to that of the ordinary linear regression model. If the dependent variable y follows a Poisson distribution, the model is called a Poisson regression model.
The Poisson regression model describes the relationship between the mean μ of a dependent variable y that follows a Poisson distribution and the covariates x1, ..., xm.
If the exposure (the number of observation units behind each count) differs from cell to cell, the counts must be put on a common base by modeling the rate instead:
ln(μ/n) = β0 + β1x
where n is the number of observation units in the corresponding cell.
Rearranging gives
ln(μ) = ln(n) + β0 + β1x
The term ln(n) is called an offset. It enters the model with a fixed coefficient of 1 and removes the effect of unequal numbers of observation units.
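The effect of the offset can be seen in a small calculation: two cells with the same covariate value but different numbers of observation units get different expected counts, yet the same rate. The sketch assumes a model with a single intercept `b0` and slope `b1`; these names, the coefficient values, and the exposures are hypothetical.

```python
import math

def expected_count(n_units: float, b0: float, b1: float, x: float) -> float:
    """Expected count mu from ln(mu) = ln(n) + b0 + b1 * x."""
    return math.exp(math.log(n_units) + b0 + b1 * x)

# Hypothetical coefficients and exposures, chosen only for illustration.
b0, b1 = -6.0, 0.4
small = expected_count(1_000, b0, b1, x=1.0)
large = expected_count(50_000, b0, b1, x=1.0)

rate_small = small / 1_000   # expected events per observation unit
rate_large = large / 50_000

print("expected counts:", small, large)
print("rates:", rate_small, rate_large)
```

The expected counts differ by the ratio of the exposures, while the rates agree, which is what the offset is meant to achieve.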

The parameters of the Poisson regression model are estimated by maximum likelihood, usually computed with iteratively reweighted least squares (IRLS).
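To make the estimation step concrete, here is a minimal pure-Python sketch of IRLS for a Poisson model with a log link, one covariate, and an offset. The function name `fit_poisson_irls`, the starting values, and the small data set are all hypothetical; in practice a statistics library would normally be used instead.

```python
import math

def fit_poisson_irls(x, y, exposure, max_iter=50, tol=1e-10):
    """Fit ln(mu) = ln(exposure) + b0 + b1 * x by IRLS; returns (b0, b1)."""
    offset = [math.log(n) for n in exposure]
    # Start from the pooled rate with no covariate effect.
    b0, b1 = math.log(sum(y) / sum(exposure)), 0.0
    for _ in range(max_iter):
        eta = [o + b0 + b1 * xi for o, xi in zip(offset, x)]
        mu = [math.exp(e) for e in eta]
        # Working response (offset removed) and weights for the log link.
        z = [e - o + (yi - m) / m for e, o, yi, m in zip(eta, offset, y, mu)]
        w = mu
        # Weighted least squares of z on (1, x) via the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = max(abs(new_b0 - b0), abs(new_b1 - b1)) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Hypothetical data: covariate, observation units per cell, observed counts.
x = [0, 1, 2, 3, 4, 5]
exposure = [100, 120, 80, 150, 90, 110]
y = [2, 4, 4, 11, 9, 16]

b0, b1 = fit_poisson_irls(x, y, exposure)
fitted = [n * math.exp(b0 + b1 * xi) for n, xi in zip(exposure, x)]
# At the maximum likelihood estimate both score equations should be near zero.
score_intercept = sum(yi - m for yi, m in zip(y, fitted))
score_slope = sum(xi * (yi - m) for xi, yi, m in zip(x, y, fitted))
print(round(b0, 3), round(b1, 3))
```

Each iteration is an ordinary weighted least squares fit, with weights and working response recomputed from the current fitted means. The two score sums at the end are a quick check that the iteration has reached the maximum likelihood solution.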
