EM Algorithm


The expectation-maximization (EM) algorithm, also called the maximum expectation algorithm.

In statistical computation, the expectation-maximization (EM) algorithm is a method for finding maximum likelihood estimates or maximum a posteriori estimates of parameters in probabilistic models, where the model depends on unobserved latent (hidden) variables.

The EM algorithm proceeds by alternating between two steps:

The first step computes the expectation (E): using the current parameter estimates, compute the expected value of the log-likelihood over the hidden variables;

The second step is the maximization (M): the expected log-likelihood computed in the E-step is maximized to obtain new values of the parameters.

The parameter estimates found in the M-step are then used in the next E-step, and the two steps keep alternating in this way.

Overall, the EM algorithm proceeds as follows:

1. Initialize the distribution parameters

2. Repeat until convergence:

E-step: estimate the expected values of the unknown (hidden) variables, given the current parameter estimates.

M-step: re-estimate the distribution parameters so as to maximize the likelihood of the data, given the expected values of the unknown variables from the E-step.
Intuitively, the EM algorithm works like this: suppose there are two unknown quantities A and B. If we knew A we could infer B, and if we knew B we could infer A. We first give A some initial value in order to obtain an estimate of B, then start from the current value of B to re-estimate A, and continue this process until it converges.

The EM algorithm is a method of maximum likelihood estimation for parameters. It can be used to compute the MLE of parameters from an incomplete data set, and it is a very simple and practical learning algorithm.

Suppose the complete data Z = (X, Y) consist of the observed data X and the unobserved (missing) data Y; X and Z are called the incomplete data and the complete data, respectively. Assume that the joint probability density of Z is parameterized as p(X, Y | θ), where θ denotes the parameters to be estimated. The maximum likelihood estimate of θ is obtained by maximizing the log-likelihood function L(θ; X) of the incomplete data:

L(θ; X) = log p(X | θ) = log ∫ p(X, Y | θ) dY.

The EM algorithm consists of two steps, the E-step and the M-step. It maximizes the log-likelihood of the incomplete data by iteratively maximizing the expectation of the log-likelihood of the complete data, Lc(θ; Z), where Lc(θ; Z) = log p(X, Y | θ).

Suppose the estimate of θ after the t-th iteration of the algorithm is θ(t). Then at iteration (t+1):

E-step: compute the expectation of the complete-data log-likelihood, denoted Q(θ | θ(t)) = E{Lc(θ; Z) | X; θ(t)};

M-step: obtain the new estimate θ(t+1) by maximizing Q(θ | θ(t)) over θ.

By alternating these two steps, the EM algorithm gradually improves the model parameters, increasing the likelihood of the training samples under those parameters, and finally terminates at a (local) maximum. Intuitively, the EM algorithm can also be seen as a successive-approximation algorithm:

At first the parameters of the model are unknown, so we can randomly choose a set of parameters, or roughly specify an initial parameter vector λ0, and determine the most probable state corresponding to this set of parameters; we then compute the probability of each possible outcome for each training sample, use the samples to correct the parameters in the current state and re-estimate λ, and re-determine the state of the model under the new parameters. Iterating in this way, the loop continues until some convergence condition is satisfied, so that the model parameters gradually approach the true parameters.
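
The claim above that each iteration increases the likelihood can be justified with a short derivation. The following sketch is added here for completeness (it is not part of the original article) and uses the observed data X, the hidden data Y, and the Q-function defined in the E-step above:

% Why an EM iteration cannot decrease the incomplete-data log-likelihood.
\begin{align*}
L(\theta; X) &= \log p(X \mid \theta)
              = \log p(X, Y \mid \theta) - \log p(Y \mid X, \theta) \\
% take the expectation of both sides over p(Y \mid X, \theta^{(t)});
% the left-hand side does not depend on Y:
L(\theta; X) &= Q(\theta \mid \theta^{(t)})
              - \mathbb{E}\!\left[\log p(Y \mid X, \theta) \mid X, \theta^{(t)}\right] \\
% the M-step makes the first term no smaller than at \theta^{(t)}, and by
% Gibbs' inequality the second expectation is largest at \theta = \theta^{(t)},
% so subtracting it cannot decrease the sum; hence
L(\theta^{(t+1)}; X) &\ge L(\theta^{(t)}; X).
\end{align*}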

The main purpose of the EM algorithm is to provide a simple iterative procedure for computing posterior density functions. Its greatest advantages are simplicity and stability, but it easily falls into local optima.

EM Algorithm (Expectation-Maximization Algorithm)

1. Introduction

The core idea of the EM algorithm is to maximize the likelihood function by iterating between estimating the expected values of the hidden variables and re-estimating the parameters.

2. Gaussian mixture models and the EM algorithm

2.1 The two-component Gaussian mixture EM algorithm

Assume we have data Y whose density is to be modeled as a mixture of two Gaussian distributions with parameters θ1 = (μ1, σ1²) and θ2 = (μ2, σ2²). The density of Y is:

gY(y) = (1 − π) φθ1(y) + π φθ2(y),   (1)

where φθ denotes the Gaussian density with parameters θ and π is the mixing proportion. The full parameter vector is:

θ = (π, θ1, θ2) = (π, μ1, σ1², μ2, σ2²).   (2)

The log-likelihood based on the N training observations is:

ℓ(θ; Z) = Σ_{i=1..N} log[ (1 − π) φθ1(yi) + π φθ2(yi) ].   (3)

Maximizing (3) directly is difficult because of the sum of terms inside the logarithm. We therefore introduce latent variables Δi taking the value 0 or 1: if Δi = 1, observation yi is taken from model 2; otherwise it is taken from model 1. The complete-data log-likelihood can then be written as:

ℓ0(θ; Z, Δ) = Σ_{i=1..N} [ (1 − Δi) log φθ1(yi) + Δi log φθ2(yi) ] + Σ_{i=1..N} [ (1 − Δi) log(1 − π) + Δi log π ].   (4)

If the Δi were known, the maximum likelihood estimates of μ1 and σ1² would be the sample mean and variance of the observations with Δi = 0, and the maximum likelihood estimates of μ2 and σ2² would be the sample mean and variance of the observations with Δi = 1.

Since the values of the Δi are actually unknown, we proceed iteratively, substituting for each Δi in (4) its conditional expectation:

γi(θ) = E(Δi | θ, Z) = Pr(Δi = 1 | θ, Z).   (5)

The quantity γi(θ) in (5) is also known as the responsibility of model 2 for observation i.

The EM algorithm for a two-component Gaussian mixture:

1. Initialize the parameters: the means μ1, μ2 can be set to two randomly chosen observations, the variances σ1², σ2² can both be set to the overall sample variance, and the mixing proportion π is taken to be 0.5.
2. Expectation step: compute the responsibilities:

γi = π φθ2(yi) / [ (1 − π) φθ1(yi) + π φθ2(yi) ],   i = 1, ..., N,

evaluated at the current parameter estimates; γi gives the probability that observation yi came from model 2.

3. Maximization step: compute the weighted means and variances:

μ1 = Σi (1 − γi) yi / Σi (1 − γi),   σ1² = Σi (1 − γi)(yi − μ1)² / Σi (1 − γi),
μ2 = Σi γi yi / Σi γi,   σ2² = Σi γi (yi − μ2)² / Σi γi,

and the mixing probability π = Σi γi / N, i.e. the average of the responsibilities over the data.

4. Repeat steps 2 and 3 until convergence.
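
As an illustration of the four steps above, here is a minimal NumPy sketch of the two-component procedure. It is only a sketch: the function name two_component_em and the synthetic data at the end are my own choices, not part of the original article.

import numpy as np
from scipy.stats import norm

def two_component_em(y, n_iter=100, tol=1e-8, seed=0):
    """EM for a two-component univariate Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize means from two random observations, variances from the
    # overall sample variance, and the mixing proportion at 0.5.
    mu1, mu2 = rng.choice(y, size=2, replace=False)
    var1 = var2 = np.var(y)
    pi = 0.5
    log_lik_old = -np.inf
    for _ in range(n_iter):
        # Step 2 (E-step): responsibilities of component 2 for each observation, eq. (5).
        p1 = (1 - pi) * norm.pdf(y, mu1, np.sqrt(var1))
        p2 = pi * norm.pdf(y, mu2, np.sqrt(var2))
        gamma = p2 / (p1 + p2)
        # Step 3 (M-step): weighted means, variances, and mixing proportion.
        mu1 = np.sum((1 - gamma) * y) / np.sum(1 - gamma)
        var1 = np.sum((1 - gamma) * (y - mu1) ** 2) / np.sum(1 - gamma)
        mu2 = np.sum(gamma * y) / np.sum(gamma)
        var2 = np.sum(gamma * (y - mu2) ** 2) / np.sum(gamma)
        pi = np.mean(gamma)
        # Step 4: stop when the observed-data log-likelihood, eq. (3), stops improving.
        log_lik = np.sum(np.log((1 - pi) * norm.pdf(y, mu1, np.sqrt(var1))
                                + pi * norm.pdf(y, mu2, np.sqrt(var2))))
        if abs(log_lik - log_lik_old) < tol:
            break
        log_lik_old = log_lik
    return mu1, var1, mu2, var2, pi

# Example on synthetic data drawn from two Gaussians.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 0.5, 200)])
print(two_component_em(y))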

2.2 The multi-component Gaussian mixture EM algorithm

The EM algorithm for a multi-component Gaussian mixture:

1. Initialize the parameters: the means μk, the covariance matrices Σk, and the mixing proportions πk.
2. Expectation step: compute the responsibilities:

γ(znk) = πk N(xn | μk, Σk) / Σ_{j=1..K} πj N(xn | μj, Σj),

where k = 1, ..., K and n = 1, ..., N.

3. Maximization step: compute the weighted means and covariances:

μk(new) = (1/Nk) Σn γ(znk) xn,
Σk(new) = (1/Nk) Σn γ(znk) (xn − μk(new)) (xn − μk(new))^T,
πk(new) = Nk / N,

where Nk = Σ_{n=1..N} γ(znk).

4. Compute the log-likelihood:

ln p(X | μ, Σ, π) = Σ_{n=1..N} ln[ Σ_{k=1..K} πk N(xn | μk, Σk) ].

Check whether the parameters or the log-likelihood have converged; if not, return to step 2.
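
A corresponding multivariate sketch in NumPy is given below. It is illustrative only; the function name gmm_em and the small ridge added to the covariances for numerical stability are my own choices.

import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, K, n_iter=200, tol=1e-6, seed=0):
    """EM for a K-component Gaussian mixture with full covariances (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Step 1: means from random data points, covariances from the data, equal mixing proportions.
    mu = X[rng.choice(n, size=K, replace=False)]
    cov = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    pi = np.full(K, 1.0 / K)
    log_lik_old = -np.inf
    for _ in range(n_iter):
        # Step 2 (E-step): responsibilities gamma[n, k] of component k for point x_n.
        dens = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], cov[k])
                                for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # Step 4: log-likelihood of the current parameters; stop if it has converged.
        log_lik = np.sum(np.log(dens.sum(axis=1)))
        if abs(log_lik - log_lik_old) < tol:
            break
        log_lik_old = log_lik
        # Step 3 (M-step): weighted means, covariances, and mixing proportions.
        Nk = gamma.sum(axis=0)
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            cov[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
        pi = Nk / n
    return mu, cov, pi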

3. General EM algorithm

Suppose we have a complete sample set D = {x1, ..., xn}, drawn from a particular distribution, and that part of each sample may be missing, so each sample x = (xg, xb) consists of observed (good) features xg and missing (bad) features xb. Denote the observed part of the data by Dg and the missing part by Db. Define the function:

Q(θ; θi) = E_Db[ ln p(Dg, Db; θ) | Dg; θi ].   (6)

The left-hand side of (6) is a function of θ, with θi held at a fixed value; the right-hand side is the expectation of the complete-data log-likelihood over the missing features, where θi is the current estimate of the parameters describing the whole distribution. The general EM algorithm can then be written as: start from an initial estimate θ0; at iteration i, the E-step computes Q(θ; θi) and the M-step sets θ(i+1) = argmax_θ Q(θ; θi); repeat until convergence.
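
In code, this general scheme is just a simple loop. The sketch below is a generic skeleton; e_step, m_step, and log_likelihood are hypothetical callables that must be supplied for a concrete model, not functions from any particular library.

from typing import Any, Callable

def em(theta0: Any,
       e_step: Callable[[Any], Any],          # hypothetical: builds Q(. ; theta_i) from the current estimate
       m_step: Callable[[Any], Any],          # hypothetical: returns the argmax over theta of that Q-function
       log_likelihood: Callable[[Any], float],
       max_iter: int = 100,
       tol: float = 1e-8) -> Any:
    """Generic EM loop: alternate E- and M-steps until the observed-data
    log-likelihood stops improving (a sketch, not a definitive implementation)."""
    theta = theta0
    old = log_likelihood(theta)
    for _ in range(max_iter):
        Q = e_step(theta)         # E-step: expectation of the complete-data log-likelihood
        theta = m_step(Q)         # M-step: maximize that expectation over theta
        new = log_likelihood(theta)
        if abs(new - old) < tol:  # stop when the improvement is negligible
            break
        old = new
    return theta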

4. Summary of EM algorithm

(1) EM converges to a local extremum, but does not guarantee convergence to the global optimum.

(2) Sensitive to initial values: a good, fast initialization procedure is usually required, for example:

initial values obtained by the method of moments;

in a GMM, initialization by k-means clustering (see the sketch after this list).

(3) Conditions under which EM is suitable:

the amount of missing data is not too large;

the data dimensionality is not too high (when the dimensionality is too high, the E-step computation becomes time-consuming).
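
As mentioned in point (2), k-means is a common way to obtain initial values for a GMM. The following scikit-learn sketch is illustrative: the array X and the choice of 3 components are assumptions, not from the original article, and scikit-learn's GaussianMixture already defaults to a k-means-style initialization, so this only makes the idea explicit.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.random.default_rng(0).normal(size=(500, 2))   # placeholder data

# Run k-means first and pass its centroids to the mixture model as initial means.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
gmm = GaussianMixture(n_components=3, means_init=km.cluster_centers_,
                      random_state=0).fit(X)
print(gmm.means_)
print(gmm.weights_)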
