One, EM algorithm
The EM algorithm is an iterative algorithm for maximum likelihood estimation of the parameters of probabilistic models that contain hidden (latent) variables. Let Y denote the observed random variable data and Z the hidden random variable data; Y and Z together are called the complete data.
The likelihood function of the observed data is:

$P(Y \mid \theta) = \sum_{Z} P(Z \mid \theta)\, P(Y \mid Z, \theta)$
The maximum likelihood estimate of the model parameter θ is:

$\hat{\theta} = \arg\max_{\theta} \log P(Y \mid \theta)$

Because the sum over the hidden variable sits inside the logarithm, this maximization has no closed-form solution and can only be solved iteratively. The iterative procedure of the EM algorithm is given below:
Step 1: select an appropriate initial value θ^(0) for the parameters and start the iteration.
Step 2 (E-step): compute the expectation. Let θ^(i) be the parameter estimate of the i-th iteration; at iteration i+1, compute the following Q function:

$Q(\theta, \theta^{(i)}) = E_{Z}\bigl[\log P(Y, Z \mid \theta) \mid Y, \theta^{(i)}\bigr] = \sum_{Z} \log P(Y, Z \mid \theta)\, P(Z \mid Y, \theta^{(i)})$

The Q function is the expectation of log P(Y, Z | θ) with respect to the conditional probability distribution P(Z | Y, θ^(i)) of the hidden variable Z given the observed data Y and the current parameter θ^(i).
Step 3 (M-step): maximize. Find the θ that maximizes the Q function, which determines the parameter estimate of the (i+1)-th iteration:

$\theta^{(i+1)} = \arg\max_{\theta} Q(\theta, \theta^{(i)})$
Step 4: repeat steps 2 and 3 until convergence.
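Expressed as code, the procedure is just a loop around these two steps. The following is a minimal Python sketch added here as an illustration (not part of the original text); the functions e_step and m_step, and the tolerance tol, are hypothetical placeholders to be supplied for a concrete model:

import numpy as np

def em(Y, theta0, e_step, m_step, tol=1e-4, max_iter=1000):
    # Generic EM skeleton; theta is assumed to be a NumPy array of parameters.
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        stats = e_step(Y, theta)       # E-step: expectations under P(Z | Y, theta)
        theta_new = m_step(Y, stats)   # M-step: maximize the Q function
        if np.max(np.abs(theta_new - theta)) < tol:   # parameters stopped moving
            return theta_new
        theta = theta_new
    return theta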
The EM algorithm is sensitive to the choice of initial values and cannot guarantee a globally optimal solution.

Two, Application to the Gaussian mixture model (GMM)
A Gaussian mixture model:

$P(y \mid \theta) = \sum_{k=1}^{K} w_k\, \phi(y \mid \theta_k)$

A multivariate (D-dimensional) Gaussian mixture model uses the component density:

$\phi(y \mid \theta_k) = \dfrac{1}{(2\pi)^{D/2} \lvert \Sigma_k \rvert^{1/2}} \exp\Bigl(-\tfrac{1}{2} (y - \mu_k)^{\mathsf{T}} \Sigma_k^{-1} (y - \mu_k)\Bigr)$

Here the w_k (k = 1, 2, ..., K) are the mixing coefficients and sum to 1, and Σ_k is the covariance matrix of the k-th component. The full parameter set is θ = (w_k, μ_k, Σ_k).
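For concreteness, the mixture density above can be evaluated numerically. The sketch below is an illustration added here (not from the original text); it assumes SciPy is available, and w, mu and sigma are arrays holding the mixing coefficients, means and covariance matrices:

import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(y, w, mu, sigma):
    # w: (K,) mixing coefficients, mu: (K, D) means, sigma: (K, D, D) covariances
    return sum(w[j] * multivariate_normal.pdf(y, mean=mu[j], cov=sigma[j])
               for j in range(len(w)))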
Given n observable data points y_i, they are generated as follows: first the k-th Gaussian model is selected with probability w_k, and that model then generates the observation y_i. The y_i are known, but which model j each y_i belongs to is unknown and is a hidden variable. Let z_ij denote this hidden variable, i.e. the probability that the i-th data point belongs to the j-th model. To derive the algorithm, first write out the likelihood function of the complete data, then determine the Q function, and finally take the partial derivatives with respect to w_k, μ_k and Σ_k and set them to 0. This yields the following EM algorithm for parameter estimation of a Gaussian mixture model (taking multi-dimensional data as an example):
Step 1: assign initial values to the parameters and start the iteration.
Step 2 (E-step): compute the expectation E[z_ij] that sample i belongs to model j:

$E[z_{ij}] = \dfrac{w_j\, \phi(y_i \mid \theta_j)}{\sum_{k=1}^{K} w_k\, \phi(y_i \mid \theta_k)}$
Step 3 (M-step): compute the model parameters of the new iteration:

$\mu_j = \dfrac{\sum_{i=1}^{n} E[z_{ij}]\, y_i}{\sum_{i=1}^{n} E[z_{ij}]}, \qquad w_j = \dfrac{1}{n} \sum_{i=1}^{n} E[z_{ij}], \qquad \Sigma_j = \dfrac{\sum_{i=1}^{n} E[z_{ij}]\, (y_i - \mu_j)(y_i - \mu_j)^{\mathsf{T}}}{\sum_{i=1}^{n} E[z_{ij}]}$
Step 4: repeat steps 2 and 3 until convergence.
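These two steps can also be written in vectorized form. The sketch below is an added illustration assuming NumPy/SciPy and a covariance shared by all components; the names and array shapes are illustrative and differ from the program in the next section:

import numpy as np
from scipy.stats import multivariate_normal

def e_step_vec(Y, w, mu, sigma):
    # Y: (n, D) data, w: (K,) coefficients, mu: (K, D) means, sigma: (D, D) shared covariance
    dens = np.column_stack([multivariate_normal.pdf(Y, mean=mu[j], cov=sigma)
                            for j in range(len(w))])
    r = w * dens                              # unnormalized responsibilities
    return r / r.sum(axis=1, keepdims=True)   # E[z_ij], rows sum to 1

def m_step_vec(Y, r):
    nk = r.sum(axis=0)             # effective number of points per component
    w = nk / len(Y)                # updated mixing coefficients
    mu = (r.T @ Y) / nk[:, None]   # updated means
    return w, mu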
Three, Python program example

This sample program randomly generates 500 two-dimensional data points from 4 Gaussian models with true parameters: mixing coefficients w = [0.1, 0.2, 0.3, 0.4], means u = [[5, 35], [30, 40], [20, 20], [45, 15]], and covariance matrix Σ = [[30, 0], [0, 30]]. These data are then used as the observations, and the parameters above are estimated with the EM algorithm (this program does not estimate the covariance matrix). The source code is as follows:
import math
import copy
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Generate random 2-D data according to the 4 Gaussian models
def generate_data(sigma, N, mu1, mu2, mu3, mu4, alpha):
    global X          # observable data set
    X = np.zeros((N, 2))   # initialize X: N samples of 2-D data
    X = np.matrix(X)
    global mu         # randomly initialize mu1, mu2, mu3, mu4
    mu = np.random.random((4, 2))
    mu = np.matrix(mu)
    global excep      # expectation that sample i belongs to model j
    excep = np.zeros((N, 4))
    global alpha_     # initialize the mixing coefficients
    alpha_ = [0.25, 0.25, 0.25, 0.25]
    for i in range(N):
        r = np.random.random(1)   # draw once so the branches partition [0, 1)
        if r < 0.1:               # thresholds implement alpha = [0.1, 0.2, 0.3, 0.4]
            X[i, :] = np.random.multivariate_normal(mu1, sigma, 1)   # 1st Gaussian
        elif r < 0.3:
            X[i, :] = np.random.multivariate_normal(mu2, sigma, 1)   # 2nd Gaussian
        elif r < 0.6:
            X[i, :] = np.random.multivariate_normal(mu3, sigma, 1)   # 3rd Gaussian
        else:
            X[i, :] = np.random.multivariate_normal(mu4, sigma, 1)   # 4th Gaussian
    print("observable data:\n", X)                    # print the observable samples
    print("initialized mu1, mu2, mu3, mu4:\n", mu)    # print the initialized means

def e_step(sigma, k, N):
    global X
    global mu
    global excep
    global alpha_
    for i in range(N):
        denom = 0
        for j in range(0, k):
            # denominator: weighted sum of the simplified Gaussian kernels
            denom += alpha_[j] * math.exp(-(X[i, :] - mu[j, :]) * sigma.I * np.transpose(X[i, :] - mu[j, :])) / np.sqrt(np.linalg.det(sigma))
        for j in range(0, k):
            # numerator: simplified Gaussian kernel of component j
            numer = math.exp(-(X[i, :] - mu[j, :]) * sigma.I * np.transpose(X[i, :] - mu[j, :])) / np.sqrt(np.linalg.det(sigma))
            excep[i, j] = alpha_[j] * numer / denom   # expectation E[z_ij]
    print("hidden variables:\n", excep)

def m_step(k, N):
    global excep
    global X
    global alpha_
    for j in range(0, k):
        denom = 0   # denominator
        numer = 0   # numerator
        for i in range(N):
            numer += excep[i, j] * X[i, :]
            denom += excep[i, j]
        mu[j, :] = numer / denom   # new mean
        alpha_[j] = denom / N      # new mixing coefficient

if __name__ == '__main__':
    iter_num = 1000           # number of iterations
    N = 500                   # number of samples
    k = 4                     # number of Gaussian models
    probility = np.zeros(N)   # mixture Gaussian density at each sample
    u1 = [5, 35]
    u2 = [30, 40]
    u3 = [20, 20]
    u4 = [45, 15]
    sigma = np.matrix([[30, 0], [0, 30]])   # covariance matrix
    alpha = [0.1, 0.2, 0.3, 0.4]            # mixing coefficients
    generate_data(sigma, N, u1, u2, u3, u4, alpha)   # generate the data
    # iterate
    for i in range(iter_num):
        err = 0         # error of the means
        err_alpha = 0   # error of the mixing coefficients
        old_mu = copy.deepcopy(mu)
        old_alpha = copy.deepcopy(alpha_)
        e_step(sigma, k, N)   # E-step
        m_step(k, N)          # M-step
        print("iteration:", i + 1)
        print("estimated means:", mu)
        print("estimated mixing coefficients:", alpha_)
        for z in range(k):
            # change of the parameters between iterations
            err += (abs(old_mu[z, 0] - mu[z, 0]) + abs(old_mu[z, 1] - mu[z, 1]))
            err_alpha += abs(old_alpha[z] - alpha_[z])
        if (err <= 0.001) and (err_alpha < 0.001):   # precision reached, stop iterating
            print(err, err_alpha)
            break
    # visualize the results
    # plot the generated raw data
    plt.subplot(221)
    plt.scatter(X[:, 0], X[:, 1], c='b', s=25, alpha=0.4, marker='o')   # c: color, s: size, alpha: transparency, marker: shape
    plt.title('random generated data')
    # plot the classified data
    plt.subplot(222)
    plt.title('classified data through EM')
    order = np.zeros(N)
    color = ['b', 'r', 'k', 'y']
    for i in range(N):
        for j in range(k):
            if excep[i, j] == max(excep[i, :]):
                order[i] = j   # index of the Gaussian model X[i, :] belongs to
            # accumulate the mixture Gaussian density
            probility[i] += alpha_[int(order[i])] * math.exp(-(X[i, :] - mu[j, :]) * sigma.I * np.transpose(X[i, :] - mu[j, :])) / (np.sqrt(np.linalg.det(sigma)) * 2 * np.pi)
        plt.scatter(X[i, 0], X[i, 1], c=color[int(order[i])], s=25, alpha=0.4, marker='o')   # scatter plot of the classified data
    # draw the 3-D view
    ax = plt.subplot(223, projection='3d')
    plt.title('3d view')
    for i in range(N):
        ax.scatter(X[i, 0], X[i, 1], probility[i], c=color[int(order[i])])
    plt.show()
The results are as follows:
The mixing coefficients are estimated to be [0.46878064954123966, 0.087906620835838722, 0.25716577653788636, 0.18614695308503548].

The means are estimated to be:

[[45.20736093 15.47819894]
 [ 3.74835753 34.93029857]
 [19.97541696 20.26373867]
 [29.91276386 39.87999686]]
The upper-left panel shows the generated observation data, the upper-right panel shows the classification result, and the lower panel is a three-dimensional visualization of the Gaussian mixture model.
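As a cross-check (an addition, not part of the original post; it assumes scikit-learn is installed and that X is the observation array generated by the program above), the same parameters could also be estimated with sklearn.mixture.GaussianMixture:

import numpy as np
from sklearn.mixture import GaussianMixture

gmm = GaussianMixture(n_components=4, covariance_type='spherical', random_state=0)
gmm.fit(np.asarray(X))   # X: the (500, 2) observations from the program above
print("mixing coefficients:", gmm.weights_)
print("means:\n", gmm.means_)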