Hierarchical Bayesian model
For a sequence of random variables $y_{1},\dots,y_{n}$, if for every permutation $\pi$ the joint density satisfies $p(y_{1},\dots,y_{n})=p(y_{\pi_{1}},\dots,y_{\pi_{n}})$, the variables are called exchangeable. When we lack information to distinguish these random variables, exchangeability is a reasonable property to require of $p(y_{1},\dots,y_{n})$. In that case, each random variable can be regarded as an independent draw from a population whose properties are described by a fixed but unknown parameter $\phi$, i.e.:
$$
\phi \sim p(\phi)
$$
$$
\{y_{1},\dots,y_{n}\mid\phi\} \sim^{i.i.d.} p(y\mid\phi)
$$
Now consider hierarchical data $\{y_{1},\dots,y_{m}\}$, where group $j$ consists of $y_{j}=\{y_{1,j},\dots,y_{n_{j},j}\}$. Then
$$
\{y_{1,j},\dots,y_{n_{j},j}\mid\phi_{j}\} \sim^{i.i.d.} p(y\mid\phi_{j})
$$
But how should we model the group parameters $\phi_{1},\dots,\phi_{m}$? If the groups themselves are drawn from a larger population, then these parameters are also exchangeable, so
$$
\{\phi_{1},\dots,\phi_{m}\mid\psi\} \sim^{i.i.d.} p(\phi\mid\psi)
$$
In summary, we obtain three probability distributions:
- Within-group sampling: $\{y_{1,j},\dots,y_{n_{j},j}\mid\phi_{j}\} \sim^{i.i.d.} p(y\mid\phi_{j})$
- Between-group sampling: $\{\phi_{1},\dots,\phi_{m}\mid\psi\} \sim^{i.i.d.} p(\phi\mid\psi)$
- Prior distribution: $\psi \sim p(\psi)$
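To make the three-level structure concrete, here is a minimal simulation sketch in Python/NumPy. The normal sampling models and all numeric values are illustrative assumptions, not part of the source:

```python
import numpy as np

rng = np.random.default_rng(0)

# Level 3 (prior): fix psi = (mu, tau^2); values chosen arbitrarily.
mu, tau2 = 50.0, 25.0

# Level 2 (between-group sampling): draw m group parameters phi_j i.i.d.
m = 8
phi = rng.normal(mu, np.sqrt(tau2), size=m)

# Level 1 (within-group sampling): draw n_j observations in each group.
sigma2 = 16.0
n = rng.integers(5, 20, size=m)   # unequal group sizes
y = [rng.normal(phi[j], np.sqrt(sigma2), size=n[j]) for j in range(m)]
```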
Hierarchical normal model
In what follows, the hierarchical normal model is used to describe heterogeneity in the means of several groups; both the within-group and between-group sampling distributions are normal:
- Within-group model: $\phi_{j}=\{\theta_{j},\sigma^2\}$, $\;p(y\mid\phi_{j})=\text{normal}(\theta_{j},\sigma^2)$
- Between-group model: $\psi=\{\mu,\tau^2\}$, $\;p(\theta_{j}\mid\psi)=\text{normal}(\mu,\tau^2)$
The fixed but unknown parameters of the model are $\sigma^2$, $\mu$, and $\tau^2$. For convenience, we give them the standard semiconjugate normal and inverse-gamma priors:
- $1/\sigma^2 \sim \text{gamma}(\nu_{0}/2,\ \nu_{0}\sigma_{0}^2/2)$
- $1/\tau^2 \sim \text{gamma}(\eta_{0}/2,\ \eta_{0}\tau_{0}^2/2)$
- $\mu \sim \text{normal}(\mu_{0},\gamma_{0}^2)$
The model structure is as follows:
Posterior inference:
Key results for the univariate normal model:
Result 1: Suppose the sampling model is $\{y_{1},\dots,y_{n}\mid\theta,\sigma^2\} \sim^{i.i.d.} \text{normal}(\theta,\sigma^2)$. If $\theta \sim \text{normal}(\mu_{0},\tau_{0}^2)$ and $1/\sigma^2 \sim \text{gamma}(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)$, then $p(\theta\mid\sigma^2,y_{1},\dots,y_{n}) = \text{normal}(\mu_{n},\tau_{n}^2)$, where $\mu_{n}=\frac{\mu_{0}/\tau_{0}^2+n\bar{y}/\sigma^2}{1/\tau_{0}^2+n/\sigma^2}$ and $\tau_{n}^2=\big(\frac{1}{\tau_{0}^2}+\frac{n}{\sigma^2}\big)^{-1}$.
Result 2: Under the same sampling model and priors, $p(\sigma^2\mid\theta,y_{1},\dots,y_{n}) = \text{inverse-gamma}(\nu_{n}/2,\ \nu_{n}\sigma_{n}^2(\theta)/2)$, where $\nu_{n}=\nu_{0}+n$, $\sigma_{n}^2(\theta)=\frac{1}{\nu_{n}}\big[\nu_{0}\sigma_{0}^2+n s_{n}^2(\theta)\big]$, and $s_{n}^2(\theta)=\sum_{i}(y_{i}-\theta)^2/n$.
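As a quick illustration, here are hypothetical helper functions that transcribe the two results directly; the names and signatures are illustrative choices, not from the source:

```python
import numpy as np

def theta_posterior_params(y, sigma2, mu0, tau02):
    """Result 1: (mu_n, tau_n^2) of p(theta | sigma^2, y)."""
    n, ybar = len(y), np.mean(y)
    tau_n2 = 1.0 / (1.0 / tau02 + n / sigma2)
    mu_n = tau_n2 * (mu0 / tau02 + n * ybar / sigma2)
    return mu_n, tau_n2

def precision_posterior_params(y, theta, nu0, s02):
    """Result 2: gamma(shape, rate) of p(1/sigma^2 | theta, y)."""
    y = np.asarray(y)
    # nu_n * sigma_n^2(theta) = nu0*sigma_0^2 + sum_i (y_i - theta)^2
    shape = (nu0 + len(y)) / 2.0
    rate = (nu0 * s02 + np.sum((y - theta) ** 2)) / 2.0
    return shape, rate
```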
The unknown parameters of the hierarchical model are the group means $(\theta_{1},\dots,\theta_{m})$, the within-group variance $\sigma^2$, and the between-group mean $\mu$ and variance $\tau^2$. Their joint posterior $p(\theta_{1},\dots,\theta_{m},\mu,\tau^2,\sigma^2\mid y_{1},\dots,y_{m})$ can be approximated by constructing a Gibbs sampler, which iterates over the full conditional distribution of each parameter.
$$
\begin{aligned}
&p(\theta_{1},\dots,\theta_{m},\mu,\tau^2,\sigma^2\mid y_{1},\dots,y_{m})\\
&\propto p(\mu,\tau^2,\sigma^2)\times p(\theta_{1},\dots,\theta_{m}\mid\mu,\tau^2,\sigma^2)\times p(y_{1},\dots,y_{m}\mid\theta_{1},\dots,\theta_{m},\mu,\tau^2,\sigma^2)\\
&= p(\mu)\,p(\tau^2)\,p(\sigma^2)\big\{\prod_{j=1}^m p(\theta_{j}\mid\mu,\tau^2)\big\}\big\{\prod_{j=1}^m \prod_{i=1}^{n_{j}} p(y_{i,j}\mid\theta_{j},\sigma^2)\big\}
\end{aligned}
$$
From the dependence structure among the random variables, we can read off the full conditional distribution of each parameter:
$$
p(\mu\mid\theta_{1},\dots,\theta_{m},\tau^2,\sigma^2,y_{1},\dots,y_{m}) \propto p(\mu)\prod_{j=1}^m p(\theta_{j}\mid\mu,\tau^2)
$$
$$
p(\tau^2\mid\theta_{1},\dots,\theta_{m},\mu,\sigma^2,y_{1},\dots,y_{m}) \propto p(\tau^2)\prod_{j=1}^m p(\theta_{j}\mid\mu,\tau^2)
$$
$$
p(\theta_{j}\mid\mu,\tau^2,\sigma^2,y_{1},\dots,y_{m}) \propto p(\theta_{j}\mid\mu,\tau^2)\prod_{i=1}^{n_{j}} p(y_{i,j}\mid\theta_{j},\sigma^2)
$$
$$
\begin{aligned}
p(\sigma^2\mid\theta_{1},\dots,\theta_{m},y_{1},\dots,y_{m}) &\propto p(\sigma^2)\prod_{j=1}^m \prod_{i=1}^{n_{j}} p(y_{i,j}\mid\theta_{j},\sigma^2)\\
&\propto (\sigma^2)^{-(\nu_{0}/2+1)}e^{-\frac{\nu_{0}\sigma_{0}^2}{2\sigma^2}}\,(\sigma^2)^{-\sum_{j} n_{j}/2}\,e^{-\frac{\sum_{j}\sum_{i}(y_{i,j}-\theta_{j})^2}{2\sigma^2}}
\end{aligned}
$$
Applying the two results above, we obtain:
$$
\{\mu\mid\theta_{1},\dots,\theta_{m},\tau^2\} \sim \text{normal}\Big(\frac{m\bar{\theta}/\tau^2+\mu_{0}/\gamma_{0}^2}{m/\tau^2+1/\gamma_{0}^2},\ \big[m/\tau^2+1/\gamma_{0}^2\big]^{-1}\Big)
$$
$$
\{1/\tau^2\mid\theta_{1},\dots,\theta_{m},\mu\} \sim \text{gamma}\Big(\frac{\eta_{0}+m}{2},\ \frac{\eta_{0}\tau_{0}^2+\sum_{j}(\theta_{j}-\mu)^2}{2}\Big)
$$
$$
\{\theta_{j}\mid y_{1,j},\dots,y_{n_{j},j},\mu,\tau^2,\sigma^2\} \sim \text{normal}\Big(\frac{n_{j}\bar{y}_{j}/\sigma^2+\mu/\tau^2}{n_{j}/\sigma^2+1/\tau^2},\ \big[n_{j}/\sigma^2+1/\tau^2\big]^{-1}\Big)
$$
$$
\{1/\sigma^2\mid\theta_{1},\dots,\theta_{m},y_{1},\dots,y_{m}\} \sim \text{gamma}\Big(\frac{1}{2}\big[\nu_{0}+\sum_{j=1}^m n_{j}\big],\ \frac{1}{2}\big[\nu_{0}\sigma_{0}^2+\sum_{j=1}^m \sum_{i=1}^{n_{j}}(y_{i,j}-\theta_{j})^2\big]\Big)
$$
Computation procedure:
1. Set the prior distribution parameters:
$(\nu_{0},\sigma_{0}^2) \rightarrow p(\sigma^2)$
$(\eta_{0},\tau_{0}^2) \rightarrow p(\tau^2)$
$(\mu_{0},\gamma_{0}^2) \rightarrow p(\mu)$
2. Update each unknown parameter by sampling from its full conditional distribution: given the current state $\{\theta_{1}^{(s)},\dots,\theta_{m}^{(s)},\mu^{(s)},\tau^{2(s)},\sigma^{2(s)}\}$, generate the new state as follows:
$sample:\;\mu^{(s+1)} \sim p(\mu\mid\theta_{1}^{(s)},\dots,\theta_{m}^{(s)},\tau^{2(s)})$
$sample:\;\tau^{2(s+1)} \sim p(\tau^2\mid\theta_{1}^{(s)},\dots,\theta_{m}^{(s)},\mu^{(s+1)})$
$sample:\;\sigma^{2(s+1)} \sim p(\sigma^2\mid\theta_{1}^{(s)},\dots,\theta_{m}^{(s)},y_{1},\dots,y_{m})$
$for\;each\;j\in\{1,\dots,m\},\;sample\;\theta_{j}^{(s+1)} \sim p(\theta_{j}\mid\mu^{(s+1)},\tau^{2(s+1)},\sigma^{2(s+1)},y_{j})$
Iterating these steps until convergence yields samples from the joint posterior of the model parameters, as in the sketch below.
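Here is a minimal NumPy sketch of the sampler. The function name, hyperparameter defaults, and the initialization at sample statistics are placeholder assumptions; `y` is the list-of-arrays layout from the earlier simulation sketch:

```python
import numpy as np

def gibbs_hierarchical_normal(y, S=5000, mu0=50.0, g02=25.0,
                              eta0=1.0, t02=100.0, nu0=1.0, s02=100.0,
                              seed=1):
    """Gibbs sampler for the hierarchical normal model (shared sigma^2)."""
    rng = np.random.default_rng(seed)
    m = len(y)
    n = np.array([len(yj) for yj in y])
    ybar = np.array([np.mean(yj) for yj in y])

    # initialize the chain at simple sample-based estimates
    theta = ybar.copy()
    sigma2 = np.mean([np.var(yj, ddof=1) for yj in y])
    mu, tau2 = ybar.mean(), ybar.var(ddof=1)

    draws = {"mu": np.empty(S), "tau2": np.empty(S),
             "sigma2": np.empty(S), "theta": np.empty((S, m))}

    for s in range(S):
        # mu | theta, tau^2  (normal full conditional)
        v = 1.0 / (m / tau2 + 1.0 / g02)
        mu = rng.normal(v * (m * theta.mean() / tau2 + mu0 / g02), np.sqrt(v))

        # 1/tau^2 | theta, mu  (gamma full conditional; scale = 1/rate)
        b = (eta0 * t02 + np.sum((theta - mu) ** 2)) / 2.0
        tau2 = 1.0 / rng.gamma((eta0 + m) / 2.0, 1.0 / b)

        # 1/sigma^2 | theta, y  (gamma full conditional)
        ss = sum(np.sum((yj - tj) ** 2) for yj, tj in zip(y, theta))
        b = (nu0 * s02 + ss) / 2.0
        sigma2 = 1.0 / rng.gamma((nu0 + n.sum()) / 2.0, 1.0 / b)

        # theta_j | mu, tau^2, sigma^2, y_j  (normal, vectorized over j)
        v = 1.0 / (n / sigma2 + 1.0 / tau2)
        theta = rng.normal(v * (n * ybar / sigma2 + mu / tau2), np.sqrt(v))

        draws["mu"][s], draws["tau2"][s] = mu, tau2
        draws["sigma2"][s], draws["theta"][s] = sigma2, theta

    return draws
```

With the simulated data from the first sketch, `draws = gibbs_hierarchical_normal(y, S=2000)` gives Monte Carlo approximations such as `draws["mu"].mean()` for the posterior mean of $\mu$.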
Going further, suppose that not only the means but also the variances differ across groups, with $\sigma_{j}^2$ denoting the variance of group $j$. The sampling model becomes $\{y_{1,j},\dots,y_{n_{j},j}\} \sim^{i.i.d.} \text{normal}(\theta_{j},\sigma_{j}^2)$, and the full conditional distribution of $\theta_{j}$ is $\{\theta_{j}\mid y_{1,j},\dots,y_{n_{j},j},\mu,\tau^2,\sigma_{j}^2\} \sim \text{normal}\Big(\frac{n_{j}\bar{y}_{j}/\sigma_{j}^2+\mu/\tau^2}{n_{j}/\sigma_{j}^2+1/\tau^2},\ \big[n_{j}/\sigma_{j}^2+1/\tau^2\big]^{-1}\Big)$.
How do we estimate $\sigma_{j}^2$? Start by assuming:
$$
1/\sigma_{1}^2,\dots,1/\sigma_{m}^2 \sim^{i.i.d.} \text{gamma}(\nu_{0}/2,\ \nu_{0}\sigma_{0}^2/2)
$$
The corresponding full conditional distribution is:
$$
\{1/\sigma_{j}^2\mid y_{1,j},\dots,y_{n_{j},j},\theta_{j}\} \sim \text{gamma}\big([\nu_{0}+n_{j}]/2,\ \big[\nu_{0}\sigma_{0}^2+\textstyle\sum_{i}(y_{i,j}-\theta_{j})^2\big]/2\big)
$$
The values of $\sigma_{1}^2,\dots,\sigma_{m}^2$ can therefore also be updated within the Gibbs sampling iterations, as in the sketch below.
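Inside the Gibbs loop this becomes a per-group update, replacing the single shared-variance step. A minimal sketch, reusing the (assumed) naming of the sampler above:

```python
import numpy as np

def update_group_variances(y, theta, nu0, s02, rng):
    """Sample each 1/sigma_j^2 from its gamma full conditional."""
    sigma2 = np.empty(len(y))
    for j, yj in enumerate(y):
        shape = (nu0 + len(yj)) / 2.0
        rate = (nu0 * s02 + np.sum((yj - theta[j]) ** 2)) / 2.0
        sigma2[j] = 1.0 / rng.gamma(shape, 1.0 / rate)
    return sigma2
```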
If $\nu_{0}$ and $\sigma_{0}^2$ are fixed, the $\sigma_{j}^2$ are conditionally independent of one another, meaning that the estimate of $\sigma_{m}^2$ cannot borrow information from $\sigma_{1}^2,\dots,\sigma_{m-1}^2$. But if group $m$ has a small sample size, we would like to use $\sigma_{1}^2,\dots,\sigma_{m-1}^2$ to stabilize the estimate of $\sigma_{m}^2$. What should we do? The answer is to treat $\nu_{0}$ and $\sigma_{0}^2$ themselves as parameters to be estimated. The overall structure of the model is then as follows:
Our unknown parameters are therefore: the within-group sampling parameters $\{(\theta_{1},\sigma_{1}^2),\dots,(\theta_{m},\sigma_{m}^2)\}$, the between-group mean-heterogeneity parameters $\{\mu,\tau^2\}$, and the between-group variance-heterogeneity parameters $\{\nu_{0},\sigma_{0}^2\}$. The full conditionals of $\{\mu,\tau^2\}$ and $\{(\theta_{1},\sigma_{1}^2),\dots,(\theta_{m},\sigma_{m}^2)\}$ were given above, so we now discuss the estimation of $\{\nu_{0},\sigma_{0}^2\}$. Assuming a conjugate-class gamma prior for $\sigma_{0}^2$, $\sigma_{0}^2 \sim \text{gamma}(a,b)$, we have
$$
\{\sigma_{0}^2\mid\sigma_{1}^2,\dots,\sigma_{m}^2,\nu_{0}\} \sim \text{gamma}\Big(a+\frac{m\nu_{0}}{2},\ b+\frac{\nu_{0}}{2}\sum_{j=1}^m \frac{1}{\sigma_{j}^2}\Big)
$$
A simple conjugate prior for $\nu_{0}$ does not exist, but if we restrict $\nu_{0}$ to the integers the problem becomes simple. Suppose $\nu_{0}$ has a geometric prior on $\{1,2,\dots\}$, so that $p(\nu_{0}) \propto e^{-\alpha\nu_{0}}$; then
$$
\begin{aligned}
& p(\nu_{0}\mid\sigma_{0}^2,\sigma_{1}^2,\dots,\sigma_{m}^2)\\
& \propto p(\nu_{0})\times p(\sigma_{1}^2,\dots,\sigma_{m}^2\mid\nu_{0},\sigma_{0}^2)\\
& \propto \Big(\frac{(\nu_{0}\sigma_{0}^2/2)^{\nu_{0}/2}}{\Gamma(\nu_{0}/2)}\Big)^m \Big(\prod_{j=1}^m \frac{1}{\sigma_{j}^2}\Big)^{\nu_{0}/2-1}\times \exp\Big\{-\nu_{0}\Big(\alpha+\frac{1}{2}\sigma_{0}^2\sum_{j} 1/\sigma_{j}^2\Big)\Big\}
\end{aligned}
$$
Since $\nu_{0}$ is restricted to a grid of integers, this unnormalized full conditional can be sampled directly, which completes the Gibbs sampler; a sketch of both updates follows.
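Both updates can be written as one step of the sampler. In this sketch, the function name, the grid bound `nu_max`, and the use of SciPy's `gammaln` for log-scale numerical stability are my assumptions, not part of the source:

```python
import numpy as np
from scipy.special import gammaln

def update_s02_nu0(sigma2, nu0, a, b, alpha, rng, nu_max=100):
    """Sample sigma_0^2 and nu_0 from their full conditionals."""
    prec = 1.0 / np.asarray(sigma2)   # group precisions 1/sigma_j^2
    m = prec.size

    # sigma_0^2 | ... ~ gamma(a + m*nu0/2, rate = b + nu0*sum(prec)/2)
    s02 = rng.gamma(a + m * nu0 / 2.0, 1.0 / (b + nu0 * prec.sum() / 2.0))

    # nu_0 | ...: evaluate the unnormalized log full conditional on the
    # integer grid {1, ..., nu_max}, then sample from normalized weights
    nus = np.arange(1, nu_max + 1)
    logp = (m * (0.5 * nus * np.log(nus * s02 / 2.0) - gammaln(nus / 2.0))
            + (nus / 2.0 - 1.0) * np.log(prec).sum()
            - nus * (alpha + 0.5 * s02 * prec.sum()))
    w = np.exp(logp - logp.max())
    nu0 = int(rng.choice(nus, p=w / w.sum()))
    return s02, nu0
```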
Reference: Hoff, Peter D. A First Course in Bayesian Statistical Methods. Springer Science & Business Media, 2009.