
26.7 Joint model representation

Following Gelman, Meng, and Stern (1996), prior, posterior, and mixed replications may all be defined as posteriors from joint models over parameters and observed and replicated data.

26.7.1 Posterior predictive model

For example, posterior predictive replication may be formulated using sampling notation as follows. \[\begin{eqnarray*} \theta & \sim & p(\theta) \\[2pt] y & \sim & p(y \mid \theta) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \theta) \end{eqnarray*}\]

The heavily overloaded sampling notation is meant to indicate that both \(y\) and \(y^{\textrm{rep}}\) are drawn from the same distribution, or more formally using capital letters to distinguish random variables, that the conditional densities \(p_{Y^{\textrm{rep}} \mid \Theta}\) and \(p_{Y \mid \Theta}\) are the same.

The joint density is \[ p(\theta, y, y^{\textrm{rep}}) = p(\theta) \cdot p(y \mid \theta) \cdot p(y^{\textrm{rep}} \mid \theta). \] Again, this assumes that the distributions for \(y\) and \(y^{\textrm{rep}}\) are identical.

The variable \(y\) is observed, with the predictive simulation \(y^{\textrm{rep}}\) and parameter vector \(\theta\) not observed. The posterior is \(p(y^{\textrm{rep}}, \theta \mid y)\). Given draws from the posterior, the posterior predictive simulations \(y^{\textrm{rep}}\) are retained.
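As a concrete sketch, the following Stan program implements posterior predictive replication for a simple normal model with unknown location and scale; the particular likelihood and priors are illustrative assumptions, not part of the general formulation. The replications \(y^{\textrm{rep}}\) are drawn in the generated quantities block from the same sampling distribution as \(y\), so each posterior draw of \(\theta = (\mu, \sigma)\) yields one simulated data set.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;                       // theta = (mu, sigma)
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 5);             // prior p(theta) (illustrative choice)
  sigma ~ lognormal(0, 1);
  y ~ normal(mu, sigma);         // sampling distribution p(y | theta)
}
generated quantities {
  vector[N] y_rep;               // y_rep drawn from p(y | theta), same as y
  for (n in 1:N) {
    y_rep[n] = normal_rng(mu, sigma);
  }
}
```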

26.7.2 Prior predictive model

The prior predictive model simply drops the data component of the posterior predictive model.
\[\begin{eqnarray*} \theta & \sim & p(\theta) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \theta) \end{eqnarray*}\]

This corresponds to the joint density \[ p(\theta, y^{\textrm{rep}}) = p(\theta) \cdot p(y^{\textrm{rep}} \mid \theta). \]

It is typically straightforward to draw \(\theta\) from the prior and then \(y^{\textrm{rep}}\) from the sampling distribution given \(\theta\). In cases where it is not, the model may be coded and executed just like the posterior predictive model, only with no data.
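For instance, under the same illustrative normal model as above, the prior predictive distribution can be simulated with a Stan program that takes no data for \(y\); everything is drawn with random-number generators in generated quantities, so the program can be run with Stan's fixed-parameter sampler (algorithm=fixed_param).

```stan
data {
  int<lower=0> N;                        // number of replicated observations
}
generated quantities {
  real mu = normal_rng(0, 5);            // theta drawn from the prior p(theta)
  real sigma = lognormal_rng(0, 1);
  vector[N] y_rep;                       // y_rep drawn from p(y | theta)
  for (n in 1:N) {
    y_rep[n] = normal_rng(mu, sigma);
  }
}
```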

26.7.3 Mixed replication for hierarchical models

The mixed replication corresponds to the model \[\begin{eqnarray*} \phi & \sim & p(\phi) \\[2pt] \alpha & \sim & p(\alpha \mid \phi) \\[2pt] y & \sim & p(y \mid \alpha) \\[2pt] \alpha^{\textrm{rep}} & \sim & p(\alpha \mid \phi) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \alpha^{\textrm{rep}}) \end{eqnarray*}\]

The notation here is meant to indicate that \(\alpha\) and \(\alpha^{\textrm{rep}}\) have identical distributions, as do \(y\) and \(y^{\textrm{rep}}\).

This corresponds to a joint model \[ p(\phi, \alpha, \alpha^{\textrm{rep}}, y, y^{\textrm{rep}}) = p(\phi) \cdot p(\alpha \mid \phi) \cdot p(y \mid \alpha) \cdot p(\alpha^{\textrm{rep}} \mid \phi) \cdot p(y^{\textrm{rep}} \mid \alpha^{\textrm{rep}}), \] where \(y\) is the only observed variable, \(\alpha\) contains the lower-level parameters, and \(\phi\) the hyperparameters. Note that \(\phi\) is not replicated and instead appears in the distributions for both \(\alpha\) and \(\alpha^{\textrm{rep}}\).

The posterior is \(p(\phi, \alpha, \alpha^{\textrm{rep}}, y^{\textrm{rep}} \mid y)\). From posterior draws, the posterior predictive simulations \(y^{\textrm{rep}}\) are kept.
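As an illustrative sketch, the following Stan program codes the mixed replication for a simple normal hierarchical model with one observation per group and a known observation scale; the specific likelihood, priors, and known \(\sigma\) are conveniences assumed here, not requirements. The hyperparameters \(\phi = (\mu, \tau)\) are shared between the observed and replicated branches, while new group-level parameters \(\alpha^{\textrm{rep}}\) and new data \(y^{\textrm{rep}}\) are drawn in generated quantities.

```stan
data {
  int<lower=0> J;
  vector[J] y;
  real<lower=0> sigma;     // known observation scale (assumed for simplicity)
}
parameters {
  real mu;                 // hyperparameters phi = (mu, tau)
  real<lower=0> tau;
  vector[J] alpha;         // group-level parameters alpha
}
model {
  mu ~ normal(0, 5);               // hyperprior p(phi) (illustrative choice)
  tau ~ lognormal(0, 1);
  alpha ~ normal(mu, tau);         // p(alpha | phi)
  y ~ normal(alpha, sigma);        // p(y | alpha)
}
generated quantities {
  vector[J] alpha_rep;             // alpha_rep drawn from p(alpha | phi)
  vector[J] y_rep;                 // y_rep drawn from p(y | alpha_rep)
  for (j in 1:J) {
    alpha_rep[j] = normal_rng(mu, tau);
    y_rep[j] = normal_rng(alpha_rep[j], sigma);
  }
}
```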

References

Gelman, Andrew, Xiao-Li Meng, and Hal Stern. 1996. “Posterior Predictive Assessment of Model Fitness via Realized Discrepancies.” Statistica Sinica, 733–60.