## 26.7 Joint model representation

Following Gelman, Meng, and Stern (1996), prior, posterior, and mixed replications may all be defined as posteriors from joint models over parameters and observed and replicated data.

### 26.7.1 Posterior predictive model

For example, posterior predictive replication may be formulated using sampling notation as follows.

$\begin{eqnarray*} \theta & \sim & p(\theta) \\[2pt] y & \sim & p(y \mid \theta) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \theta) \end{eqnarray*}$

The heavily overloaded sampling notation is meant to indicate that both $$y$$ and $$y^{\textrm{rep}}$$ are drawn from the same distribution or, more formally, using capital letters to distinguish random variables, that the conditional densities $$p_{Y^{\textrm{rep}} \mid \Theta}$$ and $$p_{Y \mid \Theta}$$ are the same.

The joint density is

$p(\theta, y, y^{\textrm{rep}}) = p(\theta) \cdot p(y \mid \theta) \cdot p(y^{\textrm{rep}} \mid \theta).$

This again assumes that the two distributions for $$y$$ and $$y^{\textrm{rep}}$$ are identical.

The variable $$y$$ is observed, with the predictive simulation $$y^{\textrm{rep}}$$ and parameter vector $$\theta$$ not observed. The posterior is $$p(y^{\textrm{rep}}, \theta \mid y)$$. Given draws from the posterior, the posterior predictive simulations $$y^{\textrm{rep}}$$ are retained.
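The replication scheme can be sketched in simulation for a model where the posterior is available in closed form. The beta-binomial model below is purely illustrative (it does not appear in the text): with a conjugate Beta prior, $$\theta$$ can be drawn directly from the posterior and $$y^{\textrm{rep}}$$ from the same sampling distribution that generated $$y$$.

```python
import random

random.seed(0)

# Hypothetical conjugate example: binomial likelihood with a
# Beta(a, b) prior, so p(theta | y) = Beta(a + y, b + n - y)
# is available in closed form.
n, y = 20, 14      # observed data: 14 successes in 20 trials
a, b = 1.0, 1.0    # Beta prior hyperparameters

def posterior_predictive_draw():
    # theta ~ p(theta | y): a draw from the posterior
    theta = random.betavariate(a + y, b + n - y)
    # y_rep ~ p(y | theta): same sampling distribution as y
    return sum(random.random() < theta for _ in range(n))

# retain the posterior predictive simulations y_rep
y_rep = [posterior_predictive_draw() for _ in range(1000)]
```

In a sampler without a closed-form posterior, the posterior draws of $$\theta$$ would come from MCMC instead, but the replication step $$y^{\textrm{rep}} \sim p(y \mid \theta)$$ is unchanged.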

### 26.7.2 Prior predictive model

The prior predictive model simply drops the data component of the posterior predictive model.
$\begin{eqnarray*} \theta & \sim & p(\theta) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \theta) \end{eqnarray*}$

This corresponds to the joint density

$p(\theta, y^{\textrm{rep}}) = p(\theta) \cdot p(y^{\textrm{rep}} \mid \theta).$

It is typically straightforward to draw $$\theta$$ from the prior and then $$y^{\textrm{rep}}$$ from the sampling distribution given $$\theta$$ efficiently. In cases where it is not, the model may be coded and executed just like the posterior predictive model, only with no data.
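Continuing the illustrative beta-binomial example (an assumption, not a model from the text), the prior predictive simulation is the same as the posterior predictive one except that $$\theta$$ is drawn from the prior rather than the posterior, and no data are used.

```python
import random

random.seed(1)

# Prior predictive replication for a hypothetical beta-binomial
# model: no data enter the simulation at all.
n = 20             # number of trials per replication
a, b = 1.0, 1.0    # Beta prior hyperparameters

def prior_predictive_draw():
    # theta ~ p(theta): a draw from the prior
    theta = random.betavariate(a, b)
    # y_rep ~ p(y | theta): the sampling distribution given theta
    return sum(random.random() < theta for _ in range(n))

y_rep = [prior_predictive_draw() for _ in range(1000)]
```

With the uniform Beta(1, 1) prior, these replications spread over the whole range of possible counts, which is the point of a prior predictive check: it shows what data the model considers plausible before seeing any observations.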

### 26.7.3 Mixed replication for hierarchical models

The mixed replication corresponds to the model

$\begin{eqnarray*} \phi & \sim & p(\phi) \\[2pt] \alpha & \sim & p(\alpha \mid \phi) \\[2pt] y & \sim & p(y \mid \alpha) \\[2pt] \alpha^{\textrm{rep}} & \sim & p(\alpha \mid \phi) \\[2pt] y^{\textrm{rep}} & \sim & p(y \mid \alpha^{\textrm{rep}}) \end{eqnarray*}$

The notation here is meant to indicate that $$\alpha$$ and $$\alpha^{\textrm{rep}}$$ have identical distributions, as do $$y$$ and $$y^{\textrm{rep}}$$.

This corresponds to a joint model

$p(\phi, \alpha, \alpha^{\textrm{rep}}, y, y^{\textrm{rep}}) = p(\phi) \cdot p(\alpha \mid \phi) \cdot p(y \mid \alpha) \cdot p(\alpha^{\textrm{rep}} \mid \phi) \cdot p(y^{\textrm{rep}} \mid \alpha^{\textrm{rep}}),$

where $$y$$ is the only observed variable, $$\alpha$$ contains the lower-level parameters, and $$\phi$$ the hyperparameters. Note that $$\phi$$ is not replicated and instead appears in the distribution for both $$\alpha$$ and $$\alpha^{\textrm{rep}}$$.

The posterior is $$p(\phi, \alpha, \alpha^{\textrm{rep}}, y^{\textrm{rep}} \mid y)$$. From posterior draws, the posterior predictive simulations $$y^{\textrm{rep}}$$ are kept.
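The replication step can be sketched for a hypothetical normal hierarchy (an illustration, not a model from the text): $$\phi = (\mu, \tau)$$ are hyperparameters, $$\alpha_j \mid \phi \sim \textrm{normal}(\mu, \tau)$$, and $$y_j \mid \alpha_j \sim \textrm{normal}(\alpha_j, \sigma)$$ with $$\sigma$$ assumed known. Given a single posterior draw of $$\phi$$, new group-level parameters $$\alpha^{\textrm{rep}}$$ are drawn from $$p(\alpha \mid \phi)$$, and then $$y^{\textrm{rep}}$$ from $$p(y \mid \alpha^{\textrm{rep}})$$; $$\phi$$ itself is not replicated.

```python
import random

random.seed(2)

J = 8           # number of groups (assumed for illustration)
sigma = 1.0     # known observation noise (assumed for illustration)

def mixed_replication(mu, tau):
    """One mixed replication given a posterior draw of phi = (mu, tau)."""
    # alpha_rep ~ p(alpha | phi): fresh group-level parameters
    alpha_rep = [random.gauss(mu, tau) for _ in range(J)]
    # y_rep ~ p(y | alpha_rep): data replicated from the new parameters
    y_rep = [random.gauss(a, sigma) for a in alpha_rep]
    return alpha_rep, y_rep

# mu = 3.0, tau = 0.5 stand in for one posterior draw of phi,
# which would come from fitting the hierarchical model to y
alpha_rep, y_rep = mixed_replication(mu=3.0, tau=0.5)
```

Repeating this for each posterior draw of $$\phi$$ yields the mixed replications; because $$\alpha^{\textrm{rep}}$$ is redrawn rather than reused, the check probes the population model $$p(\alpha \mid \phi)$$ as well as the sampling distribution.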

### References

Gelman, Andrew, Xiao-Li Meng, and Hal Stern. 1996. “Posterior Predictive Assessment of Model Fitness via Realized Discrepancies.” Statistica Sinica, 733–60.