
24.3 Sampling from the posterior predictive distribution

Given draws from the posterior, \(\theta^{(m)} \sim p(\theta \mid y)\), draws from the posterior predictive, \(\tilde{y}^{(m)} \sim p(\tilde{y} \mid y)\), can be generated by sampling from the sampling distribution with each parameter draw plugged in, \[ \tilde{y}^{(m)} \sim p(\tilde{y} \mid \theta^{(m)}). \]
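The plug-in recipe can be sketched in a few lines of Python. This is a minimal illustration rather than output from a fitted model: the normal sampling distribution, the known scale `sigma`, the stand-in posterior draws `theta_draws`, and the helper name `posterior_predictive_draws` are all assumptions made for the example.

```python
import random
import statistics


def posterior_predictive_draws(theta_draws, sigma, rng):
    """Return one draw y_tilde^(m) ~ normal(theta^(m), sigma) per posterior draw."""
    return [rng.gauss(theta, sigma) for theta in theta_draws]


rng = random.Random(1234)

# Hypothetical stand-in for MCMC output: posterior draws theta^(m).
# The posterior is assumed to be normal(1.0, 0.5) purely for illustration.
theta_draws = [rng.gauss(1.0, 0.5) for _ in range(10_000)]

# Assumed known standard deviation of the sampling distribution.
sigma = 2.0

# One predictive draw per posterior draw, with theta^(m) plugged in.
y_tilde_draws = posterior_predictive_draws(theta_draws, sigma, rng)

print("predictive mean:    ", statistics.fmean(y_tilde_draws))
print("predictive variance:", statistics.variance(y_tilde_draws))
```

Each \(\tilde{y}^{(m)}\) is paired with its own \(\theta^{(m)}\), so the resulting draws reflect the spread of the posterior as well as the spread of the sampling distribution.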

Randomly drawing \(\tilde{y}\) from the sampling distribution is critical because there are two forms of uncertainty in posterior predictive quantities: sampling uncertainty and estimation uncertainty. Estimation uncertainty arises because \(\theta\) is estimated from only a sample of data \(y\). Sampling uncertainty arises because even a known value of \(\theta\) leads to a sampling distribution \(p(\tilde{y} \mid \theta)\) with variation in \(\tilde{y}\). Both forms of uncertainty show up in the factored form of the posterior predictive distribution, \[ p(\tilde{y} \mid y) = \int \underbrace{p(\tilde{y} \mid \theta)}_{\begin{array}{l} \textrm{sampling} \\[-2pt] \textrm{uncertainty} \end{array}} \cdot \underbrace{p(\theta \mid y)}_{\begin{array}{l} \textrm{estimation} \\[-2pt] \textrm{uncertainty} \end{array}} \, \textrm{d}\theta. \]
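The two contributions can be separated explicitly with the law of total variance. Because the factored form treats \(\tilde{y}\) as conditionally independent of \(y\) given \(\theta\), \[ \textrm{Var}[\tilde{y} \mid y] = \underbrace{\mathbb{E}\big[\textrm{Var}[\tilde{y} \mid \theta] \mid y\big]}_{\textrm{sampling}} + \underbrace{\textrm{Var}\big[\mathbb{E}[\tilde{y} \mid \theta] \mid y\big]}_{\textrm{estimation}}. \] As a worked illustration under assumptions that are not part of the text above, suppose the sampling distribution is normal with mean \(\theta\) and known standard deviation \(\sigma\), and the posterior for \(\theta\) is normal with mean \(\mu_n\) and standard deviation \(\tau_n\). Then \(\textrm{Var}[\tilde{y} \mid \theta] = \sigma^2\) and \(\mathbb{E}[\tilde{y} \mid \theta] = \theta\), so \[ \textrm{Var}[\tilde{y} \mid y] = \sigma^2 + \tau_n^2. \] Plugging in a single point estimate of \(\theta\) instead of the posterior draws would recover only the \(\sigma^2\) term and so would understate the predictive variance by \(\tau_n^2\).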