24.2 Computing the posterior predictive distribution

This is an old version, view current version.

24.2 Computing the posterior predictive distribution

The posterior predictive density (or mass) of a prediction $\tilde{y}$ given observed data $y$ can be computed using Monte Carlo draws

$\theta^{(m)} \sim p(\theta \mid y)$ from the posterior as $p(\tilde{y} \mid y) \approx \frac{1}{M} \sum_{m = 1}^M p(\tilde{y} \mid \theta^{(m)}).$

Computing directly using this formula will lead to underflow in many situations, but the log posterior predictive density, $\log p(\tilde{y} \mid y)$ may be computed using the stable log sum of exponents function as $\begin{eqnarray*} \log p(\tilde{y} \mid y) & \approx & \log \frac{1}{M} \sum_{m = 1}^M p(\tilde{y} \mid \theta^{(m)}). \\[4pt] & = & - \log M + \textrm{log-sum-exp}_{m = 1}^M \log p(\tilde{y} \mid \theta^{(m)}), \end{eqnarray*}$ where $\textrm{log-sum-exp}_{m = 1}^M v_m = \log \sum_{m = 1}^M \exp v_m$ is used to maintain arithmetic precision. See the section on log sum of exponentials for more details.