27.4 Prior predictive checks

This is an old version, view current version.

Prior predictive checks generate data according to the prior in order to asses whether a prior is appropriate (Gabry et al. 2019). A posterior predictive check generates replicated data according to the posterior predictive distribution. In contrast, the prior predictive check generates data according to the prior predictive distribution, $y^{\textrm{sim}} \sim p(y).$ The prior predictive distribution is just like the posterior predictive distribution with no observed data, so that a prior predictive check is nothing more than the limiting case of a posterior predictive check with no data.

This is easy to carry out mechanically by simulating parameters $\theta^{\textrm{sim}} \sim p(\theta)$ according to the priors, then simulating data $y^{\textrm{sim}} \sim p(y \mid \theta^{\textrm{sim}})$ according to the sampling distribution given the simulated parameters. The result is a simulation from the joint distribution, $(y^{\textrm{sim}}, \theta^{\textrm{sim}}) \sim p(y, \theta)$ and thus $y^{\textrm{sim}} \sim p(y)$ is a simulation from the prior predictive distribution.

27.4.1 Coding prior predictive checks in Stan

A prior predictive check is coded just like a posterior predictive check. If a posterior predictive check has already been coded and it’s possible to set the data to be empty, then no additional coding is necessary. The disadvantage to coding prior predictive checks as posterior predictive checks with no data is that Markov chain Monte Carlo will be used to sample the parameters, which is less efficient than taking independent draws using random number generation.

Prior predictive checks can be coded entirely within the generated quantities block using random number generation. The resulting draws will be independent. Predictors must be read in from the actual data set—they do not have a generative model from which to be simulated. For a Poisson regression, prior predictive sampling can be encoded as the following complete Stan program.

data {
  int<lower = 0> N;
  vector[N] x;
}
generated quantities {
  real alpha = normal_rng(0, 1);
  real beta = normal_rng(0, 1);
  real y_sim[N] = poisson_log_rng(alpha + beta * x);
}

Running this program using Stan’s fixed-parameter sampler yields draws from the prior. These may be plotted to consider their appropriateness.

References

Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 182 (2): 389–402.