Posterior Predictive Sampling
The goal of inference is often posterior prediction, that is, evaluating or sampling from the posterior predictive distribution $p(\tilde{y} \mid y)$, where $y$ is observed data and $\tilde{y}$ is data that has yet to be observed.
This chapter explains how to sample from the posterior predictive distribution in Stan, including applications to posterior predictive simulation and calculating event probabilities. These techniques can be coded in Stan using random number generation in the generated quantities block. Further, a technique for fitting and performing inference in two stages is presented in a section on stand-alone generated quantities in Stan.
Posterior predictive distribution
Given a full Bayesian model $p(y, \theta)$, the posterior predictive density for new data $\tilde{y}$ given observed data $y$ is

$$p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \theta) \, p(\theta \mid y) \, \textrm{d}\theta,$$

which averages the sampling distribution $p(\tilde{y} \mid \theta)$ over the posterior $p(\theta \mid y)$.
Computing the posterior predictive distribution
The posterior predictive density (or mass) of a prediction $\tilde{y}$ given observed data $y$ can be computed using Monte Carlo draws $\theta^{(m)} \sim p(\theta \mid y)$ from the posterior as

$$p(\tilde{y} \mid y) \approx \frac{1}{M} \sum_{m=1}^M p(\tilde{y} \mid \theta^{(m)}).$$
Computing directly using this formula will lead to underflow in many situations, but the log posterior predictive density, $\log p(\tilde{y} \mid y)$, may be computed stably using the log-sum-of-exponentials function,

$$\log p(\tilde{y} \mid y) \approx -\log M + \textrm{log-sum-exp}_{m=1}^M \, \log p(\tilde{y} \mid \theta^{(m)}),$$

where log-sum-exp is evaluated by shifting by the maximum value to preserve precision.
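As a concrete illustration, here is a minimal NumPy/SciPy sketch of this computation. The posterior draws are hypothetical stand-ins for output from a fitted model, anticipating the Poisson example below.

import numpy as np
from scipy.special import logsumexp
from scipy.stats import poisson

def log_posterior_predictive(log_p_tilde):
    # -log M + log-sum-exp over the M per-draw log densities
    # log p(y_tilde | theta^(m)); logsumexp handles the underflow.
    M = log_p_tilde.shape[0]
    return -np.log(M) + logsumexp(log_p_tilde, axis=0)

# Stand-in posterior draws of a Poisson rate (hypothetical; in practice
# these come from a fitted model).
rng = np.random.default_rng(1234)
lambda_draws = rng.gamma(shape=10.0, scale=0.5, size=4000)
lp = log_posterior_predictive(poisson.logpmf(7, lambda_draws))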
Sampling from the posterior predictive distribution
Given draws from the posterior $\theta^{(m)} \sim p(\theta \mid y)$, draws from the posterior predictive $\tilde{y}^{(m)} \sim p(\tilde{y} \mid y)$ can be generated by drawing from the sampling distribution with the parameter draw plugged in, $\tilde{y}^{(m)} \sim p(\tilde{y} \mid \theta^{(m)})$.
Randomly drawing $\tilde{y}^{(m)}$ from the sampling distribution, rather than plugging in a point estimate of $\theta$, is critical because posterior predictive quantities carry two forms of uncertainty: estimation uncertainty, because $\theta$ is estimated from only a finite sample $y$, and sampling uncertainty, because even a known $\theta$ yields variation in $\tilde{y}$ through $p(\tilde{y} \mid \theta)$.
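In code, this amounts to one random-number-generator call per posterior draw. A minimal NumPy sketch, again with hypothetical stand-in posterior draws of a Poisson rate:

import numpy as np

rng = np.random.default_rng(1234)
# Stand-in posterior draws of theta (here a Poisson rate).
lambda_draws = rng.gamma(shape=10.0, scale=0.5, size=4000)
# One predictive draw per posterior draw: y_tilde^(m) ~ p(y | theta^(m)).
y_tilde_draws = rng.poisson(lambda_draws)

Plugging a single point estimate of the rate into the last line instead would capture only the sampling uncertainty and understate the predictive spread.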
Posterior predictive simulation in Stan
Posterior predictive quantities can be coded in Stan using the generated quantities block.
Simple Poisson model
For example, consider a simple Poisson model for count data with a rate parameter $\lambda > 0$ given a conjugate gamma prior,

$$\lambda \sim \textrm{gamma}(1, 1),$$

and $N$ observations $y_1, \ldots, y_N$ modeled as Poisson given the rate,

$$y_n \sim \textrm{poisson}(\lambda).$$
Stan code
The following Stan program defines a variable for $\tilde{y}$ and draws it in the generated quantities block.
data {
  int<lower=0> N;
  array[N] int<lower=0> y;
}
parameters {
  real<lower=0> lambda;
}
model {
  lambda ~ gamma(1, 1);
  y ~ poisson(lambda);
}
generated quantities {
  int<lower=0> y_tilde = poisson_rng(lambda);
}
The random draw from the data model for $\tilde{y}$ is coded using Stan's Poisson random number generator, poisson_rng, with the posterior draw of lambda plugged in as the rate. Because the generated quantities block is executed once per posterior draw, this produces one predictive draw $\tilde{y}^{(m)}$ for each $\lambda^{(m)}$.
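For example, the program above could be fit and its predictive draws extracted with CmdStanPy; the file name and data here are hypothetical.

from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file='poisson_pp.stan')  # hypothetical file name
fit = model.sample(data={'N': 5, 'y': [2, 5, 3, 8, 4]}, seed=1234)

# One predictive draw of y_tilde per saved posterior draw of lambda.
y_tilde = fit.stan_variable('y_tilde')
print(y_tilde.mean())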
The posterior draws $\tilde{y}^{(m)}$ may be used to estimate the expected value of $\tilde{y}$, its quantiles or posterior intervals, or event probabilities. More generally, the posterior expectation of any function $f$ of $\tilde{y}$ and $\theta$ may be computed by defining it in the generated quantities block,
generated quantities {
real f_val = f(y_tilde, theta);
// ...
}
where the value of f_val is computed once per posterior draw $(\tilde{y}^{(m)}, \theta^{(m)})$. The posterior mean of f_val will then estimate the expectation $\mathbb{E}[f(\tilde{y}, \theta) \mid y]$.
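Assuming the program defines such an f_val (f itself is a placeholder for a user-defined function), its posterior mean is recovered from the fit in the usual way, e.g. continuing the CmdStanPy sketch above:

# Posterior mean of f_val estimates E[f(y_tilde, theta) | y].
f_val = fit.stan_variable('f_val')
print(f_val.mean())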
Analytic posterior and posterior predictive
The gamma distribution is the conjugate prior distribution for the Poisson distribution, so the posterior density $p(\lambda \mid y)$ also follows a gamma distribution. With the $\textrm{gamma}(1, 1)$ prior, the posterior is

$$\lambda \mid y \sim \textrm{gamma}\!\left(1 + \textstyle\sum_{n=1}^N y_n, \; 1 + N\right)$$

in the shape-rate parameterization.
Because the posterior follows a gamma distribution and the sampling distribution is Poisson, the posterior predictive distribution is available analytically as a gamma-Poisson mixture, which is a negative binomial distribution.
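The conjugacy is easy to check numerically. The following NumPy/SciPy sketch (with hypothetical data) compares Monte Carlo posterior predictive draws against the analytic negative binomial:

import numpy as np
from scipy.stats import nbinom

y = np.array([2, 5, 3, 8, 4])   # hypothetical observed counts
alpha_post = 1 + y.sum()        # posterior shape: 1 + sum(y)
beta_post = 1 + y.size          # posterior rate: 1 + N

# Monte Carlo predictive draws: lambda from the gamma posterior, then Poisson.
rng = np.random.default_rng(1234)
lam = rng.gamma(alpha_post, 1 / beta_post, size=100_000)
y_tilde = rng.poisson(lam)

# Analytic predictive: negative binomial with r = alpha_post and
# success probability p = beta_post / (beta_post + 1).
p = beta_post / (beta_post + 1)
print(y_tilde.mean(), nbinom.mean(alpha_post, p))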
Posterior prediction for regressions
Posterior predictive distributions for regressions
Consider a regression with a single predictor $x_n$ for each training outcome $y_n$, and $\tilde{x}_n$ for each test outcome $\tilde{y}_n$. The posterior predictive distribution for a general regression conditions on the predictors,

$$p(\tilde{y} \mid \tilde{x}, y, x) = \int p(\tilde{y} \mid \tilde{x}, \theta) \, p(\theta \mid y, x) \, \textrm{d}\theta \approx \frac{1}{M} \sum_{m=1}^M p(\tilde{y} \mid \tilde{x}, \theta^{(m)}),$$

where $\theta^{(m)} \sim p(\theta \mid x, y)$.
Stan program
The following program defines a Poisson regression with a single predictor. The predictors for both training and test items are coded as data, as are their sizes. Only the observed outcomes $y$ are coded as data; the test outcomes $\tilde{y}$ are generated as posterior predictions in the generated quantities block.
data {
  int<lower=0> N;
  vector[N] x;
  array[N] int<lower=0> y;
  int<lower=0> N_tilde;
  vector[N_tilde] x_tilde;
}
parameters {
  real alpha;
  real beta;
}
model {
  y ~ poisson_log(alpha + beta * x);
  { alpha, beta } ~ normal(0, 1);
}
generated quantities {
  array[N_tilde] int<lower=0> y_tilde
    = poisson_log_rng(alpha + beta * x_tilde);
}
The Poisson distributions in both the model and generated quantities blocks are coded using the log rate as a parameter (that’s poisson_log vs. poisson, with the suffixes defining the scale of the parameter). The regression coefficients, an intercept alpha and slope beta, are given standard normal priors.
In the model block, the log rate for the Poisson is a linear function of the training predictors x, whereas in the generated quantities block it is a linear function of the test predictors x_tilde.
The result from running Stan is a predictive sample $\tilde{y}^{(1)}, \ldots, \tilde{y}^{(M)}$, where each $\tilde{y}^{(m)} \sim p(\tilde{y} \mid \tilde{x}, x, y)$.
The mean of the posterior predictive distribution is the expected value of the prediction: the posterior mean of y_tilde[n] after running Stan estimates $\mathbb{E}[\tilde{y}_n \mid \tilde{x}, x, y]$, the expected outcome conditioned on the training data and predictors. This is the Bayesian point estimate of $\tilde{y}_n$ with minimum expected squared error. All of this can be carried out by running Stan only a single time: the predictive draws are produced alongside the posterior draws of the parameters.
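A sketch of this single run with CmdStanPy (hypothetical file name and data), extracting the per-item posterior predictive means:

from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file='poisson_regr_pp.stan')  # hypothetical
data = {'N': 4, 'x': [-1.2, 0.3, 0.8, 2.1], 'y': [0, 2, 3, 10],
        'N_tilde': 2, 'x_tilde': [-0.5, 1.5]}
fit = model.sample(data=data, seed=1234)

y_tilde = fit.stan_variable('y_tilde')   # shape: (draws, N_tilde)
print(y_tilde.mean(axis=0))              # Bayes estimate of each y_tilde[n]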
Estimating event probabilities
Event probabilities involving either parameters or predictions or both may be coded in the generated quantities block. For example, to evaluate $\Pr[\lambda > 5 \mid y]$ in the simple Poisson example, it suffices to define an indicator variable:
generated quantities {
int<lower=0, upper=1> lambda_gt_5 = lambda > 5;
// ...
}
The value of the expression lambda > 5 is 1 if the condition is true and 0 otherwise. The posterior mean of lambda_gt_5 is then the event probability $\Pr[\lambda > 5 \mid y]$.
In general, event probabilities may be expressed as expectations of indicator functions. For example,

$$\Pr[\lambda > 5 \mid y] = \mathbb{E}[\textrm{I}(\lambda > 5) \mid y] = \int \textrm{I}(\lambda > 5) \, p(\lambda \mid y) \, \textrm{d}\lambda \approx \frac{1}{M} \sum_{m=1}^M \textrm{I}(\lambda^{(m)} > 5).$$
Event probabilities involving posterior predictive quantities
Event probabilities involving posterior predictive quantities $\tilde{y}$ work the same way: define an indicator of the event in terms of y_tilde in the generated quantities block and estimate the probability by its posterior mean.
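Equivalently, both kinds of event probability can be computed after the fact from saved draws, with no change to the Stan program. A sketch continuing the CmdStanPy fit of the simple Poisson model above, where the threshold 10 is arbitrary:

# Pr[lambda > 5 | y]: proportion of posterior draws exceeding 5.
lambda_draws = fit.stan_variable('lambda')
print((lambda_draws > 5).mean())

# Pr[y_tilde > 10 | y]: the same recipe applied to predictive draws.
y_tilde = fit.stan_variable('y_tilde')
print((y_tilde > 10).mean())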
Stand-alone generated quantities and ongoing prediction
Stan’s sampling algorithms take a Stan program representing a posterior $p(\theta \mid y, x)$ along with actual data $x$ and $y$, and produce a set of draws $\theta^{(1)}, \ldots, \theta^{(M)}$ from that posterior.
By saving the posterior draws, predictions for new data items $\tilde{x}$ may be generated whenever needed, without refitting the model. In Stan’s interfaces, this is done by writing a second Stan program that reuses the first program’s parameters. For example, consider the following program for a simple linear regression.
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(alpha + beta * x, sigma);
  alpha ~ normal(0, 1);
  beta ~ normal(0, 0.5);
  sigma ~ lognormal(0, 5);
}
A second program can be used to generate new observations. This follow-on program need only declare the parameters as they were originally defined; this may require defining constants in the data block, such as sizes and hyperparameters, that are involved in parameter size or constraint declarations. Additional data is then read in, corresponding to predictors for new outcomes that have yet to be observed. There is no need to repeat the model block or any transformed parameters or generated quantities that are not needed. The complete follow-on program for prediction just declares the predictors in the data block, the original parameters, and the predictions in the generated quantities block.
data {
  int<lower=0> N_tilde;
  vector[N_tilde] x_tilde;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
generated quantities {
  vector[N_tilde] y_tilde
    = to_vector(normal_rng(alpha + beta * x_tilde, sigma));
}
When running stand-alone generated quantities, the inputs required are the original posterior draws for the parameters and any predictors corresponding to new predictions, and the output will be posterior predictive draws of y_tilde, one per saved posterior draw.
Any posterior predictive quantities desired may be generated this way. For example, event probabilities are estimated in the usual way by defining indicator variables in the generated quantities block.
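In CmdStanPy, for example, the two-stage workflow uses generate_quantities; the file names and data below are hypothetical, and in older releases the previous_fit argument was called mcmc_sample.

from cmdstanpy import CmdStanModel

# Stage 1: fit the full model once and keep the posterior draws.
fit = CmdStanModel(stan_file='linear_regr.stan').sample(
    data={'N': 3, 'x': [1.0, 2.0, 3.0], 'y': [1.1, 1.9, 3.2]}, seed=1234)

# Stage 2: whenever new predictors arrive, rerun only the generated
# quantities program against the saved draws, without refitting.
gq_model = CmdStanModel(stan_file='linear_regr_gq.stan')
gq = gq_model.generate_quantities(
    data={'N_tilde': 2, 'x_tilde': [4.0, 5.0]}, previous_fit=fit)
y_tilde = gq.stan_variable('y_tilde')    # shape: (draws, N_tilde)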