This is an old version, view current version.

27.6 Mixed predictive replication for hierarchical models

Andrew Gelman, Meng, and Stern (1996) discuss the case of mixed replication for hierarchical models in which the hyperparameters remain fixed, but varying effects are replicated. This is neither a purely prior nor purely posterior predictive check, but falls somewhere in between.

For example, consider a simple varying intercept logistic regression, with intercepts αk for k1:K. Each data item yn{0,1} is assumed to correspond to group kkn1:K. The sampling distribution is thus ynbernoulli(logit1(αkk[n])). The varying intercepts have a hierarchical normal prior, αknormal(μ,σ). The hyperparameters are themselves given weakly informative priors, μnormal(0,2)σlognormal(0,1).

Like in a posterior predictive check, the hyperparameters μ and σ are drawn from the posterior, μ(m),σ(m)p(μ,σy) Like in a prior predictive check, replicated values of α are drawn from the hyperparameters, αrep(m)knormal(αkμ(m),σ(m)). The data items are then each replicated using the replicated intercepts, yrep(m)nbernoulli(logit1(αrep(m)kk[n])). Thus the yrep(m) can be seen as a kind of posterior predictive replication of observations from new groups that were not among the original K groups.

In Stan, mixed predictive replications yrep(m) can be programmed directly.

data {
  int<lower = 0> K;
  int<lower = 0> N;
  int<lower = 1, upper = K> kk[N];
  int<lower = 0, upper = 1> y[N];
}
parameters {
  real mu;
  real<lower = 0> sigma;
  vector<offset = mu, multiplier = sigma>[K] alpha;
}
model {
  mu ~ normal(0, 2);               // hyperprior
  sigma ~ lognormal(0, 1);
  alpha ~ normal(mu, sigma);       // hierarchical prior
  y ~ bernoulli_logit(alpha[kk]);  // sampling distribution
}
generated quantities {
  // alpha replicated;  mu and sigma not replicated
  real alpha_rep[K]
    = normal_rng(rep_vector(mu, K), sigma);
  int<lower = 0, upper = 1> y_rep[N]
    = bernoulli_logit_rng(alpha_rep[kk]);
}

References

Gelman, Andrew, Xiao-Li Meng, and Hal Stern. 1996. “Posterior Predictive Assessment of Model Fitness via Realized Discrepancies.” Statistica Sinica, 733–60.