Run Generated Quantities

The generated quantities block computes quantities of interest (QOIs) based on the data, transformed data, parameters, and transformed parameters. It can be used to:

  • generate simulated data for model testing by forward sampling

  • generate predictions for new data

  • calculate posterior event probabilities, including multiple comparisons, sign tests, etc.

  • calculate posterior expectations

  • transform parameters for reporting

  • apply full Bayesian decision theory

  • calculate log likelihoods, deviances, etc. for model comparison
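Several of these uses reduce to averaging a function of the draws. As a minimal sketch, assuming the draws of a parameter are already available as a plain Python list (the values below are made up for illustration and are not output from a real fit), a posterior expectation is the mean of the draws and a posterior event probability is the fraction of draws in which the event holds:

```python
# Hypothetical posterior draws of theta; illustrative values, not real sampler output.
theta_draws = [0.21, 0.35, 0.18, 0.52, 0.27, 0.40, 0.31, 0.60]

# Posterior expectation: average the quantity over the draws.
theta_mean = sum(theta_draws) / len(theta_draws)

# Posterior event probability: average the event indicator over the draws.
prob_theta_gt_half = sum(t > 0.5 for t in theta_draws) / len(theta_draws)

print(theta_mean, prob_theta_gt_half)
```

In a Stan program the per-draw indicator would typically be computed in the generated quantities block; the averaging happens afterwards, over the draws.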

The CmdStanModel class generate_quantities method is useful once you have successfully fit a model to your data and have a valid sample from the posterior. It also requires a version of the original model whose generated quantities block contains the statements needed to compute the additional quantities of interest.

When you run the generate_quantities method on the new model with a sample generated by the existing model, CmdStan uses the per-draw parameter values from that sample to evaluate the generated quantities block of the new model.
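Conceptually, the generated quantities block is evaluated once for every draw in the sample. The following is a rough pure-Python analogue of that idea, not a description of how CmdStan implements it: the draws of theta are made-up values, simulate_per_draw is a hypothetical helper, and Python's random module stands in for Stan's bernoulli_rng.

```python
import random

def simulate_per_draw(theta_draws, n_obs, seed=1234):
    """For each posterior draw of theta, simulate n_obs Bernoulli outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    y_sim = []
    for theta in theta_draws:
        # One evaluation of the "generated quantities" per draw.
        y_sim.append([1 if rng.random() < theta else 0 for _ in range(n_obs)])
    return y_sim

theta_draws = [0.2, 0.3, 0.25]  # hypothetical draws, for illustration only
y_sim = simulate_per_draw(theta_draws, n_obs=10)
print(len(y_sim), len(y_sim[0]))
```

The result has one row per draw and one column per simulated observation, which mirrors the shape of the output described below.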

The generate_quantities method returns a CmdStanGQ object, which provides the following properties and method for retrieving information about the sample:

  • chains

  • column_names

  • generated_quantities

  • generated_quantities_pd

  • sample_plus_quantities

  • save_csvfiles()

The sample_plus_quantities property combines the existing sample and the new quantities of interest into a pandas DataFrame object, which can be used for downstream analysis and visualization. In this way you add more columns of information to an existing sample.
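The effect is a column-wise join in which row i of the sample is paired with row i of the new quantities. A minimal sketch of that idea, using plain dictionaries rather than pandas; the column names and values are illustrative assumptions only:

```python
# One dict per draw; hypothetical values standing in for an existing sample.
sample_rows = [{"lp__": -7.1, "theta": 0.21}, {"lp__": -6.8, "theta": 0.35}]
# Quantities computed from the same draws, in the same order ("qoi" is a made-up name).
gq_rows = [{"qoi": 0.0}, {"qoi": 1.0}]

# Pair the rows draw by draw and merge their columns.
combined = [{**s, **g} for s, g in zip(sample_rows, gq_rows)]
print(list(combined[0].keys()))
```

The join is only meaningful because both inputs are ordered by draw, which is why the new quantities must be computed from the same sample.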


The generate_quantities method takes the following arguments:

  • mcmc_sample: Either a CmdStanMCMC object or a list of Stan CSV files.

  • data: Values for all data variables in the model, specified either as a dictionary with entries matching the data variables, or as the path of a data file in JSON or Rdump format.

  • seed: The seed for the random number generator.

  • gq_output_dir: A path or file name which will be used as the basename for the CmdStan output files.
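A hedged sketch of passing these arguments together. It assumes model is a CmdStanModel whose program has a generated quantities block and fit is a CmdStanMCMC object; the helper name run_gq, the seed value, and the output directory name are hypothetical choices for illustration, and the argument names are the ones listed above.

```python
def run_gq(model, fit, data, output_dir="gq_output", seed=12345):
    """Call generate_quantities with the arguments described above.

    `model` is assumed to be a CmdStanModel and `fit` a CmdStanMCMC object;
    `run_gq` itself is a hypothetical helper, not part of CmdStanPy.
    """
    return model.generate_quantities(
        data=data,                 # dict or path to a JSON/Rdump data file
        mcmc_sample=fit,           # existing sample supplying per-draw parameters
        seed=seed,                 # seed for the random number generator
        gq_output_dir=output_dir,  # basename/location for the CmdStan output files
    )
```

Fixing the seed makes repeated runs comparable when the generated quantities block calls random number generators.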

Example: add posterior predictive checks to bernoulli.stan

In this example we use the CmdStan example model bernoulli.stan and its data file as our existing model and data. We create the program bernoulli_ppc.stan by adding a generated quantities block to bernoulli.stan, which generates a new data vector y_sim using the current estimate of theta.

generated quantities {
  int y_sim[N];
  real<lower=0,upper=1> theta_rep;
  for (n in 1:N)
    y_sim[n] = bernoulli_rng(theta);
  // multiply by 1.0 so the division is real-valued rather than integer division
  theta_rep = sum(y) * 1.0 / N;
}

The first step is to fit model bernoulli to the data:

import os
from cmdstanpy import CmdStanModel, cmdstan_path

bernoulli_dir = os.path.join(cmdstan_path(), 'examples', 'bernoulli')
bernoulli_path = os.path.join(bernoulli_dir, 'bernoulli.stan')
bernoulli_data = os.path.join(bernoulli_dir, '')

# instantiate, compile bernoulli model
bernoulli_model = CmdStanModel(stan_file=bernoulli_path)

# fit the model to the data
bern_fit = bernoulli_model.sample(data=bernoulli_data)

Then we compile the model bernoulli_ppc and use the fit parameter estimates to generate quantities of interest:

bernoulli_ppc_model = CmdStanModel(stan_file='bernoulli_ppc.stan')
new_quantities = bernoulli_ppc_model.generate_quantities(data=bernoulli_data, mcmc_sample=bern_fit)

The generate_quantities method returns a CmdStanGQ object which contains the values for all variables in the generated quantities block of the program bernoulli_ppc.stan. Unlike the output from the sample method, it doesn't contain any information on the joint log probability density, the sampler state, or the values of parameters or transformed parameters.

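for i in range(len(new_quantities.column_names)):
    print(new_quantities.column_names[i], new_quantities.generated_quantities[:, i].mean())  # name and mean over draws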

The sample_plus_quantities property returns a pandas DataFrame which combines the draws from the input sample with the generated quantities.

sample_plus = new_quantities.sample_plus_quantities