Program Blocks
A Stan program is organized into a sequence of named blocks, the bodies of which consist of variable declarations followed, in the case of some blocks, by statements.
Overview of Stan’s program blocks
The full set of named program blocks is exemplified in the following skeletal Stan program.
functions {
  // ... function declarations and definitions ...
}
data {
  // ... declarations ...
}
transformed data {
  // ... declarations ... statements ...
}
parameters {
  // ... declarations ...
}
transformed parameters {
  // ... declarations ... statements ...
}
model {
  // ... declarations ... statements ...
}
generated quantities {
  // ... declarations ... statements ...
}
The function-definition block contains user-defined functions. The data block declares the required data for the model. The transformed data block allows the definition of constants and transforms of the data. The parameters block declares the model’s parameters — the unconstrained version of the parameters is what’s sampled or optimized. The transformed parameters block allows variables to be defined in terms of data and parameters that may be used later and will be saved. The model block is where the log probability function is defined. The generated quantities block allows derived quantities based on parameters, data, and optionally (pseudo) random number generation.
Optionality and ordering
All of the blocks are optional. A consequence of this is that the empty string is a valid Stan program, although it will trigger a warning message from the Stan compiler. The Stan program blocks that occur must occur in the order presented in the skeletal program above. Within each block, both declarations and statements are optional, subject to the restriction that the declarations come before the statements.
Variable scope
The variables declared in each block have scope over all subsequent statements. Thus a variable declared in the transformed data block may be used in the model block. But a variable declared in the generated quantities block may not be used in any earlier block, including the model block. The exception to this rule is that variables declared in the model block are always local to the model block and may not be accessed in the generated quantities block; to make a variable accessible in the model and generated quantities block, it must be declared as a transformed parameter.
Variables declared as function parameters have scope only within that function definition’s body, and may not be assigned to (they are constant).
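The scoping rules above can be sketched with a small program. This is an illustrative example only; the variable names (y_mean, delta, prior_scale, delta_sq) and the choice of priors are assumptions made for the sketch, not part of the original text.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
transformed data {
  real y_mean = mean(y);        // in scope in every later block
}
parameters {
  real mu;
  real<lower=0> sigma;
}
transformed parameters {
  real delta = mu - y_mean;     // in scope in model and generated quantities
}
model {
  real prior_scale = 10;        // local to the model block, never saved
  mu ~ normal(y_mean, prior_scale);
  sigma ~ exponential(1);
  y ~ normal(mu, sigma);
}
generated quantities {
  // delta is visible here because it is a transformed parameter;
  // prior_scale is not, because it was declared in the model block.
  real delta_sq = square(delta);
}
```

Referring to prior_scale inside the generated quantities block would be rejected by the compiler, which is the exception to the general scoping rule described above.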
Function scope
Functions defined in the function block may be used in any appropriate block. Most functions can be used in any block and applied to a mixture of parameters and data (including constants or program literals).
Random-number-generating functions are restricted to the transformed data and generated quantities blocks and to the bodies of user-defined functions ending in _rng; such functions are suffixed with _rng. Log-probability modifying functions are restricted to blocks where the log probability accumulator is in scope (transformed parameters and model); such functions are suffixed with _lp.
Density functions defined in the program may be used in distribution statements.
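As a rough sketch of how the suffix conventions play out, the program below defines one function of each kind. The function names (centered_normal_lpdf, add_prior_lp, centered_normal_rng) and the model itself are hypothetical and chosen only for illustration.

```stan
functions {
  // Density function: the _lpdf suffix allows use in distribution statements.
  real centered_normal_lpdf(real y, real sigma) {
    return normal_lpdf(y | 0, sigma);
  }
  // The _lp suffix permits access to the log probability accumulator.
  void add_prior_lp(real sigma) {
    target += exponential_lpdf(sigma | 1);
  }
  // The _rng suffix permits calls to random number generators.
  real centered_normal_rng(real sigma) {
    return normal_rng(0, sigma);
  }
}
data {
  int<lower=0> N;
  array[N] real y;
}
parameters {
  real<lower=0> sigma;
}
model {
  add_prior_lp(sigma);               // allowed: accumulator is in scope
  for (n in 1:N) {
    y[n] ~ centered_normal(sigma);   // user-defined density in a distribution statement
  }
}
generated quantities {
  real y_rep = centered_normal_rng(sigma);   // allowed: RNG in generated quantities
}
```

Calling centered_normal_rng from the model block, or add_prior_lp from generated quantities, would violate the restrictions described above.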
Automatic variable definitions
The variables declared in the data and parameters blocks are treated differently than other variables in that they are automatically defined by the context in which they are used. This is why there are no statements allowed in the data or parameters block.
The variables in the data block are read from an external input source such as a file or a designated R data structure. The variables in the parameters block are read from the sampler’s current parameter values (either standard HMC or NUTS). The initial values may be provided through an external input source, which is also typically a file or a designated R data structure. In each case, the parameters are instantiated to the values for which the model defines a log probability function.
Transformed variables
The transformed data and transformed parameters blocks behave similarly to each other. Both allow new variables to be declared and then defined through a sequence of statements. Because variables scope over every statement that follows them, transformed data variables may be defined in terms of the data variables.
Before generating any draws, data variables are read in, then the transformed data variables are declared and the associated statements executed to define them. This means the statements in the transformed data block are only ever evaluated once.[1]
Transformed parameters work the same way, being defined in terms of the parameters, transformed data, and data variables. The difference is the frequency of evaluation. Parameters are read in and (inverse) transformed to constrained representations on their natural scales once per log probability and gradient evaluation. This means the inverse transforms and their log absolute Jacobian determinants are evaluated once per leapfrog step. Transformed parameters are then declared and their defining statements executed once per leapfrog step.
Generated quantities
The generated quantity variables are defined once per sample after all the leapfrog steps have been completed. These may be random quantities, so the block must be rerun even if the Metropolis adjustment of HMC or NUTS rejects the update proposal.
Variable read, write, and definition summary
A table summarizing the point at which variables are read, written, and defined is given in the block actions table.
Block Actions Table. The read, write, transform, and evaluate actions and periodicities listed in the last column correspond to the Stan program blocks in the first column. The middle column indicates whether the block allows statements. The last row indicates that parameter initialization requires a read and transform operation applied once per chain.
block | statement | action / period |
---|---|---|
data | no | read / chain |
transformed data | yes | evaluate / chain |
parameters | no | inv. transform, Jacobian / leapfrog; inv. transform, write / sample |
transformed parameters | yes | evaluate / leapfrog; write / sample |
model | yes | evaluate / leapfrog |
generated quantities | yes | evaluate / sample; write / sample |
(initialization) | n/a | read, transform / chain |
Variable Declaration Table. This table indicates where variables that are not basic data or parameters should be declared, based on whether they are defined in terms of parameters, whether they are used in the log probability function defined in the model block, and whether they are saved as output. The two rows in which a saved variable does not depend on parameters should generally be avoided, as there is no need to print a variable every iteration that does not depend on the value of any parameters.
param depend | in target | save | declare in |
---|---|---|---|
+ | + | + | transformed parameters |
+ | + | - | model (local) |
+ | - | + | generated quantities |
+ | - | - | generated quantities (local) |
- | + | + | transformed data and generated quantities |
- | + | - | transformed data |
- | - | + | generated quantities |
- | - | - | transformed data (local) |
Another way to look at the variables is in terms of their function. To decide where to declare a variable, consult the variable declaration table. Rows that save a variable with no parameter dependence are rarely useful, as there is no need to print a variable every iteration that does not depend on parameters.[2]
The rest of this chapter provides full details on when and how the variables and statements in each block are executed.
Statistical variable taxonomy
Statistical Variable Taxonomy Table. Variables of the kind indicated in the left column must be declared in one of the blocks listed in the right column.
variable kind | declaration block |
---|---|
constants | data, transformed data |
unmodeled data | data, transformed data |
modeled data | data, transformed data |
missing data | parameters, transformed parameters |
modeled parameters | parameters, transformed parameters |
unmodeled parameters | data, transformed data |
derived quantities | transformed data, transformed parameters, generated quantities |
loop indices | loop statement |
Page 366 of (Gelman and Hill 2007) provides a taxonomy of the kinds of variables used in Bayesian models. The statistical variable taxonomy table contains Gelman and Hill’s taxonomy, extended with a missing-data kind, along with the corresponding locations of declarations and definitions in Stan.
Constants can be built into a model as literals, data variables, or transformed data variables. If specified as data variables, their definition must be included in data files. If they are specified as transformed data variables, they cannot be used to specify the sizes of elements in the data block.
The following program illustrates various variable kinds, listing the kind of each variable next to its declaration.
data {
  int<lower=0> N;           // unmodeled data
  array[N] real y;          // modeled data
  real mu_mu;               // config. unmodeled param
  real<lower=0> sigma_mu;   // config. unmodeled param
}
transformed data {
  real<lower=0> alpha;      // const. unmodeled param
  real<lower=0> beta;       // const. unmodeled param
  alpha = 0.1;
  beta = 0.1;
}
parameters {
  real mu_y;                // modeled param
  real<lower=0> tau_y;      // modeled param
}
transformed parameters {
  real<lower=0> sigma_y;    // derived quantity (param)
  sigma_y = pow(tau_y, -0.5);
}
model {
  tau_y ~ gamma(alpha, beta);
  mu_y ~ normal(mu_mu, sigma_mu);
  for (n in 1:N) {
    y[n] ~ normal(mu_y, sigma_y);
  }
}
generated quantities {
  real variance_y;          // derived quantity (transform)
  variance_y = sigma_y * sigma_y;
}
In this example, y is an array of modeled data. Although it is specified in the data block, and thus must have a known value before the program may be run, it is modeled as if it were generated randomly as described by the model.
The variable N is a typical example of unmodeled data. It is used to indicate a size that is not part of the model itself.
The other variables declared in the data and transformed data block are examples of unmodeled parameters, also known as hyperparameters. Unmodeled parameters are parameters to probability densities that are not themselves modeled probabilistically. In Stan, unmodeled parameters that appear in the data block may be specified on a per-model execution basis as part of the data read. In the above model, mu_mu and sigma_mu are configurable unmodeled parameters.
Unmodeled parameters that are hard coded in the model must be declared in the transformed data block. For example, the unmodeled parameters alpha and beta are both hard coded to the value 0.1. To allow such variables to be configurable based on data supplied to the program at run time, they must be declared in the data block, like the variables mu_mu and sigma_mu.
This program declares two modeled parameters, mu_y and tau_y. These are the location and precision used in the normal model of the values in y. The heart of the model will be sampling the values of these parameters from their posterior distribution.
The modeled parameter tau_y is transformed from a precision to a scale parameter and assigned to the variable sigma_y in the transformed parameters block. Thus the variable sigma_y is considered a derived quantity: its value is entirely determined by the values of other variables.
The generated quantities block defines a value variance_y, which is defined as a transform of the scale or deviation parameter sigma_y. It is defined in the generated quantities block because it is not used in the model. Making it a generated quantity allows it to be monitored for convergence (being a non-linear transform, it will have different autocorrelation and hence convergence properties than the deviation itself).
Because Stan provides random number generators for its distributions, the generated quantities block can also be used to generate replicated data for model checking.
Finally, the variable n is used as a loop index in the model block.
Program block: data
The rest of this chapter will lay out the details of each block in order, starting with the data block in this section.
Variable reads and transformations
The data block is for the declaration of variables that are read in as data. With the current model executable, each Markov chain of draws will be executed in a different process, and each such process will read the data exactly once.[3]
Data variables are not transformed in any way. The format for data files or data in memory depends on the interface; see the user’s guides and interface documentation for PyStan, RStan, and CmdStan for details.
Statements
The data block does not allow statements.
Variable constraint checking
Each variable’s value is validated against its declaration as it is read. For example, if a variable sigma is declared as real<lower=0>, then trying to assign it a negative value will raise an error. As a result, data type errors will be caught as early as possible. Similarly, attempts to provide data of the wrong size for a compound data structure will also raise an error.
Program block: transformed data
The transformed data block is for declaring and defining variables that do not need to be changed when running the program.
Variable reads and transformations
For the transformed data block, variables are all declared in the variable declarations and defined in the statements. There is no reading from external sources and no transformations performed.
Variables declared in the data block may be used to declare transformed variables.
Statements
The statements in a transformed data block are used to define (provide values for) variables declared in the transformed data block. Assignments are only allowed to variables declared in the transformed data block.
These statements are executed once, in order, right after the data is read into the data variables. This means they are executed once per chain.
Variables declared in the data block may be used in statements in the transformed data block.
Restriction on operations in transformed data
The statements in the transformed data block are designed to be executed once and have a deterministic result. Therefore, log probability is not accumulated and distribution statements may not be used.
Variable constraint checking
Any constraints on variables declared in the transformed data block are checked after the statements are executed. If any defined variable violates its constraints, Stan will halt with a diagnostic error message.
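A minimal sketch of a transformed data block follows, assuming a predictor vector x that is standardized once before sampling. The names x_mean, x_sd, and x_std are illustrative, and the lower bound of 2 on N is an assumption made so the standard deviation is meaningful.

```stan
data {
  int<lower=2> N;
  vector[N] x;
}
transformed data {
  // Declarations and definitions; executed once per chain after the data is read.
  real x_mean = mean(x);
  real<lower=0> x_sd = sd(x);              // constraint checked after the block runs
  vector[N] x_std = (x - x_mean) / x_sd;   // reused by later blocks without recomputation
}
```

Because nothing here depends on parameters, the result is deterministic and no log probability is accumulated.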
Program block: parameters
The variables declared in the parameters program block correspond directly to the variables being sampled by Stan’s samplers (HMC and NUTS). From a user’s perspective, the parameters in the program block are the parameters being sampled by Stan.
Variables declared as parameters cannot be directly assigned values. So there is no block of statements in the parameters program block. Variable quantities derived from parameters may be declared in the transformed parameters or generated quantities blocks, or may be defined as local variables in any statement blocks following their declaration.
There is a substantial amount of computation involved for parameter variables in a Stan program at each leapfrog step within the HMC or NUTS samplers, and a bit more computation along with writes involved for saving the parameter values corresponding to a sample.
Constraining inverse transform
Stan’s two samplers, standard Hamiltonian Monte Carlo (HMC) and the adaptive No-U-Turn sampler (NUTS), are most easily (and often most effectively) implemented over a multivariate probability density that has support on all of \(\mathbb{R}^n\). To do this, the parameters defined in the parameters block must be transformed so they are unconstrained.
In practice, the samplers keep an unconstrained parameter vector in memory representing the current state of the sampler. The model defined by the compiled Stan program defines an (unnormalized) log probability function over the unconstrained parameters. In order to do this, the log probability function must apply the inverse transform to the unconstrained parameters to calculate the constrained parameters defined in Stan’s parameters program block. The log Jacobian of the inverse transform is then added to the accumulated log probability function. This then allows the Stan model to be defined in terms of the constrained parameters.
In some cases, the number of parameters is reduced in the unconstrained space. For instance, a \(K\)-simplex only requires \(K-1\) unconstrained parameters, and a \(K\)-correlation matrix only requires \(\binom{K}{2}\) unconstrained parameters. This means that the probability function defined by the compiled Stan program may have fewer parameters than it would appear from looking at the declarations in the parameters program block.
The probability function on the unconstrained parameters is defined in such a way that the order of the parameters in the vector corresponds to the order of the variables defined in the parameters program block. The details of the specific transformations are provided in the variable transforms chapter.
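As a worked illustration, consider a single parameter declared as real<lower=0> sigma, and assume the lower bound is handled with a log transform (the variable transforms chapter is the authoritative source for the transforms actually used). Writing \(u \in \mathbb{R}\) for the unconstrained value and \(p_\Sigma\) for the density the model defines on the constrained scale, the inverse transform, its Jacobian, and the resulting density on the unconstrained scale are

\[
\sigma = \exp(u), \qquad
\left| \frac{d\sigma}{du} \right| = \exp(u), \qquad
\log p_U(u) = \log p_\Sigma(\exp(u)) + u .
\]

The final term \(u\) is the log Jacobian that is added to the accumulated log probability, which is what lets the model block be written purely in terms of \(\sigma\).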
Gradient calculation
Hamiltonian Monte Carlo requires the gradient of the (unnormalized) log probability function with respect to the unconstrained parameters to be evaluated during every leapfrog step. There may be one leapfrog step per sample or hundreds, with more being required for models with complex posterior distribution geometries.
Gradients are calculated behind the scenes using Stan’s algorithmic differentiation library. The time to compute the gradient does not depend directly on the number of parameters, only on the number of subexpressions in the calculation of the log probability. This includes the expressions added from the transforms’ Jacobians.
The amount of work done by the sampler does depend on the number of unconstrained parameters, but this is usually dwarfed by the gradient calculations.
Writing draws
In the basic Stan compiled program, there is a file to which the values of variables are written for each draw. The constrained versions of the variables are written in the order they are defined in the parameters block. In order to do this, the transformed parameter, model, and generated quantities statements must also be executed.
Program block: transformed parameters
The transformed parameters program block consists of optional variable declarations followed by statements. After the statements are executed, the constraints on the transformed parameters are validated. Any variable declared as a transformed parameter is part of the output produced for draws.
Any variable that is defined wholly in terms of data or transformed data should be declared and defined in the transformed data block. Defining such quantities in the transformed parameters block is legal, but much less efficient than defining them as transformed data.
Constraints are for error checking
Like the constraints on data, the constraints on transformed parameters are meant to catch programming errors as well as convey programmer intent. Transformed parameters are not automatically transformed in such a way as to satisfy their constraints. If a transformed parameter does not satisfy its constraint, the current parameter values will be rejected. This can cause Stan’s algorithms to hang or to devolve to random walks. Constraints on transformed parameters are not intended to be a way to enforce ad hoc constraints in Stan programs. See the section on reject statements for further discussion of rejection behavior.
Program block: model
The model program block consists of optional variable declarations followed by statements. The variables in the model block are local variables and are not written as part of the output.
Local variables may not be defined with constraints because there is no well-defined way to have them be both flexible and easy to validate.
The statements in the model block typically define the model. This is the block in which probability (distribution notation) statements are allowed. These are typically used when programming in the BUGS idiom to define the probability model.
Program block: generated quantities
The generated quantities program block is rather different than the other blocks. Nothing in the generated quantities block affects the sampled parameter values. The block is executed only after a sample has been generated.
Among the applications of posterior inference that can be coded in the generated quantities block are
- forward sampling to generate simulated data for model testing,
- generating predictions for new data,
- calculating posterior event probabilities, including multiple comparisons, sign tests, etc.,
- calculating posterior expectations,
- transforming parameters for reporting,
- applying full Bayesian decision theory,
- calculating log likelihoods, deviances, etc. for model comparison.
Parameter estimates, predictions, statistics, and event probabilities may all be calculated directly using plug-in estimates. Stan automatically provides full Bayesian inference by producing draws from the posterior distribution of any calculated event probabilities, predictions, or statistics.
Within the generated quantities block, the values of all variables declared in earlier program blocks (other than local variables) are available for use.
It is more efficient to define a variable in the generated quantities block instead of the transformed parameters block, because generated quantities are evaluated once per sample rather than once per leapfrog step. Therefore, if a quantity does not play a role in the model, it should be defined in the generated quantities block.
After the generated quantities statements are executed, the constraints on the declared generated quantity variables are validated.
All variables declared as generated quantities are printed as part of the output. Variables declared in nested blocks are local variables, not generated quantities, and thus won’t be printed. For example:
generated quantities {
  int a;     // added to the output
  {
    int b;   // not added to the output
  }
}
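Several of the applications listed earlier in this section can be combined in one block. The following is a sketch under assumed names (y_new, mu_pos, log_lik) for a simple normal model; it is meant to illustrate the pattern, not to prescribe a particular model.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
generated quantities {
  real y_new = normal_rng(mu, sigma);      // prediction for a new observation
  int<lower=0, upper=1> mu_pos = mu > 0;   // indicator; its posterior mean estimates Pr[mu > 0]
  vector[N] log_lik;                       // pointwise log likelihood for model comparison
  for (n in 1:N) {
    log_lik[n] = normal_lpdf(y[n] | mu, sigma);
  }
}
```

All three variables are declared at the top level of the block, so each is written to the output once per draw.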
References
Footnotes
[1] If the C++ code is configured for concurrent threads, the data and transformed data blocks can be executed once and reused for multiple chains.
[2] It is possible to print a variable every iteration that does not depend on parameters: just define it (or redefine it if it is transformed data) in the generated quantities block.
[3] With multiple threads, or even running chains sequentially in a single thread, data could be read only once per set of chains. Stan was designed to be thread safe and future versions will provide a multithreading option for Markov chains.