8.5 Program block: parameters
The variables declared in the parameters program block correspond
directly to the variables being sampled by Stan’s samplers (HMC and
NUTS). From a user’s perspective, the parameters in the program block
are the parameters being sampled by Stan.
Variables declared as parameters cannot be directly assigned values,
so there is no block of statements in the parameters program block.
Variable quantities derived from parameters may be declared in the
transformed parameters or generated quantities blocks, or may be
defined as local variables in any statement blocks following their
declaration.
Each parameter variable in a Stan program incurs a substantial amount of computation at every leapfrog step of the HMC or NUTS sampler, plus a smaller amount of computation and a write each time the parameter values for a draw are saved.
Constraining inverse transform
Stan’s two samplers, standard Hamiltonian Monte Carlo (HMC) and the
adaptive No-U-Turn sampler (NUTS), are most easily (and often most
effectively) implemented over a multivariate probability density that
has support on all of \(\mathbb{R}^n\). To do this, the parameters
defined in the parameters block must be transformed so they are
unconstrained.
In practice, the samplers keep an unconstrained parameter vector in
memory representing the current state of the sampler. The model
defined by the compiled Stan program defines an (unnormalized) log
probability function over the unconstrained parameters. In order to
do this, the log probability function must apply the inverse transform
to the unconstrained parameters to calculate the constrained
parameters defined in Stan’s parameters program block. The log
absolute Jacobian determinant of the inverse transform is then added
to the accumulated log probability. This then allows the Stan model
to be defined in terms of the constrained parameters.
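
The steps above can be sketched for a single parameter. This is a minimal illustration, assuming one parameter sigma declared with a lower bound of zero and a hypothetical Exponential(1) density on it; the function names are invented for the example and are not Stan’s internal API. For a lower bound of zero the inverse transform is the exponential function, so its log Jacobian is the unconstrained value itself.

```python
import math

def constrain_positive(u):
    """Map unconstrained u to sigma > 0 and return the log Jacobian as well."""
    sigma = math.exp(u)
    log_jacobian = u  # log |d sigma / d u| = log(exp(u)) = u
    return sigma, log_jacobian

def model_log_density(sigma):
    """Hypothetical model on the constrained scale: Exponential(1) log density."""
    return -sigma

def unconstrained_log_density(u):
    """Log density seen by the sampler: model term plus log Jacobian."""
    sigma, log_jacobian = constrain_positive(u)
    return model_log_density(sigma) + log_jacobian

print(unconstrained_log_density(0.0))
```

The model function is written entirely in terms of the constrained value sigma, while the sampler only ever passes in the unconstrained value u.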
In some cases, the number of parameters is reduced in the
unconstrained space. For instance, a \(K\)-simplex only requires \(K-1\)
unconstrained parameters, and a \(K\)-correlation matrix only requires
\(\binom{K}{2}\) unconstrained parameters. This means that the
probability function defined by the compiled Stan program may have
fewer parameters than it would appear from looking at the declarations
in the parameters program block.
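
The counts in the preceding paragraph can be checked with a short calculation. The helper below is a hypothetical sketch, not part of Stan; it covers only the two constrained types mentioned above plus an unconstrained vector for comparison.

```python
from math import comb

def unconstrained_size(kind, K):
    """Number of unconstrained parameters for a declared type of size K."""
    if kind == "vector":
        return K            # no constraint: one unconstrained value per element
    if kind == "simplex":
        return K - 1        # the sum-to-one constraint removes one degree of freedom
    if kind == "corr_matrix":
        return comb(K, 2)   # as many as there are entries below the diagonal
    raise ValueError(f"unknown kind: {kind}")

# A 4-simplex and a 3 x 3 correlation matrix declare 4 + 9 = 13 elements.
declared = 4 + 3 * 3
unconstrained = unconstrained_size("simplex", 4) + unconstrained_size("corr_matrix", 3)
print(declared, unconstrained)
```

In this example the sampler works with 6 unconstrained values even though the declarations contain 13 elements.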
The probability function on the unconstrained parameters is defined in
such a way that the order of the parameters in the vector corresponds
to the order of the variables defined in the parameters program
block. The details of the specific transformations are provided in
the variable transforms chapter.
Gradient calculation
Hamiltonian Monte Carlo requires the gradient of the (unnormalized) log probability function with respect to the unconstrained parameters to be evaluated during every leapfrog step. A single draw may take one leapfrog step or hundreds, with more being required for models with complex posterior geometry.
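
To make the cost per step concrete, the following is a generic textbook leapfrog integrator in one dimension, not Stan’s implementation. It assumes a hypothetical standard normal target and a unit mass, and it counts gradient evaluations so the relationship to the number of steps is visible.

```python
def grad_log_density(q):
    """Gradient of a hypothetical standard normal log density, -q**2 / 2."""
    return -q

def leapfrog(q, p, step_size, num_steps, grad):
    """Run num_steps leapfrog steps; return position, momentum, gradient count."""
    num_grads = 0
    g = grad(q)                          # gradient at the starting point
    num_grads += 1
    for _ in range(num_steps):
        p = p + 0.5 * step_size * g      # half step for the momentum
        q = q + step_size * p            # full step for the position
        g = grad(q)                      # one new gradient per leapfrog step
        num_grads += 1
        p = p + 0.5 * step_size * g      # second half step for the momentum
    return q, p, num_grads

q, p, num_grads = leapfrog(1.0, 0.0, 0.1, 10, grad_log_density)
print(num_grads)
```

With the gradient cached between half steps, each leapfrog step costs one new gradient evaluation, plus one at the starting point, so a trajectory with hundreds of steps costs hundreds of gradient evaluations.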
Gradients are calculated behind the scenes using Stan’s algorithmic differentiation library. The time to compute the gradient does not depend directly on the number of parameters, only on the number of subexpressions in the calculation of the log probability. This includes the expressions added from the transforms’ Jacobians.
The amount of work done by the sampler does depend on the number of unconstrained parameters, but this is usually dwarfed by the gradient calculations.
Writing draws
In the basic Stan compiled program, there is a file to which the
values of variables are written for each draw. The constrained
versions of the variables are written in the order they are
defined in the parameters block. In order to do this, the
transformed parameters, model, and generated quantities statements
must also be executed.
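
The sequence for writing one draw can be sketched as follows. This is a simplified illustration under stated assumptions: two hypothetical parameters (an unbounded mu and a sigma with lower bound zero), one hypothetical transformed parameter tau, and one hypothetical generated quantity upper. The column layout is invented for the example and omits anything else a real output file contains, and the sketch skips the model block statements.

```python
import csv
import io
import math

def write_draw(writer, u):
    """Write one draw given the unconstrained state u = [u_mu, u_sigma]."""
    # Parameters, constrained in declaration order.
    mu = u[0]                 # unbounded: identity transform
    sigma = math.exp(u[1])    # lower bound 0: exponential inverse transform
    # Hypothetical transformed parameter, recomputed from the parameters.
    tau = 1.0 / sigma ** 2
    # Hypothetical generated quantity, computed last.
    upper = mu + sigma
    writer.writerow([mu, sigma, tau, upper])

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["mu", "sigma", "tau", "upper"])  # header, same order as the values
write_draw(writer, [0.5, 0.0])
print(buffer.getvalue())
```

The sampler state holds only the unconstrained values, so every derived column has to be recomputed from them before the row can be written.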