8.5 Program Block: parameters
The variables declared in the parameters program block correspond directly to the variables being sampled by Stan’s samplers (HMC and NUTS). From a user’s perspective, the parameters in the program block are the parameters being sampled by Stan.
Variables declared as parameters cannot be directly assigned values, so the parameters program block contains no statements. Quantities derived from parameters may be declared in the transformed parameters or generated quantities blocks, or may be defined as local variables in any statement blocks following their declaration.
A substantial amount of computation is involved for parameter variables in a Stan program at each leapfrog step within the HMC or NUTS samplers. Saving the parameter values corresponding to a draw involves a little more computation, along with the writes themselves.
Constraining Inverse Transform
Stan’s two samplers, standard Hamiltonian Monte Carlo (HMC) and the adaptive No-U-Turn sampler (NUTS), are most easily (and often most effectively) implemented over a multivariate probability density that has support on all of \(\mathbb{R}^n\). To do this, the parameters defined in the parameters block must be transformed so they are unconstrained.
In practice, the samplers keep an unconstrained parameter vector in memory representing the current state of the sampler. The model defined by the compiled Stan program defines an (unnormalized) log probability function over the unconstrained parameters. To do this, the log probability function must apply the inverse transform to the unconstrained parameters to calculate the constrained parameters defined in Stan’s parameters program block. The log absolute Jacobian determinant of the inverse transform is then added to the accumulated log probability. This allows the Stan model to be defined in terms of the constrained parameters.
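As a rough illustration of the paragraph above, the following sketch evaluates an unnormalized log density over a single unconstrained value. It assumes a hypothetical model with one positive-constrained parameter, sigma, with an exponential(1) prior and a normal(0, sigma) likelihood, and it uses the log/exp transform for the lower bound at zero. The function name and the model are illustrative assumptions, not the code Stan generates.

```python
import math

def log_prob_unconstrained(u, y):
    """Unnormalized log density over the unconstrained value u = log(sigma).

    Hypothetical model (an assumption for illustration):
        sigma ~ exponential(1)
        y[n]  ~ normal(0, sigma)
    """
    sigma = math.exp(u)   # inverse transform: maps R onto (0, inf)
    log_jacobian = u      # log |d sigma / d u| = log(exp(u)) = u

    # The model itself is written in terms of the constrained sigma.
    lp = -sigma           # exponential(1) prior, constants dropped
    for y_n in y:
        # normal(0, sigma) likelihood, constants dropped
        lp += -math.log(sigma) - 0.5 * (y_n / sigma) ** 2

    # Adding the log Jacobian gives the density over u rather than sigma.
    return lp + log_jacobian
```

A sampler working on the unconstrained scale would only ever call a function like this with u; sigma is recovered inside the function on every evaluation.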
In some cases, the number of parameters is reduced in the unconstrained space. For instance, a \(K\)-simplex only requires \(K-1\) unconstrained parameters, and a \(K\)-correlation matrix only requires \(\binom{K}{2}\) unconstrained parameters. This means that the probability function defined by the compiled Stan program may have fewer parameters than it would appear from looking at the declarations in the parameters program block.
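The counts quoted above can be tabulated with a small helper. This is only a sketch: the function name and the string labels are hypothetical and cover just the types mentioned in this section plus an unconstrained vector for comparison.

```python
from math import comb

def unconstrained_size(kind, K):
    """Number of unconstrained values behind a declared variable of size K."""
    if kind == "vector":
        return K              # no constraint: one value per element
    if kind == "simplex":
        return K - 1          # elements sum to 1, so one degree of freedom is lost
    if kind == "corr_matrix":
        return comb(K, 2)     # only the strictly lower-triangular part is free
    raise ValueError(f"unknown kind: {kind}")

# A 4-simplex has 4 constrained elements but 3 unconstrained ones;
# a 4 x 4 correlation matrix has 16 entries but 6 unconstrained ones.
sizes = {
    "simplex[4]": unconstrained_size("simplex", 4),
    "corr_matrix[4]": unconstrained_size("corr_matrix", 4),
}
```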
The probability function on the unconstrained parameters is defined in such a way that the order of the parameters in the vector corresponds to the order of the variables defined in the parameters program block. The details of the specific transformations are provided in the variable transforms chapter.
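To make the ordering concrete, the sketch below splits a flat unconstrained vector into named slices by walking the declarations in order. The layout (a scalar mu, a scalar sigma, and a 3-vector beta) and the helper name are assumptions chosen for illustration; the sizes are the unconstrained sizes, which may be smaller than the declared ones.

```python
def unpack(unconstrained, layout):
    """Split a flat vector into named slices, following declaration order."""
    out, pos = {}, 0
    for name, size in layout:
        out[name] = unconstrained[pos:pos + size]
        pos += size
    if pos != len(unconstrained):
        raise ValueError("layout does not match vector length")
    return out

# Hypothetical declaration order: mu, then sigma, then beta.
layout = [("mu", 1), ("sigma", 1), ("beta", 3)]
values = unpack([0.5, -1.2, 0.1, 0.2, 0.3], layout)
```

Here the second entry, -1.2, is the unconstrained value for sigma; the constrained value would be obtained by applying the inverse transform to it.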
Gradient Calculation
Hamiltonian Monte Carlo requires the gradient of the (unnormalized) log probability function with respect to the unconstrained parameters to be evaluated during every leapfrog step. There may be one leapfrog step per sample or hundreds, with more being required for models with complex posterior distribution geometries.
Gradients are calculated behind the scenes using Stan’s algorithmic differentiation library. The time to compute the gradient does not depend directly on the number of parameters, only on the number of subexpressions in the calculation of the log probability. This includes the expressions added from the transforms’ Jacobians.
The amount of work done by the sampler does depend on the number of unconstrained parameters, but this is usually dwarfed by the gradient calculations.
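The division of labor described in this section can be seen in a minimal leapfrog step. The sketch below assumes an identity mass matrix and takes the gradient as a caller-supplied function; the names and the standard normal target are illustrative assumptions, and the code is not Stan's implementation. The list updates are the sampler's own work, which scales with the number of unconstrained parameters, while the cost of each gradient call depends on the log probability expression.

```python
def leapfrog(theta, rho, grad_log_prob, epsilon):
    """One leapfrog step for position theta and momentum rho.

    For clarity the gradient is evaluated twice per step; an implementation
    can reuse the final gradient as the first gradient of the next step.
    """
    g = grad_log_prob(theta)
    rho = [r + 0.5 * epsilon * g_i for r, g_i in zip(rho, g)]   # half step in momentum
    theta = [t + epsilon * r for t, r in zip(theta, rho)]       # full step in position
    g = grad_log_prob(theta)
    rho = [r + 0.5 * epsilon * g_i for r, g_i in zip(rho, g)]   # half step in momentum
    return theta, rho

def std_normal_grad(theta):
    """Gradient of log p(theta) = -0.5 * sum(theta^2), a stand-in target."""
    return [-t for t in theta]

# Ten leapfrog steps from theta = 1, rho = 0 with step size 0.1.
theta, rho = [1.0], [0.0]
for _ in range(10):
    theta, rho = leapfrog(theta, rho, std_normal_grad, 0.1)

# The Hamiltonian (negative log density plus kinetic energy) should stay
# close to its starting value of 0.5.
energy = 0.5 * theta[0] ** 2 + 0.5 * rho[0] ** 2
```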
Writing Draws
In the basic Stan compiled program, there is a file to which the values of variables are written for each draw. The constrained versions of the variables are written in the order they are defined in the parameters block. To do this, the transformed parameters, model, and generated quantities statements must also be executed.