17.4 Writing Models for Optimization
Constrained vs. Unconstrained Parameters
For optimization problems with constrained parameters, for instance a standard
deviation parameter \(\sigma\) constrained so that \(\sigma > 0\), it can
be much more efficient to declare the parameter sigma with no constraints.
This allows the optimizer to easily get close to 0
without having to tend toward \(-\infty\) on the \(\log \sigma\) scale.
The Jacobian adjustment is not an issue for posterior modes, because Stan turns off the built-in Jacobian adjustments for optimization.
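A minimal sketch of the unconstrained declaration described above, assuming a simple normal model. The data names N and y and the location parameter mu are illustrative and do not come from the text.

```stan
data {
  int<lower=0> N;   // number of observations (illustrative name)
  vector[N] y;      // observations (illustrative name)
}
parameters {
  real mu;
  // Declared without <lower=0>, so the optimizer works on sigma
  // directly rather than on log(sigma).
  // The constrained alternative would be: real<lower=0> sigma;
  real sigma;
}
model {
  // normal requires a positive scale, so the log density is only
  // defined where sigma > 0.
  y ~ normal(mu, sigma);
}
```

Swapping the declaration for the constrained one shown in the comment gives the second version of the program for comparison.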
With unconstrained parameterizations of parameters with constrained support, it is important to provide a custom initialization that is within the support. For example, declaring a vector
vector[M] sigma;
and using the default random initialization, which is \(\mathsf{Uniform}(-2, 2)\) on the unconstrained scale, means that there is only a \(2^{-M}\) chance that the initialization will be within the support.
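The \(2^{-M}\) figure follows because each component is initialized independently and \(\mathsf{Uniform}(-2, 2)\) is symmetric about zero, so each component is positive with probability \(1/2\):

\[
\Pr[\sigma_1 > 0, \ldots, \sigma_M > 0]
= \prod_{m=1}^{M} \Pr[\sigma_m > 0]
= \left(\tfrac{1}{2}\right)^{M}
= 2^{-M}.
\]

For \(M = 10\), for example, this is \(1/1024\), or roughly a 0.1% chance of a valid random initialization.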
For any given optimization problem, it is probably worthwhile trying the program both ways, with and without the constraint, to see which one is more efficient.