11.1 Algebraic equation solvers
Stan provides two built-in algebraic equation solvers, respectively based on the Newton method and the Powell “dog leg” hybrid method. Empirically the Newton method is found to be faster and its use is recommended for most problems.
An algebraic solver is a higher-order function, i.e. it takes another function as one of its arguments. Other functions in Stan which share this feature are the differential equation solvers (see section Ordinary Differential Equation (ODE) Solvers and Differential Algebraic Equation (DAE) solver). Ordinary Stan functions do not allow functions as arguments.
11.1.1 Specifying an algebraic equation as a function
An algebraic system is specified as an ordinary function in Stan
within the function block. The function must return a vector
and takes in, as its first argument, the unknowns \(y\) we wish to solve for,
also passed as a vector
.
This argument is followed by additional arguments as specified by the user;
we call such arguments variadic arguments and denote them ...
.
The signature of the algebraic system is then:
vector algebra_system (vector y, ...)
There is no type restriction for the variadic arguments
and each argument can be passed as data or parameter.
However users should use parameter arguments only when necessary
and mark data arguments with the keyword data
.
In the below example, the last variadic argument, \(x\), is restricted to being data:
vector algebra_system (vector y, vector theta, data vector x)
Distinguishing data and parameter is important for computational reasons. Augmenting the total number of parameters increases the cost of propagating derivatives through the solution to the algebraic equation, and ultimately the computational cost of evaluating the gradients.
11.1.2 Call to the algebraic solver
vector
solve_newton
(function algebra_system, vector y_guess, ...)
Solves the algebraic system, given an initial guess, using Newton’s method.
Available since 2.31
vector
solve_newton_tol
(function algebra_system, vector y_guess, data real scaling_step, data real f_tol, int max_steps, ...)
Solves the algebraic system, given an initial guess, using Newton’s method with additional control parameters for the solver.
Available since 2.31
vector
solve_powell
(function algebra_system, vector y_guess, ...)
Solves the algebraic system, given an initial guess, using Powell’s hybrid method.
Available since 2.31
vector
solve_powell_tol
(function algebra_system, vector y_guess, data real rel_tol, data real f_tol, int max_steps, ...)
Solves the algebraic system, given an initial guess, using Powell’s hybrid method with additional control parameters for the solver.
Available since 2.31
11.1.2.1 Arguments to the algebraic solver
The arguments to the algebraic solvers are as follows:
algebra_system
: function literal referring to a function specifying the system of algebraic equations with signature(vector, ...):vector
. The arguments represent (1) unknowns, (2) additional parameter and/or data arguments, and the return value contains the value of the algebraic function, which goes to 0 when we plug in the solution to the algebraic system,y_guess
: initial guess for the solution, typevector
,...
: variadic arguments.
The algebraic solvers admit control parameters. While Stan provides default values, the user should be prepared to adjust the control parameters. The following controls are available:
scaling_step
: for the Newton solver only, the scaled-step stopping tolerance, typereal
, data only. If a Newton step is smaller than the scaling step tolerance, the code breaks, assuming the solver is no longer making significant progress. If set to 0, this constraint is ignored. Default value is \(10^{-3}\).rel_tol
: for the Powell solver only, the relative tolerance, typereal
, data only. The relative tolerance is the estimated relative error of the solver and serves to test if a satisfactory solution has been found. Default value is \(10^{-10}\).function_tol
: function tolerance for the algebraic solver, typereal
, data only. After convergence of the solver, the proposed solution is plugged into the algebraic system and its norm is compared to the function tolerance. If the norm is below the function tolerance, the solution is deemed acceptable. Default value is \(10^{-6}\).max_num_steps
: maximum number of steps to take in the algebraic solver, typeint
, data only. If the solver reaches this number of steps, it breaks and returns an error message. Default value is \(200\).
The difference in which control parameters are available has to do with the underlying implementations for the solvers and the control parameters these implementations support. The Newton solver is based on KINSOL from the SUNDIAL suites, while the Powell solver uses a module from the Eigen library.
11.1.2.2 Return value
The return value for the algebraic solver is an object of type
vector
, with values which, when plugged in as y
make the algebraic
function go to 0 (approximately, within the specified function tolerance).
11.1.2.3 Sizes and parallel arrays
Certain sizes have to be consistent. The initial guess, return value of the solver, and return value of the algebraic function must all be the same size.
11.1.2.4 Algorithmic details
Stan offers two methods to solve algebraic equations.
solve_newton
and solve_newton_tol
use the Newton method,
a first-order derivative based numerical solver.
The Stan code builds on the implementation in KINSOL
from the SUNDIALS suite (Hindmarsh et al. 2005).
For many problems, we find that the Newton method is faster
than the Powell method.
If however Newton’s method performs poorly, either failing to or requiring an excessively
long time to converge, the user should be prepared to switch to the Powell method.
solve_powell
and solve_powell_tol
are based on the Powell hybrid method (Powell 1970),
which also uses first-order derivatives. The Stan code builds on
the implementation of the hybrid solver in the unsupported module for
nonlinear optimization problems of the Eigen library (Guennebaud, Jacob, et al. 2010).
This solver is in turn based on the algorithm developed for the
package MINPACK-1 (Jorge J. More 1980).
For both solvers, derivatives are propagated through the solution to the algebraic solution using the implicit function theorem and an adjoint method of automatic differentiation; for a discussion on this topic, see Margossian and Betancourt (2022).