14.2 Integrator convergence
The integral is performed with the iterative 1D quadrature methods implemented in the Boost library (Agrawal et al. 2017). If the \(n\)th estimate of the integral is denoted \(I_n\) and the \(n\)th estimate of the norm of the integral is denoted \(|I|_n\), the iteration is terminated when
\[ \frac{|I_{n + 1} - I_n|}{|I|_{n + 1}} < \text{relative tolerance}. \]
The `relative_tolerance` parameter can be optionally specified as the last argument to `integrate_1d`. By default, `integrate_1d` follows the Boost library recommendation of setting `relative_tolerance` to the square root of the machine epsilon of double precision floating point numbers (about `1e-8`).
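The shape of this stopping rule is easy to illustrate. The following Python sketch uses simple interval-halving trapezoidal refinement, which is only a stand-in for the much more sophisticated quadrature Boost actually implements; the termination test, however, has the same form as the one above.

```python
import math

def integrate(f, a, b, relative_tolerance=math.sqrt(2.0 ** -52)):
    """Refine a trapezoidal estimate of the integral of f over (a, b)
    until successive estimates agree to the relative tolerance.

    Illustrative only: not the Boost algorithm used by integrate_1d.
    """
    n = 1
    I_prev = 0.5 * (b - a) * (f(a) + f(b))
    while True:
        n *= 2
        h = (b - a) / n
        I_next = 0.5 * h * (f(a) + f(b)) + h * sum(
            f(a + i * h) for i in range(1, n))
        # terminate when |I_{n+1} - I_n| / |I|_{n+1} < relative tolerance
        if abs(I_next - I_prev) / abs(I_next) < relative_tolerance:
            return I_next
        I_prev = I_next

val = integrate(math.exp, 0.0, 1.0)   # exact answer is e - 1
```

The default tolerance here mirrors the square-root-of-machine-epsilon recommendation described above.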
14.2.1 Zero-crossing integrals
Integrals on the (possibly infinite) interval \((a, b)\) that cross zero are split into two integrals, one from \((a, 0)\) and one from \((0, b)\). This is because the quadrature methods employed internally can have difficulty near zero.
In this case, each integral is separately integrated to the given `relative_tolerance`.
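The splitting itself is straightforward. Here is a hedged Python sketch using a composite Simpson rule as a stand-in for the adaptive quadrature Stan actually uses; the point is only the decomposition of \((a, b)\) into \((a, 0)\) and \((0, b)\):

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule (n even); illustrative stand-in for the
    adaptive quadrature used internally by integrate_1d."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3.0

def integrate_zero_crossing(f, a, b):
    # Split (a, b) with a < 0 < b at zero; each piece would be driven
    # to the requested relative_tolerance separately.
    return simpson(f, a, 0.0) + simpson(f, 0.0, b)

# integral of x * exp(-x^2) from -1 to 2 is (e^{-1} - e^{-4}) / 2
val = integrate_zero_crossing(lambda x: x * math.exp(-x * x), -1.0, 2.0)
```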
14.2.2 Avoiding precision loss near limits of integration in definite integrals
If care is not taken, the quadrature can suffer from numerical loss of precision near the endpoints of definite integrals.
For instance, in integrating the pdf of a beta distribution when the values of \(\alpha\) and \(\beta\) are small, most of the probability mass is lumped near zero and one.
The pdf of a beta distribution is proportional to
\[ p(x) \propto x^{\alpha - 1}(1 - x)^{\beta - 1} \]
Normalizing this distribution requires computing the integral of \(p(x)\) from zero to one. In Stan code, the integrand might look like:
```stan
real beta(real x, real xc, array[] real theta, array[] real x_r, array[] int x_i) {
  real alpha = theta[1];
  real beta = theta[2];
  return x^(alpha - 1.0) * (1.0 - x)^(beta - 1.0);
}
```
The issue is that there will be numerical breakdown in the precision of `1.0 - x` as `x` gets close to one. This is because of the limited precision of double precision floating point numbers. This integral will fail to converge for values of `alpha` and `beta` much less than one.
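This breakdown is easy to reproduce directly in double precision arithmetic. The following Python snippet (illustrative, not Stan) shows the computed distance to one collapsing to zero and the `(1 - x)` factor of the integrand blowing up:

```python
# Any x within half a machine epsilon of 1.0 rounds to exactly 1.0,
# so the computed distance 1.0 - x collapses to zero.
x = 1.0 - 1e-17   # true distance to 1 is 1e-17, but x rounds to 1.0
d = 1.0 - x       # computed distance: exactly 0.0

beta = 0.5
try:
    tail = d ** (beta - 1.0)   # 0.0 ** -0.5 is undefined
except ZeroDivisionError:
    tail = float("inf")        # the integrand has broken down
```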
This is where `xc` is useful. It is defined, for definite integrals, as a high precision version of the distance from `x` to the nearest endpoint (`a - x` or `b - x` for a lower endpoint `a` and an upper endpoint `b`). To make use of this for the beta integral, the integrand can be re-coded:
```stan
real beta(real x, real xc, array[] real theta, array[] real x_r, array[] int x_i) {
  real alpha = theta[1];
  real beta = theta[2];
  real v;
  if (x > 0.5) {
    v = x^(alpha - 1.0) * xc^(beta - 1.0);
  } else {
    v = x^(alpha - 1.0) * (1.0 - x)^(beta - 1.0);
  }
  return v;
}
```
In this case, as we approach the upper limit of integration \(b = 1\), `xc` will take on the value of \(b - x = 1 - x\). This version of the integrand will converge for much smaller values of `alpha` and `beta` than otherwise possible.
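To see why the re-coding helps, consider a Python mirror of the integrand (illustrative, not the Stan code itself). When the quadrature supplies `xc` separately, the tiny distance to the endpoint survives even though `x` itself has rounded to exactly one:

```python
def beta_integrand(x, xc, alpha, beta):
    """Python mirror of the re-coded Stan integrand (illustrative)."""
    if x > 0.5:
        return x ** (alpha - 1.0) * xc ** (beta - 1.0)
    return x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)

# Suppose the quadrature placed a node a distance 2^-60 below 1.
xc = 2.0 ** -60
x = 1.0 - xc          # rounds to exactly 1.0 in double precision
v = beta_integrand(x, xc, 0.5, 0.5)   # finite, approximately 2^30
```

The naive `(1.0 - x)` form would have raised the integrand to a negative power of exactly zero at this node.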
Consider another example: let’s say we have a log-normal distribution that is both shifted away from zero by some amount \(\delta\), and truncated at some value \(b\). If we were interested in calculating the expectation of a variable \(X\) distributed in this way, we would need to calculate \[ \int_a^b xf(x)\,dx = \int_{\delta}^b xf(x)\,dx \] in the numerator, where \(f(x)\) is the probability density function for the shifted log-normal distribution. This probability density function can be coded in Stan as:
```stan
real shift_lognormal_pdf(real x,
                         real mu,
                         real sigma,
                         real delta) {
  real p;
  p = (1.0 / ((x - delta) * sigma * sqrt(2 * pi()))) *
      exp(-1 * (log(x - delta) - mu)^2 / (2 * sigma^2));
  return p;
}
```
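A direct Python transcription of this density (hedged: for checking the formula, not part of the Stan model) makes it easy to spot-check values:

```python
import math

def shift_lognormal_pdf(x, mu, sigma, delta):
    """Python transcription of the Stan density above (illustrative)."""
    return (1.0 / ((x - delta) * sigma * math.sqrt(2.0 * math.pi))
            * math.exp(-((math.log(x - delta) - mu) ** 2)
                       / (2.0 * sigma ** 2)))

# Spot check: with delta = 0, mu = 0, sigma = 1, the density at x = 1
# (where log x = 0) reduces to 1 / sqrt(2 * pi).
p = shift_lognormal_pdf(1.0, 0.0, 1.0, 0.0)
```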
Therefore, the function that we want to integrate is:
```stan
real integrand(real x,
               real xc,
               array[] real theta,
               array[] real x_r,
               array[] int x_i) {
  real numerator;
  real p;
  real mu = theta[1];
  real sigma = theta[2];
  real delta = theta[3];
  real b = theta[4];
  p = shift_lognormal_pdf(x, mu, sigma, delta);
  numerator = x * p;
  return numerator;
}
```
What happens here is that, given that the log-normal distribution is shifted by \(\delta\), when we then try to integrate the numerator, our `x` starts at values just above `delta`. This, in turn, causes the `x - delta` term to be near zero, leading to a breakdown.
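The same floating point effect as in the beta example can be reproduced in a couple of lines of Python (illustrative): a node just above the lower limit rounds onto it, and `x - delta` vanishes.

```python
import math

delta = 3.0
x = delta + 1e-17   # a quadrature node just above the lower limit
diff = x - delta    # x rounded to exactly 3.0, so diff is 0.0

# The density then divides by zero and takes log(0):
try:
    bad = math.log(diff)
except ValueError:
    bad = None      # math domain error: the integrand has broken down
```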
We can use `xc`, and define the integrand as:
```stan
real integrand(real x,
               real xc,
               array[] real theta,
               array[] real x_r,
               array[] int x_i) {
  real numerator;
  real p;
  real mu = theta[1];
  real sigma = theta[2];
  real delta = theta[3];
  real b = theta[4];
  if (x < delta + 1) {
    p = shift_lognormal_pdf(xc, mu, sigma, delta);
  } else {
    p = shift_lognormal_pdf(x, mu, sigma, delta);
  }
  numerator = x * p;
  return numerator;
}
```
Why does this work? When our values of `x` are less than `delta + 1` (so, when they're near `delta`, given that our lower bound of integration is equal to \(\delta\)), we pass `xc` as an argument to our `shift_lognormal_pdf` function. This way, instead of dealing with `x - delta` in `shift_lognormal_pdf`, we are working with `xc - delta`, which is equal to `delta - x - delta`, as `delta` is the lower endpoint in that case. The `delta` terms cancel out, and we are left with a high-precision version of `x`. We don't encounter the same problem at the upper limit \(b\), so we don't adjust the code for that case.
Note, `xc` is only used for definite integrals. If either the left endpoint is at negative infinity or the right endpoint is at positive infinity, `xc` will be NaN.
For zero-crossing definite integrals (see the zero-crossing integrals section above) the integrals are broken into two pieces (\((a, 0)\) and \((0, b)\) for endpoints \(a < 0\) and \(b > 0\)) and `xc` is a high precision version of the distance to the limits of each of the two integrals separately. This means `xc` will be a high precision version of `a - x`, `x`, or `b - x`, depending on the value of `x` and the endpoints.