9.4 Reduce-sum function

This is an old version, view current version.

Stan provides a higher-order reduce function for summation. A function which returns a scalar g: U -> real is mapped to every element of a list of type U[], { x1, x2, ... } and all the results are accumulated,

g(x1) + g(x2) + ...

For efficiency reasons the reduce function doesn’t work with the element-wise evaluated function g itself, but instead works through evaluating partial sums, f: U[] -> real, where:

f({ x1 }) = g(x1)
f({ x1, x2 }) = g(x1) + g(x2)
f({ x1, x2, ... }) = g(x1) + g(x2) + ...

Mathematically the summation reduction is associative and forming arbitrary partial sums in an arbritrary order will not change the result. However, floating point numerics on computers only have a limited precision such that associativity does not hold exactly. This implies that the order of summation determines the exact numerical result. For this reason, the higher-order reduce function is available in two variants:

reduce_sum: Automatically choose partial sums partitioning based on a dynamic scheduling algorithm.
reduce_sum_static: Compute the same sum as reduce_sum, but partition the input in the same way for given data set (in reduce_sum this partitioning might change depending on computer load). This should result in stable numerical evaluations.

9.4.1 Specifying the reduce-sum function

The higher-order reduce function takes a partial sum function f, an array argument x (with one array element for each term in the sum), a recommended grainsize, and a set of shared arguments. This representation allows parallelization of the resultant sum.

real reduce_sum(F f, T[] x, int grainsize, T1 s1, T2 s2, ...)
real reduce_sum_static(F f, T[] x, int grainsize, T1 s1, T2 s2, ...)

Returns the equivalent of f(x, 1, size(x), s1, s2, ...), but computes the result in parallel by breaking the array x into independent partial sums. s1, s2, ... are shared between all terms in the sum.

f: function literal referring to a function specifying the partial sum operation. Refer to the partial sum function.
x: array of T, one for each term of the reduction, T can be any type,
grainsize: For reduce_sum, grainsize is the recommended size of the partial sum (grainsize = 1 means pick totally automatically). For reduce_sum_static, grainsize determines the maximum size of the partial sums, type int,
s1: first (optional) shared argument, type T1, where T1 can be any type
s2: second (optional) shared argument, type T2, where T2 can be any type,
...: remainder of shared arguments, each of which can be any type.

9.4.2 The partial sum function

The partial sum function must have the following signature where the type T, and the types of all the shared arguments (T1, T2, …) match those of the original reduce_sum (reduce_sum_static) call.

(T[] x_subset, int start, int end, T1 s1, T2 s2, ...):real

The partial sum function returns the sum of the start to end terms (inclusive) of the overall calculations. The arguments to the partial sum function are:

x_subset, the subset of x a given partial sum is responsible for computing, type T[], where T matches the type of x in reduce_sum (reduce_sum_static)
start, the index of the first term of the partial sum, type int
end, the index of the last term of the partial sum (inclusive), type int
s1, first shared argument, type T1, matching type of s1 in reduce_sum (reduce_sum_static)
s2, second shared argument, type T2, matching type of s2 in reduce_sum (reduce_sum_static)
..., remainder of shared arguments, with types matching those in reduce_sum (reduce_sum_static)