4.1 Reductions
The following operations take arrays as input and produce single output values. For size 0 arrays, the result is the unit (identity element) of the combination operation (min, max, sum, or product) where one exists; the integer versions of min and max raise an error instead.
4.1.1 Minimum and maximum
real min(real[] x)
The minimum value in x, or \(+\infty\) if x is size 0.
int min(int[] x)
The minimum value in x, or error if x is size 0.
real max(real[] x)
The maximum value in x, or \(-\infty\) if x is size 0.
int max(int[] x)
The maximum value in x, or error if x is size 0.
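The boundary behavior described above can be sketched in Python. This is an illustration of the stated semantics only, not Stan's implementation; the function names `real_min`, `real_max`, `int_min`, and `int_max` are hypothetical, chosen here to keep the real and integer signatures apart.

```python
import math

def real_min(x):
    """Minimum of a list of floats; +inf (the unit of min) if x is empty."""
    result = math.inf
    for v in x:
        if v < result:
            result = v
    return result

def real_max(x):
    """Maximum of a list of floats; -inf (the unit of max) if x is empty."""
    result = -math.inf
    for v in x:
        if v > result:
            result = v
    return result

def int_min(x):
    """Minimum of a list of ints; integers have no infinity, so empty is an error."""
    if len(x) == 0:
        raise ValueError("min of a size 0 integer array is an error")
    return min(x)

def int_max(x):
    """Maximum of a list of ints; an empty list is an error."""
    if len(x) == 0:
        raise ValueError("max of a size 0 integer array is an error")
    return max(x)
```

Starting the real-valued loops from the infinite value is what makes the empty case fall out naturally: combining any element with the unit returns that element.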
4.1.2 Sum, product, and log sum of exp
int sum(int[] x)
The sum of the elements in x, defined for \(x\) of size \(N\) by \[ \text{sum}(x) = \begin{cases} \sum_{n=1}^N x_n & \text{if } N > 0 \\[4pt] 0 & \text{if } N = 0 \end{cases} \]
real sum(real[] x)
The sum of the elements in x; see definition above.
real prod(real[] x)
The product of the elements in x, or 1 if x is size 0.
real prod(int[] x)
The product of the elements in x, \[ \text{prod}(x) = \begin{cases} \prod_{n=1}^N x_n & \text{if } N > 0 \\[4pt] 1 & \text{if } N = 0 \end{cases} \]
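The two case definitions above can be mirrored by a short Python sketch. The names `array_sum` and `array_prod` are hypothetical, and the code illustrates the definitions rather than Stan's internals. Following the signatures listed above, the product is returned as a real (a Python `float`) even for integer input.

```python
def array_sum(x):
    """Sum of the elements of x; 0 (the additive unit) if x is empty."""
    total = 0
    for v in x:
        total += v
    return total

def array_prod(x):
    """Product of the elements of x as a float; 1 (the multiplicative unit) if x is empty."""
    result = 1.0
    for v in x:
        result *= v
    return result
```

For example, `array_sum([1, 2, 3])` evaluates to 6 and `array_prod([])` evaluates to 1.0.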
real log_sum_exp(real[] x)
The natural logarithm of the sum of the exponentials of the elements in x, or \(-\infty\) if the array is empty.
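A minimal Python sketch of this function follows. It subtracts the maximum element before exponentiating, which is a common way to avoid overflow; that choice is an assumption made for this illustration, since the text above does not say how the value is computed.

```python
import math

def log_sum_exp(x):
    """log(sum(exp(x_n))) for a list of floats; -inf if x is empty."""
    if len(x) == 0:
        return -math.inf
    m = max(x)
    if math.isinf(m):
        # All elements are -inf (result -inf) or some element is +inf (result +inf).
        return m
    # exp(v - m) <= 1 for every v, so the sum cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in x))
```

With this formulation, inputs such as `[1000.0, 1000.0]` give a finite result even though `math.exp(1000.0)` on its own would overflow.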
4.1.3 Sample mean, variance, and standard deviation
The sample mean, variance, and standard deviation are calculated in the usual way. For i.i.d. draws from a distribution of finite mean, the sample mean is an unbiased estimate of the mean of the distribution. Similarly, for i.i.d. draws from a distribution of finite variance, the sample variance is an unbiased estimate of the variance.[2] The sample standard deviation is defined as the square root of the sample variance, but is not unbiased.
real mean(real[] x)
The sample mean of the elements in x. For an array \(x\) of size \(N > 0\), \[ \text{mean}(x) \ = \ \bar{x} \ = \ \frac{1}{N} \sum_{n=1}^N x_n. \] It is an error to call the mean function with an array of size \(0\).
real variance(real[] x)
The sample variance of the elements in x. For \(N > 0\), \[ \text{variance}(x) \ = \ \begin{cases} \frac{1}{N-1} \sum_{n=1}^N (x_n - \bar{x})^2 & \text{if } N > 1 \\[4pt] 0 & \text{if } N = 1 \end{cases} \] It is an error to call the variance function with an array of size 0.
real sd(real[] x)
The sample standard deviation of elements in x. \[ \text{sd}(x) = \begin{cases} \sqrt{\, \text{variance}(x)} & \text{if } N > 1 \\[4pt] 0 & \text{if } N = 1 \end{cases} \] It is an error to call the sd function with an array of size 0.
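The three definitions in this subsection can be sketched in Python as follows. The code mirrors the formulas, including the \(N = 1\) case and the error for size 0 input; raising `ValueError` is an assumption of this sketch, standing in for whatever error Stan reports.

```python
import math

def mean(x):
    """Sample mean; an empty list is an error."""
    if len(x) == 0:
        raise ValueError("mean of a size 0 array is an error")
    return sum(x) / len(x)

def variance(x):
    """Sample variance with the N - 1 denominator; 0 when N = 1."""
    n = len(x)
    if n == 0:
        raise ValueError("variance of a size 0 array is an error")
    if n == 1:
        return 0.0
    xbar = mean(x)
    return sum((v - xbar) ** 2 for v in x) / (n - 1)

def sd(x):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(variance(x))
```

For `x = [1.0, 2.0, 3.0, 4.0]` the mean is 2.5, the squared deviations sum to 5, and the sample variance is therefore 5/3.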
4.1.4 Euclidean distance and squared distance
real distance(vector x, vector y)
The Euclidean distance between x and y, defined by \[ \text{distance}(x,y) \ = \ \sqrt{\textstyle \sum_{n=1}^N (x_n - y_n)^2} \] where N is the size of x and y. It is an error to call distance with arguments of unequal size.
real distance(vector x, row_vector y)
The Euclidean distance between x and y.
real distance(row_vector x, vector y)
The Euclidean distance between x and y.
real distance(row_vector x, row_vector y)
The Euclidean distance between x and y.
real squared_distance(vector x, vector y)
The squared Euclidean distance between x and y, defined by \[ \mathrm{squared\_distance}(x,y) \ = \ \text{distance}(x,y)^2 \ = \ \textstyle \sum_{n=1}^N (x_n - y_n)^2, \] where N is the size of x and y. It is an error to call squared_distance with arguments of unequal size.
real squared_distance(vector x, row_vector y)
The squared Euclidean distance between x and y.
real squared_distance(row_vector x, vector y)
The squared Euclidean distance between x and y.
real squared_distance(row_vector x, row_vector y)
The squared Euclidean distance between x and y.
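As a sketch of the two formulas above, the following Python functions compute the squared distance directly and derive the distance from it. Plain Python lists carry no row or column orientation, so one pair of functions stands in for all four overloads; raising `ValueError` on unequal sizes is an assumption of this illustration.

```python
import math

def squared_distance(x, y):
    """Sum of squared elementwise differences; sizes must match."""
    if len(x) != len(y):
        raise ValueError("arguments must have equal size")
    return sum((a - b) ** 2 for a, b in zip(x, y))

def distance(x, y):
    """Euclidean distance: square root of the squared distance."""
    return math.sqrt(squared_distance(x, y))
```

For the points `[0.0, 0.0]` and `[3.0, 4.0]`, the squared distance is 25 and the distance is 5.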
4.1.5 Quantile
Produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1.
The implementation follows algorithm 7 of Hyndman, R. J. and Fan, Y., "Sample Quantiles in Statistical Packages", which is the default used by R's quantile function.
real quantile(data real[] x, data real p)
The p-th quantile of x.
real[] quantile(data real[] x, data real p[])
An array containing the quantiles of x given by the array of probabilities p.
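To make the interpolation rule concrete, here is a minimal Python sketch of Hyndman and Fan's type 7 estimator: the sorted observations are placed at evenly spaced probabilities from 0 to 1, and values in between are linearly interpolated. The helper name `quantiles` and the `ValueError` checks for empty input and out-of-range probabilities are assumptions of this sketch, not documented behavior.

```python
import math

def quantile(x, p):
    """Type 7 sample quantile of x at probability p in [0, 1]."""
    if len(x) == 0:
        raise ValueError("x must be non-empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    s = sorted(x)
    h = (len(s) - 1) * p           # fractional 0-based position in the sorted data
    lo = math.floor(h)
    hi = min(lo + 1, len(s) - 1)   # clamp so that p = 1 stays in range
    return s[lo] + (h - lo) * (s[hi] - s[lo])

def quantiles(x, ps):
    """Quantiles of x at each probability in ps (hypothetical helper)."""
    return [quantile(x, p) for p in ps]
```

For `x = [3.0, 1.0, 2.0, 4.0]`, probability 0 returns the smallest observation (1.0), probability 1 returns the largest (4.0), and probability 0.5 falls halfway between the second and third sorted values (2.5).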
[2] Dividing by \(N\) rather than \((N-1)\) produces a maximum likelihood estimate of variance, which is biased to underestimate variance.