16.3 Data Type and Indexing Efficiency
The underlying matrix and linear algebra operations are implemented in terms of data types from the Eigen C++ library. By having vectors and matrices as basic types, no conversion is necessary when invoking matrix operations or calling linear algebra functions.
Arrays, on the other hand, are implemented as instances of the C++
std::vector
class (not to be confused with Eigen’s Eigen::Vector
class or Stan vectors). By implementing arrays this way, indexing is efficient because values can be returned by reference rather than copied by value.
Matrices vs. Two-Dimensional Arrays
In Stan models, there are a few minor efficiency considerations in deciding between a two-dimensional array and a matrix, which may seem interchangeable at first glance.
First, matrices use a bit less memory than two-dimensional arrays. This is because they don’t store a sequence of arrays, but just the data and the two dimensions.
Second, matrices store their data in column-major order. Furthermore, all of the data in a matrix is guaranteed to be contiguous in memory. This is an important consideration for optimized code because bringing in data from memory to cache is much more expensive than performing arithmetic operations with contemporary CPUs. Arrays, on the other hand, only guarantee that the values of primitive types are contiguous in memory; otherwise, they hold copies of their values (which are returned by reference wherever possible).
Third, both data structures are best traversed in the order in which they are stored. This also helps with memory locality. This is column-major for matrices, so the following order is appropriate.
matrix[M, N] a;
//...
for (n in 1:N)
for (m in 1:M)
// ... do something with a[m, n] ...
Arrays, on the other hand, should be traversed in row-major order (i.e., last index fastest), as in the following example.
real a[M, N];
// ...
for (m in 1:M)
for (n in 1:N)
// ... do something with a[m, n] ...
The first use of a[m,n]
should bring a[m]
into memory. Overall, traversing matrices is more efficient than traversing arrays.
This is true even for arrays of matrices. For example, the ideal order in which to traverse a two-dimensional array of matrices is
matrix[M, N] b[I, J];
// ...
for (i in 1:I)
for (j in 1:J)
for (n in 1:N)
for (m in 1:M)
... do something with b[i, j, m, n] ...
If a
is a matrix, the notation a[m]
picks out row m
of that matrix. This is a rather inefficient operation for matrices. If indexing of vectors is needed, it is much better to declare an array of vectors. That is, this
row_vector[N] b[M];
// ...
for (m in 1:M)
... do something with row vector b[m] ...
is much more efficient than the pure matrix version
matrix b[M, N];
// ...
for (m in 1:M)
// ... do something with row vector b[m] ...
Similarly, indexing an array of column vectors is more efficient than using the col
function to pick out a column of a matrix.
In contrast, whatever can be done as pure matrix algebra will be the fastest. So if I want to create a row of predictor-coefficient dot-products, it’s more efficient to do this
matrix[N, k] x; // predictors (aka covariates)
// ...
vector[K] beta; // coeffs
// ...
vector[N] y_hat; // linear prediction
// ...
y_hat = x * beta;
than it is to do this
row_vector[K] x[N]; // predictors (aka covariates)
// ...
vector[K] beta; // coeffs
...
vector[N] y_hat; // linear prediction
...
for (n in 1:N)
y_hat[n] = x[n] * beta;
(Row) Vectors vs. One-Dimensional Arrays
For use purely as a container, there is really nothing to decide among vectors, row vectors and one-dimensional arrays. The Eigen::Vector
template specialization and the std::vector
template class are implemented similarly as containers of double
values (the type real
in Stan). Only arrays in Stan are allowed to store integer values.