5.4 Vector and matrix data types

This is an old version, view current version.

Stan provides three types of container objects: arrays, vectors, and matrices. Vectors and matrices are more limited kinds of data structures than arrays. Vectors are intrinsically one-dimensional collections of reals, whereas matrices are intrinsically two dimensional. Vectors, matrices, and arrays are not assignable to one another, even if their dimensions are identical. A \(3 \times 4\) matrix is a different kind of object in Stan than a \(3 \times 4\) array.

The intention of using matrix types is to call out their usage in the code. There are three situations in Stan where only vectors and matrices may be used,

matrix arithmetic operations (e.g., matrix multiplication)
linear algebra functions (e.g., eigenvalues and determinants), and
multivariate function parameters and outcomes (e.g., multivariate normal distribution arguments).

Vectors and matrices cannot be typed to return integer values. They are restricted to real values.²

For constructing vectors and matrices in Stan, see Vector, Matrix, and Array Expressions.

Indexing from 1

Vectors and matrices, as well as arrays, are indexed starting from one in Stan. This follows the convention in statistics and linear algebra as well as their implementations in the statistical software packages R, MATLAB, BUGS, and JAGS. General computer programming languages, on the other hand, such as C++ and Python, index arrays starting from zero.

Vectors

Vectors in Stan are column vectors; see the next subsection for information on row vectors. Vectors are declared with a size (i.e., a dimensionality). For example, a 3-dimensional vector is declared with the keyword vector, as follows.

vector[3] u;

Vectors may also be declared with constraints, as in the following declaration of a 3-vector of non-negative values.

vector<lower=0>[3] u;

Similarly, they may be declared with a offset and/or multiplier, as in the following example

vector<offset=42,multiplier=3>[3] u;

Unit simplexes

A unit simplex is a vector with non-negative values whose entries sum to 1. For instance, \([0.2,0.3,0.4,0.1]^{\top}\) is a unit 4-simplex. Unit simplexes are most often used as parameters in categorical or multinomial distributions, and they are also the sampled variate in a Dirichlet distribution. Simplexes are declared with their full dimensionality. For instance, theta is declared to be a unit \(5\)-simplex by

simplex[5] theta;

Unit simplexes are implemented as vectors and may be assigned to other vectors and vice-versa. Simplex variables, like other constrained variables, are validated to ensure they contain simplex values; for simplexes, this is only done up to a statically specified accuracy threshold \(\epsilon\) to account for errors arising from floating-point imprecision.

In high dimensional problems, simplexes may require smaller step sizes in the inference algorithms in order to remain stable; this can be achieved through higher target acceptance rates for samplers and longer warmup periods, tighter tolerances for optimization with more iterations, and in either case, with less dispersed parameter initialization or custom initialization if there are informative priors for some parameters.

Unit vectors

A unit vector is a vector with a norm of one. For instance, \([0.5, 0.5, 0.5, 0.5]^{\top}\) is a unit 4-vector. Unit vectors are sometimes used in directional statistics. Unit vectors are declared with their full dimensionality. For instance, theta is declared to be a unit \(5\)-vector by

unit_vector[5] theta;

Unit vectors are implemented as vectors and may be assigned to other vectors and vice-versa. Unit vector variables, like other constrained variables, are validated to ensure that they are indeed unit length; for unit vectors, this is only done up to a statically specified accuracy threshold \(\epsilon\) to account for errors arising from floating-point imprecision.

Ordered vectors

An ordered vector type in Stan represents a vector whose entries are sorted in ascending order. For instance, \((-1.3,2.7,2.71)^{\top}\) is an ordered 3-vector. Ordered vectors are most often employed as cut points in ordered logistic regression models (see section).

The variable c is declared as an ordered 5-vector by

ordered[5] c;

After their declaration, ordered vectors, like unit simplexes, may be assigned to other vectors and other vectors may be assigned to them. Constraints will be checked after executing the block in which the variables were declared.

Positive, ordered vectors

There is also a positive, ordered vector type which operates similarly to ordered vectors, but all entries are constrained to be positive. For instance, \((2,3.7,4,12.9)\) is a positive, ordered 4-vector.

The variable d is declared as a positive, ordered 5-vector by

positive_ordered[5] d;

Like ordered vectors, after their declaration, positive ordered vectors may be assigned to other vectors and other vectors may be assigned to them. Constraints will be checked after executing the block in which the variables were declared.

Row vectors

Row vectors are declared with the keyword row_vector. Like (column) vectors, they are declared with a size. For example, a 1093-dimensional row vector u would be declared as

row_vector[1093] u;

Constraints are declared as for vectors, as in the following example of a 10-vector with values between -1 and 1.

row_vector<lower=-1,upper=1>[10] u;

Offset and multiplier are also similar as for the following 3-row-vector with offset -42 and multiplier 3.

row_vector<offset=-42,multiplier=3>[3] u;

Row vectors may not be assigned to column vectors, nor may column vectors be assigned to row vectors. If assignments are required, they may be accommodated through the transposition operator.

Matrices

Matrices are declared with the keyword matrix along with a number of rows and number of columns. For example,

matrix[3, 3] A;
matrix[M, N] B;

declares A to be a \(3 \times 3\) matrix and B to be a \(M \times N\) matrix. For the second declaration to be well formed, the variables M and N must be declared as integers in either the data or transformed data block and before the matrix declaration.

Matrices may also be declared with constraints, as in this (\(3 \times 4\)) matrix of non-positive values.

matrix<upper=0>[3, 4] B;

Similarly, matrices can be declared to have a set offset and/or multiplier, as in this matrix with multiplier 5.

matrix<multiplier=5>[3, 4] B;

Assigning to rows of a matrix

Rows of a matrix can be assigned by indexing the left-hand side of an assignment statement. For example, this is possible.

matrix[M, N] a;
row_vector[N] b;
// ...
a[1] = b;

This copies the values from row vector b to a[1], which is the first row of the matrix a. If the number of columns in a is not the same as the size of b, a run-time error is raised; the number of columns of a is N, which is also the number of columns of b.

Assignment works by copying values in Stan. That means any subsequent assignment to a[1] does not affect b, nor does an assignment to b affect a.

Covariance matrices

Matrix variables may be constrained to represent covariance matrices. A matrix is a covariance matrix if it is symmetric and positive definite. Like correlation matrices, covariance matrices only need a single dimension in their declaration. For instance,

cov_matrix[K] Omega;

declares Omega to be a \(K \times K\) covariance matrix, where \(K\) is the value of the data variable K.

Correlation matrices

Matrix variables may be constrained to represent correlation matrices. A matrix is a correlation matrix if it is symmetric and positive definite, has entries between \(-1\) and \(1\), and has a unit diagonal. Because correlation matrices are square, only one dimension needs to be declared. For example,

corr_matrix[3] Sigma;

declares Sigma to be a \(3 \times 3\) correlation matrix.

Correlation matrices may be assigned to other matrices, including unconstrained matrices, if their dimensions match, and vice-versa.

Cholesky factors of covariance matrices

Matrix variables may be constrained to represent the Cholesky factors of a covariance matrix. This is often more convenient or more efficient than representing covariance matrices directly.

A Cholesky factor \(L\) is an \(M \times N\) lower-triangular matrix (if \(m < n\) then \(L[m, n] =0\)) with a strictly positive diagonal (\(L[k, k] > 0\)) and \(M \geq N\). If \(L\) is a Cholesky factor, then \(\Sigma = L \, L^{\top}\) is a covariance matrix (i.e., it is positive definite). The mapping between positive definite matrices and their Cholesky factors is bijective—every covariance matrix has a unique Cholesky factorization.

The typical case of a square Cholesky factor may be declared with a single dimension,

cholesky_factor_cov[4] L;

Cholesky factors of positive semi-definite matrices

In general, two dimensions may be declared, with the above being equal to cholesky_factor_cov[4, 4]. The type cholesky_factor_cov[M, N] may be used for the general \(M \times N\) case to produce positive semi-definite matrices of rank \(M\).

Cholesky factors of correlation matrices

Matrix variables may be constrained to represent the Cholesky factors of a correlation matrix.

A Cholesky factor for a correlation matrix \(L\) is a \(K \times K\) lower-triangular matrix with positive diagonal entries and rows that are of length 1 (i.e., \(\sum_{n=1}^K L_{m,n}^2 = 1\)). If \(L\) is a Cholesky factor for a correlation matrix, then \(L\,L^{\top}\) is a correlation matrix (i.e., symmetric positive definite with a unit diagonal).

To declare the variable L to be a K by K Cholesky factor of a correlation matrix, the following code may be used.

cholesky_factor_corr[K] L;

Assigning constrained variables

Constrained variables of all types may be assigned to other variables of the same unconstrained type and vice-versa. Matching is interpreted strictly as having the same basic type and number of array dimensions. Constraints are not considered, but basic data types are. For instance, a variable declared to be real<lower=0,upper=1> could be assigned to a variable declared as real and vice-versa. Similarly, a variable declared as matrix[3, 3] may be assigned to a variable declared as cov_matrix[3] or cholesky_factor_cov[3], and vice-versa.

Checks are carried out at the end of each relevant block of statements to ensure constraints are enforced. This includes run-time size checks. The Stan compiler isn’t able to catch the fact that an attempt may be made to assign a matrix of one dimensionality to a matrix of mismatching dimensionality.

Expressions as size declarations

Variables may be declared with sizes given by expressions. Such expressions are constrained to only contain data or transformed data variables. This ensures that all sizes are determined once the data is read in and transformed data variables defined by their statements. For example, the following is legal.

data {
  int<lower=0> N_observed;    int<lower=0> N_missing;
  // ...
transformed parameters {
  vector[N_observed + N_missing] y;
  // ...

Accessing vector and matrix elements

If v is a column vector or row vector, then v[2] is the second element in the vector. If m is a matrix, then m[2, 3] is the value in the second row and third column.

Providing a matrix with a single index returns the specified row. For instance, if m is a matrix, then m[2] is the second row. This allows Stan blocks such as

matrix[M, N] m;
row_vector[N] v;
real x;
// ...
v = m[2];
x = v[3];   // x == m[2][3] == m[2, 3]

The type of m[2] is row_vector because it is the second row of m. Thus it is possible to write m[2][3] instead of m[2, 3] to access the third element in the second row. When given a choice, the form m[2, 3] is preferred.

Array index style

The form m[2, 3] is more efficient because it does not require the creation and use of an intermediate expression template for m[2]. In later versions, explicit calls to m[2][3] may be optimized to be as efficient as m[2, 3] by the Stan compiler.

Size declaration restrictions

An integer expression is used to pick out the sizes of vectors, matrices, and arrays. For instance, we can declare a vector of size M + N using

vector[M + N] y;

Any integer-denoting expression may be used for the size declaration, providing all variables involved are either data, transformed data, or local variables. That is, expressions used for size declarations may not include parameters or transformed parameters or generated quantities.

This may change if Stan is called upon to do complicated integer matrix operations or boolean matrix operations. Integers are not appropriate inputs for linear algebra functions.↩