5.13 Linear Algebra Functions and Solvers

This is an old version, view current version.

5.13 Linear Algebra Functions and Solvers

5.13.1 Matrix Division Operators and Functions

In general, it is much more efficient and also more arithmetically stable to use matrix division than to multiply by an inverse. There are specialized forms for lower triangular matrices and for symmetric, positive-definite matrices.

5.13.1.1 Matrix division operators

row_vector operator/(row_vector b, matrix A)
The right division of b by A; equivalently b * inverse(A)

matrix operator/(matrix B, matrix A)
The right division of B by A; equivalently B * inverse(A)

vector operator\(matrix A, vector b)

matrix operator\(matrix A, matrix B)

5.13.1.2 Lower-triangular matrix division functions

There are four division functions which use lower triangular views of a matrix. The lower triangular view of a matrix \(\text{tri}(A)\) is used in the definitions and defined by \[ \text{tri}(A)[m,n] = \left\{ \begin{array}{ll} A[m,n] & \text{if } m \geq n, \text{ and} \\[4pt] 0 & \text{otherwise}. \end{array} \right. \] When a lower triangular view of a matrix is used, the elements above the diagonal are ignored.

vector mdivide_left_tri_low(matrix A, vector b)
The left division of b by a lower-triangular view of A; algebraically equivalent to the less efficient and stable form inverse(tri(A)) * b, where tri(A) is the lower-triangular portion of A with the above-diagonal entries set to zero.

matrix mdivide_left_tri_low(matrix A, matrix B)
The left division of B by a triangular view of A; algebraically equivalent to the less efficient and stable form inverse(tri(A)) * B, where tri(A) is the lower-triangular portion of A with the above-diagonal entries set to zero.

row_vector mdivide_right_tri_low(row_vector b, matrix A)
The right division of b by a triangular view of A; algebraically equivalent to the less efficient and stable form b * inverse(tri(A)), where tri(A) is the lower-triangular portion of A with the above-diagonal entries set to zero.

matrix mdivide_right_tri_low(matrix B, matrix A)
The right division of B by a triangular view of A; algebraically equivalent to the less efficient and stable form B * inverse(tri(A)), where tri(A) is the lower-triangular portion of A with the above-diagonal entries set to zero.

5.13.2 Symmetric positive-definite matrix division functions

There are four division functions which are specialized for efficiency and stability for symmetric positive-definite matrix dividends. If the matrix dividend argument is not symmetric and positive definite, these will reject and print warnings.

matrix mdivide_left_spd(matrix A, vector b)
The left division of b by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and stable form inverse(A) * b.

vector mdivide_left_spd(matrix A, matrix B)
The left division of B by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and stable form inverse(A) * B.

row_vector mdivide_right_spd(row_vector b, matrix A)
The right division of b by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and stable form b * inverse(A).

matrix mdivide_right_spd(matrix B, matrix A)
The right division of B by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and stable form B * inverse(A).

5.13.3 Matrix Exponential

The exponential of the matrix \(A\) is formally defined by the convergent power series: \[ e^A = \sum_{n=0}^{\infty} \dfrac{A^n}{n!} \]

matrix matrix_exp(matrix A)
The matrix exponential of A

matrix matrix_exp_multiply(matrix A, matrix B)
The multiplication of matrix exponential of A and matrix B; algebraically equivalent to the less efficient form matrix_exp(t * A) * B.

matrix scale_matrix_exp_multiply(real t, matrix A, matrix B)
The multiplication of matrix exponential of tA and matrix B; algebraically equivalent to the less efficient form matrix_exp(t * A) * B.

5.13.4 Linear Algebra Functions

5.13.4.1 Trace

real trace(matrix A)
The trace of A, or 0 if A is empty; A is not required to be diagonal

5.13.4.2 Determinants

real determinant(matrix A)
The determinant of A

real log_determinant(matrix A)
The log of the absolute value of the determinant of A

5.13.4.3 Inverses

It is almost never a good idea to use matrix inverses directly because they are both inefficient and arithmetically unstable compared to the alternatives. Rather than inverting a matrix m and post-multiplying by a vector or matrix a, as in inverse(m) * a, it is better to code this using matrix division, as in m \ a. The pre-multiplication case is similar, with b * inverse(m) being more efficiently coded as as b / m. There are also useful special cases for triangular and symmetric, positive-definite matrices that use more efficient solvers.

Warning: The function inv(m) is the elementwise inverse function, which returns 1 / m[i, j] for each element.

matrix inverse(matrix A)
The inverse of A

matrix inverse_spd(matrix A)
The inverse of A where A is symmetric, positive definite. This version is faster and more arithmetically stable when the input is symmetric and positive definite.

5.13.4.4 Eigendecomposition

vector eigenvalues_sym(matrix A)
The vector of eigenvalues of a symmetric matrix A in ascending order

matrix eigenvectors_sym(matrix A)
The matrix with the (column) eigenvectors of symmetric matrix A in the same order as returned by the function eigenvalues_sym

Because multiplying an eigenvector by \(-1\) results in an eigenvector, eigenvectors returned by a decomposition are only identified up to a sign change. In order to compare the eigenvectors produced by Stan’s eigendecomposition to others, signs may need to be normalized in some way, such as by fixing the sign of a component, or doing comparisons allowing a multiplication by \(-1\).

The condition number of a symmetric matrix is defined to be the ratio of the largest eigenvalue to the smallest eigenvalue. Large condition numbers lead to difficulty in numerical algorithms such as computing inverses, and thus known as “ill conditioned.” The ratio can even be infinite in the case of singular matrices (i.e., those with eigenvalues of 0).

5.13.4.5 QR Decomposition

matrix qr_thin_Q(matrix A)
The orthogonal matrix in the thin QR decomposition of A, which implies that the resulting matrix has the same dimensions as A

matrix qr_thin_R(matrix A)
The upper triangular matrix in the thin QR decomposition of A, which implies that the resulting matrix is square with the same number of columns as A

matrix qr_Q(matrix A)
The orthogonal matrix in the fat QR decomposition of A, which implies that the resulting matrix is square with the same number of rows as A

matrix qr_R(matrix A)
The upper trapezoidal matrix in the fat QR decomposition of A, which implies that the resulting matrix will be rectangular with the same dimensions as A

The thin QR decomposition is always preferable because it will consume much less memory when the input matrix is large than will the fat QR decomposition. Both versions of the decomposition represent the input matrix as \[ A = Q \, R. \] Multiplying a column of an orthogonal matrix by \(-1\) still results in an orthogonal matrix, and you can multiply the corresponding row of the upper trapezoidal matrix by \(-1\) without changing the product. Thus, Stan adopts the normalization that the diagonal elements of the upper trapezoidal matrix are strictly positive and the columns of the orthogonal matrix are reflected if necessary. Also, these QR decomposition algorithms do not utilize pivoting and thus may be numerically unstable on input matrices that have less than full rank.

5.13.4.6 Cholesky Decomposition

Every symmetric, positive-definite matrix (such as a correlation or covariance matrix) has a Cholesky decomposition. If \(\Sigma\) is a symmetric, positive-definite matrix, its Cholesky decomposition is the lower-triangular vector \(L\) such that \[ \Sigma = L \, L^{\top}. \]

matrix cholesky_decompose(matrix A)
The lower-triangular Cholesky factor of the symmetric positive-definite matrix A

5.13.4.7 Singular Value Decomposition

Stan only provides functions for the singular values, not for the singular vectors involved in a singular value decomposition (SVD).

vector singular_values(matrix A)
The singular values of A in descending order