## 6.14 Linear algebra functions and solvers

### 6.14.1 Matrix division operators and functions

In general, it is much more efficient and more arithmetically stable to use matrix division than to multiply by an inverse. There are specialized forms for lower-triangular matrices and for symmetric, positive-definite matrices.

#### 6.14.1.1 Matrix division operators

`row_vector operator/(row_vector b, matrix A)`

The right division of b by A; equivalently `b * inverse(A)`.
*Available since 2.0*

`matrix operator/(matrix B, matrix A)`

The right division of B by A; equivalently `B * inverse(A)`.
*Available since 2.5*

`vector operator\(matrix A, vector b)`

The left division of b by A; equivalently `inverse(A) * b`.
*Available since 2.18*

`matrix operator\(matrix A, matrix B)`

The left division of B by A; equivalently `inverse(A) * B`.
*Available since 2.18*
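As a usage sketch (the data names are illustrative), the division operators solve linear systems without ever forming an inverse:

```stan
data {
  matrix[3, 3] A;
  vector[3] b;
}
transformed data {
  // Solve A * x = b; preferred over inverse(A) * b
  vector[3] x = A \ b;
  // Solve y * A = b'; preferred over b' * inverse(A)
  row_vector[3] y = b' / A;
}
```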

#### 6.14.1.2 Lower-triangular matrix division functions

There are four division functions which use lower-triangular views of a matrix. The lower-triangular view \(\text{tri}(A)\) used in the definitions below is defined by \[ \text{tri}(A)[m,n] = \left\{ \begin{array}{ll} A[m,n] & \text{if } m \geq n, \text{ and} \\[4pt] 0 & \text{otherwise}. \end{array} \right. \] When a lower-triangular view of a matrix is used, the elements above the diagonal are ignored.

`vector mdivide_left_tri_low(matrix A, vector b)`

The left division of b by a lower-triangular view of A; algebraically equivalent to the less efficient and less stable form `inverse(tri(A)) * b`, where `tri(A)` is the lower-triangular portion of A with the above-diagonal entries set to zero.
*Available since 2.12*

`matrix mdivide_left_tri_low(matrix A, matrix B)`

The left division of B by a lower-triangular view of A; algebraically equivalent to the less efficient and less stable form `inverse(tri(A)) * B`, where `tri(A)` is the lower-triangular portion of A with the above-diagonal entries set to zero.
*Available since 2.5*

`row_vector mdivide_right_tri_low(row_vector b, matrix A)`

The right division of b by a lower-triangular view of A; algebraically equivalent to the less efficient and less stable form `b * inverse(tri(A))`, where `tri(A)` is the lower-triangular portion of A with the above-diagonal entries set to zero.
*Available since 2.12*

`matrix mdivide_right_tri_low(matrix B, matrix A)`

The right division of B by a lower-triangular view of A; algebraically equivalent to the less efficient and less stable form `B * inverse(tri(A))`, where `tri(A)` is the lower-triangular portion of A with the above-diagonal entries set to zero.
*Available since 2.5*
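A common use of the triangular solvers is back-substitution with a Cholesky factor, which is lower triangular by construction; a sketch (names illustrative):

```stan
data {
  int<lower=1> N;
  matrix[N, N] Sigma;   // assumed symmetric positive definite
  vector[N] y;
}
transformed data {
  matrix[N, N] L = cholesky_decompose(Sigma);
  // Solve L * z = y using only the lower triangle of L;
  // cheaper and more stable than inverse(tri(L)) * y
  vector[N] z = mdivide_left_tri_low(L, y);
}
```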

### 6.14.2 Symmetric positive-definite matrix division functions

There are four division functions which are specialized for efficiency and stability when the matrix divisor is symmetric and positive definite. If the matrix divisor argument is not symmetric and positive definite, these functions will reject and print warnings.

`vector mdivide_left_spd(matrix A, vector b)`

The left division of b by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and less stable form `inverse(A) * b`.
*Available since 2.12*

`matrix mdivide_left_spd(matrix A, matrix B)`

The left division of B by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and less stable form `inverse(A) * B`.
*Available since 2.12*

`row_vector mdivide_right_spd(row_vector b, matrix A)`

The right division of b by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and less stable form `b * inverse(A)`.
*Available since 2.12*

`matrix mdivide_right_spd(matrix B, matrix A)`

The right division of B by the symmetric, positive-definite matrix A; algebraically equivalent to the less efficient and less stable form `B * inverse(A)`.
*Available since 2.12*
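The SPD solvers are a natural fit for quadratic forms involving a covariance matrix; a sketch (names illustrative):

```stan
data {
  int<lower=1> N;
  matrix[N, N] Sigma;   // must be symmetric positive definite
  vector[N] y;
  vector[N] mu;
}
transformed data {
  // Sigma^{-1} * (y - mu) via the specialized SPD solver
  vector[N] alpha = mdivide_left_spd(Sigma, y - mu);
  // Quadratic form (y - mu)' * Sigma^{-1} * (y - mu)
  real q = dot_product(y - mu, alpha);
}
```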

### 6.14.3 Matrix exponential

The exponential of the matrix \(A\) is formally defined by the convergent power series: \[ e^A = \sum_{n=0}^{\infty} \dfrac{A^n}{n!} \]

`matrix matrix_exp(matrix A)`

The matrix exponential of A.
*Available since 2.13*

`matrix matrix_exp_multiply(matrix A, matrix B)`

The product of the matrix exponential of A and the matrix B; algebraically equivalent to the less efficient form `matrix_exp(A) * B`.
*Available since 2.18*

`matrix scale_matrix_exp_multiply(real t, matrix A, matrix B)`

The product of the matrix exponential of tA and the matrix B; algebraically equivalent to the less efficient form `matrix_exp(t * A) * B`.
*Available since 2.18*
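These functions are useful for linear systems of ODEs, \(x'(t) = A\,x(t)\), whose solution is \(x(t) = e^{tA}\,x(0)\); a sketch (names illustrative):

```stan
data {
  matrix[2, 2] A;   // system matrix of x'(t) = A * x(t)
  vector[2] x0;     // initial state
  real t;
}
transformed data {
  // State at time t: e^{tA} * x0, without explicitly forming matrix_exp(t * A).
  // The third argument must be a matrix, so x0 is promoted to a 2 x 1 matrix.
  matrix[2, 1] xt = scale_matrix_exp_multiply(t, A, to_matrix(x0, 2, 1));
}
```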

### 6.14.4 Matrix power

Returns the nth power of a matrix: \[ M^n = \underbrace{M \, M \cdots M}_{n \text{ factors}} \]

`matrix matrix_power(matrix A, int B)`

Matrix A raised to the power B.
*Available since 2.24*
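One common use is computing n-step transition probabilities of a discrete Markov chain; a sketch (names illustrative):

```stan
data {
  int<lower=0> n;
  matrix[3, 3] P;   // assumed row-stochastic transition matrix
}
transformed data {
  // n-step transition probabilities: P^n
  matrix[3, 3] P_n = matrix_power(P, n);
}
```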

### 6.14.5 Linear algebra functions

#### 6.14.5.1 Trace

`real trace(matrix A)`

The trace of A, or 0 if A is empty; A is not required to be diagonal.
*Available since 2.0*

#### 6.14.5.2 Determinants

`real determinant(matrix A)`

The determinant of A.
*Available since 2.0*

`real log_determinant(matrix A)`

The log of the absolute value of the determinant of A.
*Available since 2.0*

#### 6.14.5.3 Inverses

It is almost never a good idea to use matrix inverses directly because they are both inefficient and arithmetically unstable compared to the alternatives. Rather than inverting a matrix `m` and post-multiplying by a vector or matrix `a`, as in `inverse(m) * a`, it is better to code this using matrix division, as in `m \ a`. The pre-multiplication case is similar, with `b * inverse(m)` being more efficiently coded as `b / m`. There are also useful special cases for triangular and symmetric, positive-definite matrices that use more efficient solvers.
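When the inverse itself is the quantity of interest (for example, reporting a precision matrix), computing it once is reasonable; a sketch (names illustrative):

```stan
data {
  int<lower=1> K;
  matrix[K, K] Sigma;   // assumed symmetric positive-definite covariance
}
generated quantities {
  // Report the precision matrix; inverse_spd exploits symmetry
  // and positive definiteness
  matrix[K, K] Omega = inverse_spd(Sigma);
}
```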

*Warning:* The function `inv(m)` is the elementwise inverse function, which returns `1 / m[i, j]` for each element.

`matrix inverse(matrix A)`

Compute the inverse of A.
*Available since 2.0*

`matrix inverse_spd(matrix A)`

Compute the inverse of A where A is symmetric, positive definite. This version is faster and more arithmetically stable when the input is symmetric and positive definite.
*Available since 2.0*

`matrix chol2inv(matrix L)`

Compute the inverse of the matrix whose Cholesky factorization is L. That is, for \(A = L L^T\), return \(A^{-1}\).
*Available since 2.26*

#### 6.14.5.4 Generalized Inverse

The generalized inverse \(M^+\) of a matrix \(M\) is a matrix that satisfies \(M M^+ M = M\). For an invertible, square matrix \(M\), \(M^+\) is equivalent to \(M^{-1}\). The dimensions of \(M^+\) are those of \(M^T\). The generalized inverse exists for any matrix, so \(M\) may be singular or less than full rank.

Even though the generalized inverse exists for any matrix, the derivatives of this function only exist for matrices of locally constant rank (Golub and Pereyra 1973); that is, the derivatives do not exist if small perturbations to the matrix change its rank. For example, consider the rank of the matrix \(A\) as a function of \(\epsilon\):

\[ A = \left( \begin{array}{ccc} 1 + \epsilon & 2 & 1 \\ 2 & 4 & 2 \end{array} \right) \]

When \(\epsilon = 0\), \(A\) is rank 1 because the second row is twice the first (and so there is only one linearly independent row). If \(\epsilon \neq 0\), the rows are no longer linearly dependent, and the matrix is rank 2. This matrix does not have locally constant rank at \(\epsilon = 0\), and so the derivatives do not exist at zero. Because HMC depends on the derivatives existing, this lack of differentiability creates undefined behavior.

`matrix generalized_inverse(matrix A)`

The generalized inverse of A.
*Available since 2.26*
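One application is a minimum-norm least-squares solution to \(X \beta = y\), which exists even when \(X\) is rank deficient; a sketch (names illustrative, and note the differentiability caveat above):

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;   // may be rank deficient
  vector[N] y;
}
transformed data {
  // Minimum-norm least-squares solution to X * beta = y
  vector[K] beta_ls = generalized_inverse(X) * y;
}
```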

#### 6.14.5.5 Eigendecomposition

`vector eigenvalues_sym(matrix A)`

The vector of eigenvalues of a symmetric matrix A in ascending order.
*Available since 2.0*

`matrix eigenvectors_sym(matrix A)`

The matrix with the (column) eigenvectors of symmetric matrix A in the same order as returned by the function `eigenvalues_sym`.
*Available since 2.0*

Because multiplying an eigenvector by \(-1\) results in an eigenvector, eigenvectors returned by a decomposition are only identified up to a sign change. In order to compare the eigenvectors produced by Stan’s eigendecomposition to others, signs may need to be normalized in some way, such as by fixing the sign of a component, or doing comparisons allowing a multiplication by \(-1\).

The condition number of a symmetric matrix is defined to be the ratio of the largest eigenvalue to the smallest eigenvalue. Large condition numbers cause difficulty in numerical algorithms such as computing inverses, and such matrices are known as "ill conditioned." The ratio can even be infinite in the case of singular matrices (i.e., those with eigenvalues of 0).
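Since the eigenvalues are returned in ascending order, the condition number as defined above can be computed directly; a sketch (names illustrative):

```stan
data {
  int<lower=1> K;
  matrix[K, K] A;   // assumed symmetric
}
transformed data {
  vector[K] lambda = eigenvalues_sym(A);   // ascending order
  // Condition number: largest eigenvalue over smallest
  real kappa = lambda[K] / lambda[1];
}
```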

#### 6.14.5.6 QR decomposition

`matrix qr_thin_Q(matrix A)`

The orthogonal matrix in the thin QR decomposition of A, which implies that the resulting matrix has the same dimensions as A.
*Available since 2.18*

`matrix qr_thin_R(matrix A)`

The upper triangular matrix in the thin QR decomposition of A, which implies that the resulting matrix is square with the same number of columns as A.
*Available since 2.18*

`matrix qr_Q(matrix A)`

The orthogonal matrix in the fat QR decomposition of A, which implies that the resulting matrix is square with the same number of rows as A.
*Available since 2.3*

`matrix qr_R(matrix A)`

The upper trapezoidal matrix in the fat QR decomposition of A, which implies that the resulting matrix will be rectangular with the same dimensions as A.
*Available since 2.3*

The thin QR decomposition is always preferable because it consumes much less memory than the fat QR decomposition when the input matrix is large. Both versions of the decomposition represent the input matrix as \[ A = Q \, R. \] Multiplying a column of an orthogonal matrix by \(-1\) still results in an orthogonal matrix, and the corresponding row of the upper trapezoidal matrix can be multiplied by \(-1\) without changing the product. Thus, Stan adopts the normalization that the diagonal elements of the upper trapezoidal matrix are strictly positive, and the columns of the orthogonal matrix are reflected as necessary. Also, these QR decomposition algorithms do not use pivoting and thus may be numerically unstable on input matrices that have less than full rank.
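A standard application is the QR reparameterization of a linear regression, in which the sampler works on the better-conditioned `Q_ast` scale; a sketch (scaling by \(\sqrt{N-1}\) is a conventional choice, and the names are illustrative):

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] x;
  vector[N] y;
}
transformed data {
  // Scale Q and R so Q_ast's columns are on a unit-like scale
  matrix[N, K] Q_ast = qr_thin_Q(x) * sqrt(N - 1);
  matrix[K, K] R_ast = qr_thin_R(x) / sqrt(N - 1);
  matrix[K, K] R_ast_inverse = inverse(R_ast);
}
parameters {
  real alpha;
  vector[K] theta;        // coefficients on Q_ast
  real<lower=0> sigma;
}
model {
  y ~ normal(Q_ast * theta + alpha, sigma);
}
generated quantities {
  // Recover coefficients on the original predictors x
  vector[K] beta = R_ast_inverse * theta;
}
```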

#### 6.14.5.7 Cholesky decomposition

Every symmetric, positive-definite matrix (such as a correlation or covariance matrix) has a Cholesky decomposition. If \(\Sigma\) is a symmetric, positive-definite matrix, its Cholesky decomposition is the lower-triangular matrix \(L\) such that \[ \Sigma = L \, L^{\top}. \]

`matrix cholesky_decompose(matrix A)`

The lower-triangular Cholesky factor of the symmetric positive-definite matrix A.
*Available since 2.0*
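A typical use is factoring a covariance matrix once and passing the factor to the Cholesky-parameterized multivariate normal; a sketch (names illustrative):

```stan
data {
  int<lower=1> N;
  vector[N] y;
  vector[N] mu;
  matrix[N, N] Sigma;   // assumed symmetric positive definite
}
transformed data {
  // Factor once; the _cholesky distribution avoids refactoring Sigma
  matrix[N, N] L = cholesky_decompose(Sigma);
}
model {
  y ~ multi_normal_cholesky(mu, L);
}
```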

#### 6.14.5.8 Singular value decomposition

The matrix A can be decomposed into a diagonal matrix of singular values, D, and matrices of its left and right singular vectors, U and V: \[ A = U D V^T. \] The matrices of singular vectors here are thin. That is, for an \(N\) by \(P\) input A with \(M = \min(N, P)\), U is size \(N\) by \(M\) and V is size \(P\) by \(M\).

`vector singular_values(matrix A)`

The singular values of A in descending order.
*Available since 2.0*

`matrix svd_U(matrix A)`

The left-singular vectors of A.
*Available since 2.26*

`matrix svd_V(matrix A)`

The right-singular vectors of A.
*Available since 2.26*
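The three functions fit together as in the identity above; a sketch reconstructing A from its thin SVD (names illustrative):

```stan
data {
  int<lower=1> N;
  int<lower=1> P;
  matrix[N, P] A;
}
transformed data {
  // A = U * diag(d) * V', with U of size N x M, V of size P x M,
  // and d of length M = min(N, P)
  matrix[N, P] A_reconstructed =
      svd_U(A) * diag_matrix(singular_values(A)) * svd_V(A)';
}
```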

*References*

Golub, G. H., and V. Pereyra. 1973. "The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate." *SIAM Journal on Numerical Analysis* 10 (2): 413–32. https://doi.org/10.1137/0710036.