10.11 Cholesky Factors of Covariance Matrices


An $M \times M$ covariance matrix $\Sigma$ can be Cholesky factored into a lower triangular matrix $L$ such that $L\,L^{\top} = \Sigma$. If $\Sigma$ is positive definite, then $L$ will be $M \times M$. If $\Sigma$ is only positive semi-definite, then $L$ will be $M \times N$, with $N < M$.

A matrix is a Cholesky factor for a covariance matrix if and only if it is lower triangular, the diagonal entries are positive, and $M \geq N$. A matrix satisfying these conditions ensures that $L \, L^{\top}$ is positive semi-definite if $M > N$ and positive definite if $M = N$.
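The three conditions above can be checked mechanically. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of rows; the names `is_cholesky_factor` and `times_own_transpose` are hypothetical helpers for illustration and are not part of Stan.

```python
def is_cholesky_factor(L):
    """Check the three conditions: M >= N, zeros above the diagonal,
    and strictly positive diagonal entries. (Hypothetical helper.)"""
    M, N = len(L), len(L[0])
    if M < N:
        return False
    for m in range(M):
        for n in range(N):
            if m < n and L[m][n] != 0:
                return False  # above-diagonal entry must be zero
            if m == n and not L[m][n] > 0:
                return False  # diagonal entry must be positive
    return True


def times_own_transpose(L):
    """Return the M x M product L * L^T as a list of rows."""
    M = len(L)
    return [[sum(a * b for a, b in zip(L[i], L[j])) for j in range(M)]
            for i in range(M)]


L = [[1.0, 0.0],
     [2.0, 3.0]]
print(is_cholesky_factor(L))
print(times_own_transpose(L))
```

For the square example shown, the product is the symmetric matrix with rows `[1, 2]` and `[2, 13]`.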

A Cholesky factor of a covariance matrix requires $N + \binom{N}{2} + (M - N)N$ unconstrained parameters.
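As a quick check of this count, the sketch below adds up the three terms for given dimensions. The function name `num_unconstrained` is a hypothetical name chosen for illustration.

```python
from math import comb


def num_unconstrained(M: int, N: int) -> int:
    """Count the free parameters of an M x N Cholesky factor (M >= N)."""
    if M < N:
        raise ValueError("a Cholesky factor needs M >= N")
    diagonal = N                 # N positive diagonal entries
    strictly_lower = comb(N, 2)  # below-diagonal entries in the top N x N block
    extra_rows = (M - N) * N     # the M - N full rows below that block
    return diagonal + strictly_lower + extra_rows


print(num_unconstrained(3, 3))  # 3 + 3 + 0 = 6
print(num_unconstrained(4, 3))  # 3 + 3 + 3 = 9
```

The total equals the number of entries on or below the diagonal of an $M \times N$ matrix.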

Cholesky Factor of Covariance Matrix Transform

Stan’s Cholesky factor transform only requires the first step of the covariance matrix transform, namely log transforming the positive diagonal elements. Suppose $x$ is an $M \times N$ Cholesky factor. The above-diagonal entries are zero, the diagonal entries are positive, and the below-diagonal entries are unconstrained. The transform required is thus

$y_{m,n} = \left\{ \begin{array}{cl} 0 & \mbox{if } m < n, \\ \log x_{m,m} & \mbox{if } m = n, \mbox{ and} \\ x_{m,n} & \mbox{if } m > n. \end{array} \right.$
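The transform above could be sketched in plain Python as follows. This is an illustration of the formula only, not Stan's implementation; the name `cholesky_factor_free` and the list-of-rows matrix layout are assumptions.

```python
import math


def cholesky_factor_free(x):
    """Map an M x N Cholesky factor x to its unconstrained form y.

    Above-diagonal entries stay zero, diagonal entries are log
    transformed, and below-diagonal entries are copied unchanged.
    """
    M, N = len(x), len(x[0])
    y = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            if m == n:
                y[m][n] = math.log(x[m][m])  # positive diagonal -> real line
            elif m > n:
                y[m][n] = x[m][n]            # already unconstrained
    return y


x = [[2.0, 0.0],
     [0.5, 1.0],
     [-3.0, 4.0]]
y = cholesky_factor_free(x)
```

Only the entries on or below the diagonal of `y` carry information; Stan stores them as a vector, whereas this sketch keeps the matrix shape to match the formula.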

Cholesky Factor of Covariance Matrix Inverse Transform

The inverse transform need only invert the logarithm with an exponentiation. If $y$ is the unconstrained matrix representation, then the elements of the constrained matrix $x$ are defined by

$x_{m,n} = \left\{ \begin{array}{cl} 0 & \mbox{if } m < n, \\ \exp(y_{m,m}) & \mbox{if } m = n, \mbox{ and} \\ y_{m,n} & \mbox{if } m > n. \end{array} \right.$
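A matching sketch of the inverse transform, under the same assumptions (plain Python, matrices as lists of rows, and a hypothetical function name `cholesky_factor_constrain`):

```python
import math


def cholesky_factor_constrain(y):
    """Map an unconstrained M x N matrix y back to a Cholesky factor x.

    Diagonal entries are exponentiated so they are positive,
    below-diagonal entries are copied, and above-diagonal entries are zero.
    """
    M, N = len(y), len(y[0])
    x = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            if m == n:
                x[m][n] = math.exp(y[m][m])  # real line -> positive diagonal
            elif m > n:
                x[m][n] = y[m][n]
    return x


y = [[0.0, 0.0],
     [0.5, math.log(3.0)]]
x = cholesky_factor_constrain(y)
```

Any values placed above the diagonal of `y` are ignored, so every real-valued input yields a valid Cholesky factor.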

Absolute Jacobian Determinant of Cholesky Factor Inverse Transform

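The inverse transform maps each element of $y$ to the corresponding element of $x$, so its Jacobian matrix is diagonal: the entries for the below-diagonal elements are 1 and the entries for the diagonal elements are $\exp(y_{n,n})$. The absolute determinant is therefore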

$\prod_{n=1}^N \frac{\partial}{\partial y_{n,n}} \, \exp(y_{n,n}) \ = \ \prod_{n=1}^N \exp(y_{n,n}) \ = \ \prod_{n=1}^N x_{n,n}.$

Let $x = f^{-1}(y)$ be the inverse transform, defined in the previous section, from an $N + \binom{N}{2} + (M - N)N$ vector $y$ to an $M \times N$ Cholesky factor $x$ of a covariance matrix. A density function $p_X(x)$ defined on $M \times N$ Cholesky factors of covariance matrices is transformed to the density $p_Y(y)$ over $N + \binom{N}{2} + (M - N)N$ vectors $y$ by

$p_Y(y) = p_X(f^{-1}(y)) \prod_{n=1}^N x_{n,n}.$
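On the log scale the Jacobian term is simply the sum of the unconstrained diagonal entries, since $\log \prod_{n=1}^N \exp(y_{n,n}) = \sum_{n=1}^N y_{n,n}$. The sketch below illustrates the change of variables in plain Python. All function names are hypothetical, and `log_p_x` stands for any user-supplied log density on Cholesky factors; it is an assumption made for illustration, not a Stan function.

```python
import math


def constrain(y):
    """Inverse transform: exponentiate the diagonal, zero the upper part."""
    M, N = len(y), len(y[0])
    return [[math.exp(y[m][n]) if m == n else (y[m][n] if m > n else 0.0)
             for n in range(N)] for m in range(M)]


def log_abs_det_jacobian(y):
    """log prod_n exp(y[n][n]) = sum_n y[n][n], over the N diagonal entries."""
    N = len(y[0])
    return sum(y[n][n] for n in range(N))


def log_density_unconstrained(y, log_p_x):
    """log p_Y(y) = log p_X(f^{-1}(y)) + sum_n y[n][n].

    log_p_x is a hypothetical callable returning log p_X(x).
    """
    return log_p_x(constrain(y)) + log_abs_det_jacobian(y)


y = [[math.log(2.0), 0.0],
     [0.3, math.log(5.0)],
     [1.0, -1.0]]

# With a flat (improper) density on x, only the Jacobian term remains,
# which here is log(2 * 5) = log(10).
flat = lambda x: 0.0
print(log_density_unconstrained(y, flat))
```

Working with the sum of $y_{n,n}$ rather than the product of $x_{n,n}$ avoids overflow and underflow when the diagonal entries are very large or very small.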