10.9 Correlation Matrices
A \(K \times K\) correlation matrix \(x\) must be symmetric, so that
\[ x_{k,k'} = x_{k',k} \]
for all \(k,k' \in \{ 1, \ldots, K \}\), it must have a unit diagonal, so that
\[ x_{k,k} = 1 \]
for all \(k \in \{ 1, \ldots, K \}\), and it must be positive definite, so that for every non-zero \(K\)-vector \(a\),
\[ a^{\top} x a > 0. \]
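These three conditions are straightforward to check numerically. The sketch below (plain NumPy, not Stan's implementation; the function name is_correlation_matrix is ours) tests symmetry and the unit diagonal directly, and tests positive definiteness by attempting a Cholesky factorization, which succeeds exactly when the matrix is positive definite.

```python
import numpy as np

def is_correlation_matrix(x, tol=1e-10):
    """Check the three defining conditions of a correlation matrix."""
    x = np.asarray(x)
    if not np.allclose(x, x.T, atol=tol):           # symmetric
        return False
    if not np.allclose(np.diag(x), 1.0, atol=tol):  # unit diagonal
        return False
    try:
        np.linalg.cholesky(x)                       # succeeds iff positive definite
        return True
    except np.linalg.LinAlgError:
        return False

print(is_correlation_matrix([[1.0, 0.5], [0.5, 1.0]]))  # True
print(is_correlation_matrix([[1.0, 1.2], [1.2, 1.0]]))  # False: eigenvalues -0.2, 2.2
```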
The number of free parameters required to specify a \(K \times K\) correlation matrix is \(\binom{K}{2}\): symmetry and the unit diagonal leave only the strictly upper-triangular entries free.
There is more than one way to map from \(\binom{K}{2}\) unconstrained parameters to a \(K \times K\) correlation matrix. Stan implements the Lewandowski-Kurowicka-Joe (LKJ) transform of Lewandowski, Kurowicka, and Joe (2009).
Correlation Matrix Inverse Transform
It is easiest to specify the inverse, going from its \(\binom{K}{2}\) parameter basis to a correlation matrix; the construction proceeds in two steps. To start, suppose \(y\) is a vector containing \(\binom{K}{2}\) unconstrained values. These are first transformed via the bijective function \(\tanh : \mathbb{R} \rightarrow (-1, 1)\),
\[ \tanh x = \frac{\exp(2x) - 1}{\exp(2x) + 1}. \]
Then, define a \(K \times K\) matrix \(z\), the upper triangular values of which are filled by row with the transformed values. For example, in the \(4 \times 4\) case, there are \(\binom{4}{2} = 6\) values arranged as
\[ z = \left[ \begin{array}{cccc} 0 & \tanh y_1 & \tanh y_2 & \tanh y_3 \\ 0 & 0 & \tanh y_4 & \tanh y_5 \\ 0 & 0 & 0 & \tanh y_6 \\ 0 & 0 & 0 & 0 \end{array} \right] . \]
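To make the fill order concrete, the following NumPy sketch (the helper name fill_cpcs is ours, not Stan's) builds \(z\) from an unconstrained vector \(y\), filling the strict upper triangle row by row as in the \(4 \times 4\) layout above.

```python
import numpy as np

def fill_cpcs(y, K):
    """Fill the strict upper triangle of z row by row with tanh(y)."""
    assert len(y) == K * (K - 1) // 2
    z = np.zeros((K, K))
    pos = 0
    for i in range(K - 1):         # rows of the strict upper triangle
        for j in range(i + 1, K):  # columns to the right of the diagonal
            z[i, j] = np.tanh(y[pos])
            pos += 1
    return z

y = np.array([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
print(fill_cpcs(y, 4))  # z[0, 1] = tanh(0.3), z[0, 2] = tanh(-0.8), ...
```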
Lewandowski, Kurowicka, and Joe (2009) show how to bijectively map the array \(z\) to a correlation matrix \(x\). The entry \(z_{i,j}\) for \(i < j\) is interpreted as the canonical partial correlation (CPC) between variables \(i\) and \(j\), which is the correlation between \(i\)'s residuals and \(j\)'s residuals when both \(i\) and \(j\) are regressed on all variables \(i'\) such that \(i' < i\). In the case of \(i = 1\), there are no earlier variables, so \(z_{1,j}\) is just the Pearson correlation between variables \(1\) and \(j\).
In Stan, the LKJ transform is reformulated in terms of a Cholesky factor \(w\) of the final correlation matrix, defined for \(1 \leq i,j \leq K\) by
\[ w_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i > j, \\ 1 & \mbox{if } 1 = i = j, \\ \prod_{i'=1}^{i - 1} \left( 1 - z_{i',j}^2 \right)^{1/2} & \mbox{if } 1 < i = j, \\ z_{i,j} & \mbox{if } 1 = i < j, \mbox{ and} \\ z_{i,j} \, \prod_{i'=1}^{i-1} \left( 1 - z_{i',j}^2 \right)^{1/2} & \mbox{if } 1 < i < j. \end{array} \right. \]
This does not require as much computation per matrix entry as it may appear; the square-root factors can be accumulated one row at a time. Writing
\[ p_{i,j} = \prod_{i'=1}^{i - 1} \left( 1 - z_{i',j}^2 \right)^{1/2} \]
for the shared product, which satisfies \(p_{1,j} = 1\) and the recurrence
\[ p_{i,j} = p_{i-1,j} \left( 1 - z_{i-1,j}^2 \right)^{1/2}, \]
the definition above reduces to the more manageable expression
\[ w_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i > j, \\ p_{i,j} & \mbox{if } i = j, \mbox{ and} \\ z_{i,j} \, p_{i,j} & \mbox{if } i < j. \end{array} \right. \]
Given the upper-triangular Cholesky factor \(w\), the final correlation matrix is
\[ x = w^{\top} w. \]
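The following NumPy sketch implements this row-by-row construction, carrying the running product \(p\) in an array acc and finishing with \(x = w^{\top} w\); the function name cholesky_from_cpcs is ours, and Stan's actual C++ implementation differs in its details.

```python
import numpy as np

def cholesky_from_cpcs(z):
    """Upper-triangular Cholesky factor w of the correlation matrix, from CPCs z."""
    K = z.shape[0]
    w = np.zeros((K, K))
    acc = np.ones(K)  # acc[j] = product of (1 - z[i', j]**2) over rows i' processed so far
    w[0, 0] = 1.0
    w[0, 1:] = z[0, 1:]
    for i in range(1, K):
        acc[i:] *= 1.0 - z[i - 1, i:] ** 2
        w[i, i] = np.sqrt(acc[i])                            # diagonal: p alone
        w[i, i + 1:] = z[i, i + 1:] * np.sqrt(acc[i + 1:])   # off-diagonal: z times p
    return w

# Example: CPCs for K = 3.
z = np.zeros((3, 3))
z[0, 1], z[0, 2], z[1, 2] = 0.5, -0.3, 0.2
w = cholesky_from_cpcs(z)
x = w.T @ w
print(np.allclose(np.diag(x), 1.0))  # True: unit diagonal by construction
```

The unit diagonal falls out automatically because each column of \(w\) has unit length: the squared entries of column \(j\) telescope to one.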
Lewandowski, Kurowicka, and Joe (2009) show that the determinant of the correlation matrix can be expressed in terms of the canonical partial correlations as
\[ \mbox{det} \, x = \prod_{i=1}^{K-1} \ \prod_{j=i+1}^K \ (1 - z_{i,j}^2) = \prod_{1 \leq i < j \leq K} (1 - z_{i,j}^2). \]
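This identity is easy to check numerically; the sketch below reuses cholesky_from_cpcs from the previous sketch.

```python
import numpy as np

K = 4
rows, cols = np.triu_indices(K, k=1)  # strict upper triangle, row-major order
z = np.zeros((K, K))
z[rows, cols] = np.tanh([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
w = cholesky_from_cpcs(z)             # from the sketch above
x = w.T @ w
print(np.isclose(np.linalg.det(x), np.prod(1.0 - z[rows, cols] ** 2)))  # True
```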
Absolute Jacobian Determinant of the Correlation Matrix Inverse Transform
From the inverse of equation 11 in Lewandowski, Kurowicka, and Joe (2009), the absolute Jacobian determinant is
\[ \sqrt{\prod_{i=1}^{K-1}\prod_{j=i+1}^K \left(1-z_{i,j}^2\right)^{K-i-1}} \ \times \prod_{i=1}^{K-1}\prod_{j=i+1}^K \frac{\partial z_{i,j}}{\partial y_{i,j}}, \]
where \(\partial z_{i,j} / \partial y_{i,j} = 1 - \tanh^2 y_{i,j} = 1 - z_{i,j}^2\) is the derivative of the hyperbolic tangent applied in the first step, with \(y_{i,j}\) denoting the unconstrained value corresponding to \(z_{i,j}\).
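As a sanity check, the sketch below composes the whole inverse transform into a map from \(y\) to the \(\binom{K}{2}\) free entries of \(x\) (the helper corr_constrain is ours, and it reuses cholesky_from_cpcs from the earlier sketch), then compares the analytic expression above against a finite-difference Jacobian determinant.

```python
import numpy as np

def corr_constrain(y, K):
    """Unconstrained y -> strict upper triangle of the correlation matrix x."""
    rows, cols = np.triu_indices(K, k=1)
    z = np.zeros((K, K))
    z[rows, cols] = np.tanh(y)
    w = cholesky_from_cpcs(z)  # from the earlier sketch
    return (w.T @ w)[rows, cols]

K = 3
y = np.array([0.3, -0.8, 1.1])
rows, cols = np.triu_indices(K, k=1)
z_flat = np.tanh(y)

# Analytic value: the display above, with dz/dy = 1 - z^2 per entry.
# rows is 0-indexed, so the exponent K - i - 1 becomes K - rows - 2.
analytic = (np.sqrt(np.prod((1.0 - z_flat ** 2) ** (K - rows - 2)))
            * np.prod(1.0 - z_flat ** 2))

# Numerical Jacobian by central finite differences.
m = len(y)
J = np.zeros((m, m))
for k in range(m):
    e = np.zeros(m)
    e[k] = 1e-6
    J[:, k] = (corr_constrain(y + e, K) - corr_constrain(y - e, K)) / 2e-6

print(np.isclose(analytic, abs(np.linalg.det(J)), rtol=1e-5))  # True
```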
Correlation Matrix Transform
The correlation transform is defined by reversing the steps of the inverse transform defined in the previous section.
Starting with a correlation matrix \(x\), the first step is to find the unique upper triangular \(w\) such that \(x = w^{\top} w\). Because \(x\) is positive definite, this can be done by applying the Cholesky decomposition,
\[ w = \mbox{chol}(x). \]
The next step from the Cholesky factor \(w\) back to the array \(z\) of canonical partial correlations (CPCs) is simplified by the ordering of the elements in the definition of \(w\), which when inverted yields
\[ z_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i \geq j, \\ w_{i,j} & \mbox{if } 1 = i < j, \mbox{ and} \\ w_{i,j} \, \prod_{i'=1}^{i-1} \left( 1 - z_{i',j}^2 \right)^{-1/2} & \mbox{if } 1 < i < j. \end{array} \right. \]
The final stage of the transform reverses the hyperbolic tangent transform, which is defined by
\[ \tanh^{-1} v = \frac{1}{2} \log \left( \frac{1 + v}{1 - v} \right). \]
The inverse hyperbolic tangent function, \(\tanh^{-1}\), is also called the Fisher transformation.
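Putting the three stages together, the following sketch (the helper corr_unconstrain is our name; cholesky_from_cpcs comes from the earlier sketch) implements the forward transform and confirms that it inverts the earlier construction.

```python
import numpy as np

def corr_unconstrain(x):
    """Correlation matrix x -> unconstrained vector y."""
    K = x.shape[0]
    w = np.linalg.cholesky(x).T  # upper-triangular factor, so that x = w.T @ w
    z = np.zeros((K, K))
    acc = np.ones(K)             # running product of (1 - z[i', j]**2)
    for i in range(K - 1):
        z[i, i + 1:] = w[i, i + 1:] / np.sqrt(acc[i + 1:])
        acc[i + 1:] *= 1.0 - z[i, i + 1:] ** 2
    rows, cols = np.triu_indices(K, k=1)
    return np.arctanh(z[rows, cols])  # the Fisher transformation

# Round trip: y -> x via the inverse transform, then x -> y.
K = 4
y = np.array([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
rows, cols = np.triu_indices(K, k=1)
z = np.zeros((K, K))
z[rows, cols] = np.tanh(y)
w = cholesky_from_cpcs(z)  # from the earlier sketch
print(np.allclose(corr_unconstrain(w.T @ w), y))  # True
```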