10.9 Correlation Matrices
A \(K \times K\) correlation matrix \(x\) must be symmetric, so that
\[ x_{k,k'} = x_{k',k} \]
for all \(k,k' \in \{ 1, \ldots, K \}\), it must have a unit diagonal, so that
\[ x_{k,k} = 1 \]
for all \(k \in \{ 1, \ldots, K \}\), and it must be positive definite, so that for every non-zero \(K\)-vector \(a\),
\[ a^{\top} x a > 0. \]
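These three conditions are straightforward to check numerically. The sketch below (plain NumPy, not Stan's implementation; the function name is_correlation_matrix is ours) tests symmetry and the unit diagonal directly, and tests positive definiteness by attempting a Cholesky factorization, which succeeds exactly when the matrix is positive definite.

```python
import numpy as np

def is_correlation_matrix(x, tol=1e-10):
    """Check the three defining conditions of a correlation matrix."""
    x = np.asarray(x)
    if not np.allclose(x, x.T, atol=tol):           # symmetric
        return False
    if not np.allclose(np.diag(x), 1.0, atol=tol):  # unit diagonal
        return False
    try:
        np.linalg.cholesky(x)                       # succeeds iff positive definite
        return True
    except np.linalg.LinAlgError:
        return False

print(is_correlation_matrix([[1.0, 0.5], [0.5, 1.0]]))  # True
print(is_correlation_matrix([[1.0, 1.2], [1.2, 1.0]]))  # False: eigenvalues -0.2, 2.2
```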
The number of free parameters required to specify a \(K \times K\) correlation matrix is \(\binom{K}{2}\): symmetry and the unit diagonal leave only the strictly upper-triangular entries free.
There is more than one way to map from \(\binom{K}{2}\) unconstrained parameters to a \(K \times K\) correlation matrix. Stan implements the Lewandowski-Kurowicka-Joe (LKJ) transform of Lewandowski, Kurowicka, and Joe (2009).
Correlation Matrix Inverse Transform
It is easiest to specify the inverse, going from its \(\binom{K}{2}\) parameter basis to a correlation matrix; the construction proceeds in two steps. To start, suppose \(y\) is a vector containing \(\binom{K}{2}\) unconstrained values. These are first transformed via the bijective function \(\tanh : \mathbb{R} \rightarrow (-1, 1)\),
\[ \tanh x = \frac{\exp(2x) - 1}{\exp(2x) + 1}. \]
Then, define a \(K \times K\) matrix \(z\), the upper triangular values of which are filled by row with the transformed values. For example, in the \(4 \times 4\) case, there are \(\binom{4}{2} = 6\) values arranged as
\[ z = \left[ \begin{array}{cccc} 0 & \tanh y_1 & \tanh y_2 & \tanh y_3 \\ 0 & 0 & \tanh y_4 & \tanh y_5 \\ 0 & 0 & 0 & \tanh y_6 \\ 0 & 0 & 0 & 0 \end{array} \right] . \]
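To make the fill order concrete, the following NumPy sketch (the helper name fill_cpcs is ours, not Stan's) builds \(z\) from an unconstrained vector \(y\), filling the strict upper triangle row by row as in the \(4 \times 4\) layout above.

```python
import numpy as np

def fill_cpcs(y, K):
    """Fill the strict upper triangle of z row by row with tanh(y)."""
    assert len(y) == K * (K - 1) // 2
    z = np.zeros((K, K))
    pos = 0
    for i in range(K - 1):         # rows of the strict upper triangle
        for j in range(i + 1, K):  # columns to the right of the diagonal
            z[i, j] = np.tanh(y[pos])
            pos += 1
    return z

y = np.array([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
print(fill_cpcs(y, 4))  # z[0, 1] = tanh(0.3), z[0, 2] = tanh(-0.8), ...
```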
Lewandowski, Kurowicka, and Joe (2009) show how to bijectively map the array \(z\) to a correlation matrix \(x\). The entry \(z_{i,j}\) for \(i < j\) is interpreted as the canonical partial correlation (CPC) between variables \(i\) and \(j\), which is the correlation between \(i\)'s residuals and \(j\)'s residuals when both \(i\) and \(j\) are regressed on all variables \(i'\) such that \(i' < i\). In the case of \(i = 1\), there are no earlier variables, so \(z_{1,j}\) is just the Pearson correlation between variables \(1\) and \(j\).
In Stan, the LKJ transform is reformulated in terms of a Cholesky factor \(w\) of the final correlation matrix, defined for \(1 \leq i,j \leq K\) by
\[ w_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i > j, \\ 1 & \mbox{if } 1 = i = j, \\ \prod_{i'=1}^{i - 1} \left( 1 - z_{i',j}^2 \right)^{1/2} & \mbox{if } 1 < i = j, \\ z_{i,j} & \mbox{if } 1 = i < j, \mbox{ and} \\ z_{i,j} \, \prod_{i'=1}^{i-1} \left( 1 - z_{i',j}^2 \right)^{1/2} & \mbox{if } 1 < i < j. \end{array} \right. \]
This does not require as much computation per matrix entry as it may appear; the square-root factors can be accumulated one row at a time. Writing
\[ p_{i,j} = \prod_{i'=1}^{i - 1} \left( 1 - z_{i',j}^2 \right)^{1/2} \]
for the shared product, which satisfies \(p_{1,j} = 1\) and the recurrence
\[ p_{i,j} = p_{i-1,j} \left( 1 - z_{i-1,j}^2 \right)^{1/2}, \]
the definition above reduces to the more manageable expression
\[ w_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i > j, \\ p_{i,j} & \mbox{if } i = j, \mbox{ and} \\ z_{i,j} \, p_{i,j} & \mbox{if } i < j. \end{array} \right. \]
Given the upper-triangular Cholesky factor \(w\), the final correlation matrix is
\[ x = w^{\top} w. \]
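The following NumPy sketch implements this row-by-row construction, carrying the running product \(p\) in an array acc and finishing with \(x = w^{\top} w\); the function name cholesky_from_cpcs is ours, and Stan's actual C++ implementation differs in its details.

```python
import numpy as np

def cholesky_from_cpcs(z):
    """Upper-triangular Cholesky factor w of the correlation matrix, from CPCs z."""
    K = z.shape[0]
    w = np.zeros((K, K))
    acc = np.ones(K)  # acc[j] = product of (1 - z[i', j]**2) over rows i' processed so far
    w[0, 0] = 1.0
    w[0, 1:] = z[0, 1:]
    for i in range(1, K):
        acc[i:] *= 1.0 - z[i - 1, i:] ** 2
        w[i, i] = np.sqrt(acc[i])                            # diagonal: p alone
        w[i, i + 1:] = z[i, i + 1:] * np.sqrt(acc[i + 1:])   # off-diagonal: z times p
    return w

# Example: CPCs for K = 3.
z = np.zeros((3, 3))
z[0, 1], z[0, 2], z[1, 2] = 0.5, -0.3, 0.2
w = cholesky_from_cpcs(z)
x = w.T @ w
print(np.allclose(np.diag(x), 1.0))  # True: unit diagonal by construction
```

The unit diagonal falls out automatically because each column of \(w\) has unit length: the squared entries of column \(j\) telescope to one.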
Lewandowski, Kurowicka, and Joe (2009) show that the determinant of the correlation matrix can be expressed in terms of the canonical partial correlations as
\[ \mbox{det} \, x = \prod_{i=1}^{K-1} \ \prod_{j=i+1}^K \ (1 - z_{i,j}^2) = \prod_{1 \leq i < j \leq K} (1 - z_{i,j}^2). \]
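This identity is easy to check numerically; the sketch below reuses cholesky_from_cpcs from the previous sketch.

```python
import numpy as np

K = 4
rows, cols = np.triu_indices(K, k=1)  # strict upper triangle, row-major order
z = np.zeros((K, K))
z[rows, cols] = np.tanh([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
w = cholesky_from_cpcs(z)             # from the sketch above
x = w.T @ w
print(np.isclose(np.linalg.det(x), np.prod(1.0 - z[rows, cols] ** 2)))  # True
```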
Absolute Jacobian Determinant of the Correlation Matrix Inverse Transform
From the inverse of equation 11 in Lewandowski, Kurowicka, and Joe (2009), the absolute Jacobian determinant is
\[ \sqrt{\prod_{i=1}^{K-1}\prod_{j=i+1}^K \left(1-z_{i,j}^2\right)^{K-i-1}} \ \times \prod_{i=1}^{K-1}\prod_{j=i+1}^K \frac{\partial z_{i,j}}{\partial y_{i,j}}, \]
where \(\partial z_{i,j} / \partial y_{i,j} = 1 - \tanh^2 y_{i,j} = 1 - z_{i,j}^2\) is the derivative of the hyperbolic tangent applied in the first step, with \(y_{i,j}\) denoting the unconstrained value corresponding to \(z_{i,j}\).
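As a sanity check, the sketch below composes the whole inverse transform into a map from \(y\) to the \(\binom{K}{2}\) free entries of \(x\) (the helper corr_constrain is ours, and it reuses cholesky_from_cpcs from the earlier sketch), then compares the analytic expression above against a finite-difference Jacobian determinant.

```python
import numpy as np

def corr_constrain(y, K):
    """Unconstrained y -> strict upper triangle of the correlation matrix x."""
    rows, cols = np.triu_indices(K, k=1)
    z = np.zeros((K, K))
    z[rows, cols] = np.tanh(y)
    w = cholesky_from_cpcs(z)  # from the earlier sketch
    return (w.T @ w)[rows, cols]

K = 3
y = np.array([0.3, -0.8, 1.1])
rows, cols = np.triu_indices(K, k=1)
z_flat = np.tanh(y)

# Analytic value: the display above, with dz/dy = 1 - z^2 per entry.
# rows is 0-indexed, so the exponent K - i - 1 becomes K - rows - 2.
analytic = (np.sqrt(np.prod((1.0 - z_flat ** 2) ** (K - rows - 2)))
            * np.prod(1.0 - z_flat ** 2))

# Numerical Jacobian by central finite differences.
m = len(y)
J = np.zeros((m, m))
for k in range(m):
    e = np.zeros(m)
    e[k] = 1e-6
    J[:, k] = (corr_constrain(y + e, K) - corr_constrain(y - e, K)) / 2e-6

print(np.isclose(analytic, abs(np.linalg.det(J)), rtol=1e-5))  # True
```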
Correlation Matrix Transform
The correlation transform is defined by reversing the steps of the inverse transform defined in the previous section.
Starting with a correlation matrix \(x\), the first step is to find the unique upper triangular \(w\) such that \(x = w^{\top} w\). Because \(x\) is positive definite, this can be done by applying the Cholesky decomposition,
\[ w = \mbox{chol}(x). \]
The next step from the Cholesky factor \(w\) back to the array \(z\) of canonical partial correlations (CPCs) is simplified by the ordering of the elements in the definition of \(w\), which when inverted yields
\[ z_{i,j} = \left\{ \begin{array}{cl} 0 & \mbox{if } i \geq j, \\ w_{i,j} & \mbox{if } 1 = i < j, \mbox{ and} \\ w_{i,j} \, \prod_{i'=1}^{i-1} \left( 1 - z_{i',j}^2 \right)^{-1/2} & \mbox{if } 1 < i < j. \end{array} \right. \]
The final stage of the transform reverses the hyperbolic tangent transform, which is defined by
\[ \tanh^{-1} v = \frac{1}{2} \log \left( \frac{1 + v}{1 - v} \right). \]
The inverse hyperbolic tangent function, \(\tanh^{-1}\), is also called the Fisher transformation.
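Putting the three stages together, the following sketch (the helper corr_unconstrain is our name; cholesky_from_cpcs comes from the earlier sketch) implements the forward transform and confirms that it inverts the earlier construction.

```python
import numpy as np

def corr_unconstrain(x):
    """Correlation matrix x -> unconstrained vector y."""
    K = x.shape[0]
    w = np.linalg.cholesky(x).T  # upper-triangular factor, so that x = w.T @ w
    z = np.zeros((K, K))
    acc = np.ones(K)             # running product of (1 - z[i', j]**2)
    for i in range(K - 1):
        z[i, i + 1:] = w[i, i + 1:] / np.sqrt(acc[i + 1:])
        acc[i + 1:] *= 1.0 - z[i, i + 1:] ** 2
    rows, cols = np.triu_indices(K, k=1)
    return np.arctanh(z[rows, cols])  # the Fisher transformation

# Round trip: y -> x via the inverse transform, then x -> y.
K = 4
y = np.array([0.3, -0.8, 1.1, 0.0, 0.5, -2.0])
rows, cols = np.triu_indices(K, k=1)
z = np.zeros((K, K))
z[rows, cols] = np.tanh(y)
w = cholesky_from_cpcs(z)  # from the earlier sketch
print(np.allclose(corr_unconstrain(w.T @ w), y))  # True
```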