10.1 Changes of Variables

The support of a random variable \(X\) with density \(p_X(x)\) is that subset of values for which it has non-zero density,

\[ \mathrm{supp}(X) = \{ x \mid p_X(x) > 0 \}. \]

If \(f\) is a total function defined on the support of \(X\), then \(Y = f(X)\) is a new random variable. This section shows how to compute the probability density function of \(Y\) for well-behaved transforms \(f\). The rest of the chapter details the transforms used by Stan.

Univariate Changes of Variables

Suppose \(X\) is one dimensional and \(f: \mathrm{supp}(X) \rightarrow \mathbb{R}\) is a one-to-one, monotonic function with a differentiable inverse \(f^{-1}\). Then the density of \(Y\) is given by

\[ p_Y(y) = p_X(f^{-1}(y)) \, \left| \, \frac{d}{dy} f^{-1}(y)\, \right|. \]

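The absolute value of the derivative of the inverse transform measures how much the underlying variable \(x\) changes per unit change in the transformed variable \(y\). Multiplying by it corrects for the way the transform locally stretches or compresses the scale, so that \(p_Y\) still integrates to one.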
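As an illustration of the univariate formula, the following sketch takes \(X\) to be standard normal and \(f(x) = \exp(x)\), so that \(f^{-1}(y) = \log y\) and \(\left| \frac{d}{dy} f^{-1}(y) \right| = 1/y\). The choice of distribution and transform, and all function names, are assumptions made for this example only. The result should agree with the textbook lognormal density.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density p_X of a normal random variable (the assumed base density)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def f_inverse(y):
    """Inverse of the assumed transform f(x) = exp(x); defined for y > 0."""
    return math.log(y)


def abs_deriv_f_inverse(y):
    """Absolute value of d/dy log(y) = 1 / y."""
    return abs(1.0 / y)


def transformed_pdf(y):
    """p_Y(y) = p_X(f^{-1}(y)) * |d/dy f^{-1}(y)|."""
    return normal_pdf(f_inverse(y)) * abs_deriv_f_inverse(y)


def lognormal_pdf(y, mu=0.0, sigma=1.0):
    """Closed-form lognormal density, used only as an independent check."""
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    for y in (0.1, 0.5, 1.0, 2.0, 10.0):
        print(y, transformed_pdf(y), lognormal_pdf(y))
```

Dropping the `abs_deriv_f_inverse` factor would leave a function of \(y\) that no longer integrates to one, which is the practical reason the derivative term cannot be omitted.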

Multivariate Changes of Variables

The multivariate generalization of an absolute derivative is a Jacobian, or more fully the absolute value of the determinant of the Jacobian matrix of the transform. The Jacobian matrix measures the change of each output variable relative to every input variable and the absolute determinant uses that to determine the differential change in volume at a given point in the parameter space.

Suppose \(X\) is a \(K\)-dimensional random variable with probability density function \(p_X(x)\). A new random variable \(Y = f(X)\) may be defined by transforming \(X\) with a suitably well-behaved function \(f\). It suffices for what follows to note that if \(f\) is one-to-one and its inverse \(f^{-1}\) has a well-defined Jacobian, then the density of \(Y\) is

\[ p_Y(y) = p_X(f^{-1}(y)) \, \left| \, \det \, J_{f^{-1}}(y) \, \right|, \]

where \(\det{}\) is the matrix determinant operation and \(J_{f^{-1}}(y)\) is the Jacobian matrix of \(f^{-1}\) evaluated at \(y\). Taking \(x = f^{-1}(y)\), the Jacobian matrix is defined by

\[ J_{f^{-1}}(y) = \left[ \begin{array}{ccc}\displaystyle \frac{\partial x_1}{\partial y_1} & \cdots & \displaystyle \frac{\partial x_1}{\partial y_{K}} \\ \vdots & \ddots & \vdots \\ \displaystyle\frac{\partial x_{K}}{\partial y_1} & \cdots & \displaystyle\frac{\partial x_{K}}{\partial y_{K}} \end{array} \right]. \]
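To make the Jacobian matrix concrete, here is a small sketch that approximates \(J_{f^{-1}}(y)\) by central finite differences and checks its absolute determinant. It assumes, purely for illustration, that \(X = (r, \theta)\) are polar coordinates and \(Y = f(X)\) are the corresponding Cartesian coordinates, so that \(f^{-1}(y) = (\sqrt{y_1^2 + y_2^2}, \, \mathrm{atan2}(y_2, y_1))\) and the absolute determinant is known to be \(1/r\). The helper names are hypothetical.

```python
import math


def f_inverse(y):
    """Assumed inverse transform: Cartesian (y1, y2) -> polar (r, theta)."""
    y1, y2 = y
    return [math.hypot(y1, y2), math.atan2(y2, y1)]


def numerical_jacobian(g, y, eps=1e-6):
    """Approximate the Jacobian of g at y with central differences.

    Entry [i][j] of the result approximates d x_i / d y_j, matching the
    row/column layout of the matrix in the text.
    """
    k = len(y)
    columns = []
    for j in range(k):
        up, down = list(y), list(y)
        up[j] += eps
        down[j] -= eps
        g_up, g_down = g(up), g(down)
        columns.append([(a - b) / (2.0 * eps) for a, b in zip(g_up, g_down)])
    # columns[j][i] holds d x_i / d y_j, so transpose to index rows by output.
    return [[columns[j][i] for j in range(k)] for i in range(len(columns[0]))]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


y = [1.0, 2.0]
J = numerical_jacobian(f_inverse, y)
abs_det = abs(det2(J))
expected = 1.0 / math.hypot(*y)  # analytic value 1 / r

if __name__ == "__main__":
    print(abs_det, expected)
```

Finite differences are used here only to keep the example short; in practice the partial derivatives are usually derived analytically or obtained by automatic differentiation.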

If the Jacobian matrix is triangular, the determinant reduces to the product of the diagonal entries,

\[ \det \, J_{f^{-1}}(y) = \prod_{k=1}^K \frac{\partial x_k}{\partial y_k}. \]

Triangular matrices naturally arise in situations where the variables are ordered, for instance by dimension, and each variable’s transformed value depends only on its own untransformed value and on the transformed values of the variables that precede it. Diagonal matrices, a simple form of triangular matrix, arise if each transformed variable depends on only a single untransformed variable.
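The triangular case can be sketched with a hypothetical inverse transform in which each output adds a positive increment to the previous one, \(x_1 = \exp(y_1)\) and \(x_k = x_{k-1} + \exp(y_k)\). Because \(x_k\) depends only on \(y_1, \ldots, y_k\), the Jacobian is lower triangular with diagonal entries \(\exp(y_k)\). The transform and the function names below are assumptions chosen for the example; the code compares a general-purpose determinant with the product of the diagonal.

```python
import math


def f_inverse(y):
    """Assumed inverse transform: running sum of exp(y_k), giving increasing x."""
    x = []
    total = 0.0
    for y_k in y:
        total += math.exp(y_k)
        x.append(total)
    return x


def jacobian(y):
    """Analytic Jacobian: d x_i / d y_j = exp(y_j) if j <= i, else 0."""
    k = len(y)
    return [[math.exp(y[j]) if j <= i else 0.0 for j in range(k)]
            for i in range(k)]


def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return det


y = [0.5, -1.0, 0.25]
x = f_inverse(y)
J = jacobian(y)
full_det = determinant(J)
diag_product = math.prod(J[k][k] for k in range(len(y)))

if __name__ == "__main__":
    print(full_det, diag_product, math.exp(sum(y)))
```

For a triangular Jacobian the product of the diagonal costs \(K\) multiplications, whereas a general determinant requires on the order of \(K^3\) operations, which is why transforms with this structure are convenient.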