10.8 Unit Vector
An \(n\)-dimensional vector \(x \in \mathbb{R}^n\) is said to be a unit vector if it has unit Euclidean length, so that
\[ \Vert x \Vert \ = \ \sqrt{x^{\top}\,x} \ = \ \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \ = \ 1\ . \]
Unit Vector Inverse Transform
Stan divides an unconstrained vector \(y \in \mathbb{R}^{n}\) by its norm, \(\Vert y \Vert = \sqrt{y^\top y}\), to obtain a unit vector \(x\),
\[ x = \frac{y}{\Vert y \Vert}. \]
To generate a unit vector, Stan generates points at random in \(\mathbb{R}^n\) with independent unit normal distributions, which are then standardized by dividing by their Euclidean length. Marsaglia (1972) showed this generates points uniformly at random on \(S^{n-1}\). That is, if we draw \(y_n \sim \mathsf{Normal}(0, 1)\) for \(n \in 1{:}n\), then \(x = \frac{y}{\Vert y \Vert}\) has a uniform distribution over \(S^{n-1}\). This allows us to use an \(n\)-dimensional basis for \(S^{n-1}\) that preserves local neighborhoods in that points that are close to each other in \(\mathbb{R}^n\) map to points near each other in \(S^{n-1}\). The mapping is not perfectly distance preserving, because there are points arbitrarily far away from each other in \(\mathbb{R}^n\) that map to identical points in \(S^{n-1}\).
Warning: undefined at zero!
The above mapping from \(\mathbb{R}^n\) to \(S^n\) is not defined at zero. While this point outcome has measure zero during sampling, and may thus be ignored, it is the default initialization point and thus unit vector parameters cannot be initialized at zero. A simple workaround is to initialize from a very small interval around zero, which is an option built into all of the Stan interfaces.
Absolute Jacobian Determinant of the Unit Vector Inverse Transform
The Jacobian matrix relating the input vector \(y\) to the output vector \(x\) is singular because \(x^\top x = 1\) for any non-zero input vector \(y\). Thus, there technically is no unique transformation from \(x\) to \(y\). To circumvent this issue, let \(r = \sqrt{y^\top y}\) so that \(y = r x\). The transformation from \(\left(r, x_{-n}\right)\) to \(y\) is well-defined but \(r\) is arbitrary, so we set \(r = 1\). In this case, the determinant of the Jacobian is proportional to \(e^{-\frac{1}{2} y^\top y}\), which is the kernel of a standard multivariate normal distribution with \(n\) independent dimensions.
References
Marsaglia, George. 1972. “Choosing a Point from the Surface of a Sphere.” The Annals of Mathematical Statistics 43 (2): 645–46.