20.4 Posteriors with Unbounded Densities

In some cases, the posterior density grows without bound as parameters approach certain poles or boundaries. In such cases, there are no posterior modes, and numerical stability issues can arise as sampled parameters approach constraint boundaries.

Mixture Models with Varying Scales

One such example is a binary mixture model with scales varying by component, σ1 and σ2 for locations μ1 and μ2. In this situation, the density grows without bound as σ1 → 0 and μ1 → yn for some n; that is, one of the mixture components concentrates all of its mass around a single data item yn.
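The degeneracy can be seen numerically. The following is a minimal sketch, assuming a two-component normal mixture with a fixed mixing proportion; the data values, the function names `normal_pdf` and `mixture_log_lik`, and the parameter settings are all made up for illustration. It pins μ1 at the first data item and shrinks σ1 toward zero.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_log_lik(ys, theta, mu1, sigma1, mu2, sigma2):
    """Log likelihood of a binary normal mixture with mixing proportion theta."""
    total = 0.0
    for y in ys:
        total += math.log(theta * normal_pdf(y, mu1, sigma1)
                          + (1.0 - theta) * normal_pdf(y, mu2, sigma2))
    return total

# Hypothetical data, chosen only for illustration.
ys = [-1.3, 0.2, 0.9, 2.4]

# Pin mu1 at the first data item and let sigma1 shrink toward zero.
sigmas = (1.0, 1e-2, 1e-4, 1e-6)
log_liks = [mixture_log_lik(ys, 0.5, ys[0], s, 0.5, 1.5) for s in sigmas]

for s, ll in zip(sigmas, log_liks):
    print(f"sigma1 = {s:g}: log likelihood = {ll:.3f}")
```

Once the first component has collapsed onto `ys[0]`, each further hundredfold reduction in σ1 adds roughly log(100) to the log likelihood, so the value increases without limit.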

Beta-Binomial Models with Skewed Data and Weak Priors

Another example of unbounded densities arises with a posterior such as Beta(ϕ|0.5,0.5), which can arise if seemingly weak beta priors are used for groups that have no data. This density is unbounded as ϕ → 0 and ϕ → 1. Similarly, a Bernoulli likelihood model coupled with a “weak” beta prior leads to a posterior

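\[
p(\phi \mid y)
\propto \textrm{Beta}(\phi \mid 0.5, 0.5) \, \prod_{n=1}^N \textrm{Bernoulli}(y_n \mid \phi)
= \textrm{Beta}\left(\phi \,\middle|\, 0.5 + \sum_{n=1}^N y_n, \ 0.5 + N - \sum_{n=1}^N y_n\right).
\]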

If N=9 and each yn=1, the posterior is Beta(ϕ|9.5,0.5). This posterior is unbounded as ϕ → 1. Nevertheless, the posterior is proper, and although there is no posterior mode, the posterior mean is well-defined with a value of exactly 0.95.
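The arithmetic for this example can be checked with a short sketch using only the Python standard library. The helper names `posterior_params` and `beta_log_pdf` are hypothetical and introduced here for illustration; the conjugate update is the standard one for a beta prior with a Bernoulli likelihood.

```python
import math

def posterior_params(ys, a0=0.5, b0=0.5):
    """Conjugate update: Beta(a0, b0) prior with Bernoulli observations ys."""
    successes = sum(ys)
    return a0 + successes, b0 + len(ys) - successes

def beta_log_pdf(phi, a, b):
    """Log density of Beta(a, b) at phi in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(phi) + (b - 1.0) * math.log1p(-phi)

ys = [1] * 9                      # N = 9, every y_n = 1
a, b = posterior_params(ys)       # (9.5, 0.5)
post_mean = a / (a + b)           # 0.95

# Evaluate the posterior density at points approaching the boundary phi = 1.
gaps = (1e-2, 1e-4, 1e-6, 1e-8)
densities = [math.exp(beta_log_pdf(1.0 - eps, a, b)) for eps in gaps]

print(a, b, post_mean)
for eps, d in zip(gaps, densities):
    print(f"phi = 1 - {eps:g}: density = {d:.1f}")
```

The density keeps increasing as ϕ gets closer to 1, even though the posterior mean stays finite.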

Constrained vs. Unconstrained Scales

Stan does not sample directly on the constrained (0,1) space for this problem, so it doesn’t directly deal with unbounded density values. Rather, the probability values ϕ are logit-transformed to (−∞,∞). The boundaries at 0 and 1 are pushed out to −∞ and ∞, respectively. The Jacobian adjustment that Stan automatically applies ensures the unconstrained density is proper. The adjustment for the particular case of (0,1) is log logit⁻¹(u) + log(1 − logit⁻¹(u)), where u = logit(ϕ) is the unconstrained parameter; equivalently, log ϕ + log(1 − ϕ).
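To illustrate the effect of the change of variables, the sketch below evaluates the Beta(ϕ|9.5,0.5) posterior from the previous section on the unconstrained scale. It writes u for the unconstrained value logit(ϕ) and adds the log Jacobian term log ϕ + log(1 − ϕ) to the log density. The function names are hypothetical, and this is only a hand-rolled illustration of the transform, not Stan's implementation.

```python
import math

def log1p_exp(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))

def log_inv_logit(u):
    """log(phi) where phi = inv_logit(u)."""
    return -log1p_exp(-u)

def log1m_inv_logit(u):
    """log(1 - phi) where phi = inv_logit(u)."""
    return -log1p_exp(u)

def unconstrained_log_density(u, a, b):
    """Log density of u = logit(phi) when phi ~ Beta(a, b)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_beta = (log_norm
                + (a - 1.0) * log_inv_logit(u)
                + (b - 1.0) * log1m_inv_logit(u))
    # Change-of-variables term: log |d phi / d u| = log(phi) + log(1 - phi).
    log_jacobian = log_inv_logit(u) + log1m_inv_logit(u)
    return log_beta + log_jacobian

a, b = 9.5, 0.5
peak_u = math.log(a / b)  # logit(a / (a + b)), where the transformed density peaks
points = (-20.0, 0.0, peak_u, 20.0, 40.0)
values = [math.exp(unconstrained_log_density(u, a, b)) for u in points]

for u, v in zip(points, values):
    print(f"u = {u:7.3f}: density = {v:.3e}")
```

On this scale the density is bounded and decays toward zero in both tails, in contrast to the constrained density, which diverges as ϕ → 1.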

There are two problems that still arise, though. The first is that if the posterior mass for ϕ is near one of the boundaries, the logit-transformed parameter will have to sweep out long paths and thus can dominate the U-turn condition imposed by the no-U-turn sampler (NUTS). The second issue is that the inverse transform from the unconstrained space to the constrained space can underflow to 0 or overflow to 1, even when the unconstrained parameter is not infinite. Similar problems arise for the expectation terms in logistic regression, which is why the logit-scale parameterizations of the Bernoulli and binomial distributions are more stable.
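The second issue can be reproduced in double-precision arithmetic. The sketch below is an illustration in plain Python with hypothetical function names, not Stan's internal code. It shows the inverse logit rounding to exactly 1 or 0 at finite inputs, and compares this with a Bernoulli log probability computed directly on the logit scale, which is the idea behind logit-scale parameterizations such as Stan's `bernoulli_logit`.

```python
import math

def inv_logit(u):
    """Inverse logit, written to avoid overflow in exp for large |u|."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def log1p_exp(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))

def bernoulli_logit_log_pmf(y, u):
    """log Bernoulli(y | inv_logit(u)), computed without forming the probability."""
    return -log1p_exp(-u) if y == 1 else -log1p_exp(u)

# The constrained value rounds to the boundary at finite unconstrained values.
rounds_to_one = inv_logit(40.0) == 1.0
rounds_to_zero = inv_logit(-800.0) == 0.0

# On the probability scale, 1 - phi is exactly 0 here, so its log is undefined.
naive_one_minus_phi = 1.0 - inv_logit(40.0)

# On the logit scale, the same log probability stays finite.
stable_log_prob = bernoulli_logit_log_pmf(0, 40.0)

print(rounds_to_one, rounds_to_zero, naive_one_minus_phi, stable_log_prob)
```

Working on the logit scale avoids forming ϕ at all, so the log probability of the unlikely outcome remains finite rather than becoming the log of zero.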