5.2 Latent discrete parameterization

One way to parameterize a mixture model is with a latent categorical variable indicating which mixture component was responsible for the outcome. For example, consider $K$ normal distributions with locations $\mu_k \in \mathbb{R}$ and scales $\sigma_k \in (0,\infty)$. Now consider mixing them in proportion $\lambda$, where $\lambda_k \geq 0$ and $\sum_{k=1}^K \lambda_k = 1$ (i.e., $\lambda$ lies in the unit $K$-simplex). For each outcome $y_n$ there is a latent variable $z_n$ in $\{ 1,\dotsc,K \}$ with a categorical distribution parameterized by $\lambda$,

$$z_n \sim \textsf{categorical}(\lambda).$$

The variable $y_n$ is distributed according to the parameters of the mixture component $z_n$,

$$y_n \sim \textsf{normal}(\mu_{z[n]},\sigma_{z[n]}).$$
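The generative process above can be simulated directly outside Stan. The following is a minimal Python sketch using only the standard library. The function name `simulate_mixture` and the example values chosen for $\lambda$, $\mu$, and $\sigma$ are illustrative assumptions, not part of the original text.

```python
import random


def simulate_mixture(lam, mu, sigma, n, seed=0):
    """Draw n outcomes from the latent discrete parameterization.

    lam, mu, sigma are length-K lists; lam must be a simplex.
    Returns the latent indicators z (1-based, as in the text) and outcomes y.
    """
    rng = random.Random(seed)
    K = len(lam)
    z, y = [], []
    for _ in range(n):
        # z_n ~ categorical(lambda), taking values in {1, ..., K}
        z_n = rng.choices(range(1, K + 1), weights=lam, k=1)[0]
        # y_n ~ normal(mu[z_n], sigma[z_n]); subtract 1 for 0-based lists
        y_n = rng.gauss(mu[z_n - 1], sigma[z_n - 1])
        z.append(z_n)
        y.append(y_n)
    return z, y


# Hypothetical example values: K = 2 components
lam = [0.3, 0.7]
mu = [-2.0, 3.0]
sigma = [0.5, 1.0]
z, y = simulate_mixture(lam, mu, sigma, n=1000)
```

With these example values, roughly 70% of the draws should come from the second component, and the outcomes assigned to each component should cluster around that component's location.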

Stan does not support this model directly because it involves the discrete parameters $z_n$, but Stan can still sample $\mu$ and $\sigma$ once the $z$ parameters have been summed (marginalized) out, as described in the next section.
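As a sketch of what summing out means here, using only the definitions above: since $z_n$ takes finitely many values, the law of total probability gives the density of $y_n$ without reference to $z_n$,

$$
p(y_n \mid \lambda, \mu, \sigma)
= \sum_{k=1}^K p(z_n = k \mid \lambda) \, p(y_n \mid z_n = k, \mu, \sigma)
= \sum_{k=1}^K \lambda_k \, \textsf{normal}(y_n \mid \mu_k, \sigma_k).
$$

The right-hand side involves only continuous quantities, so the discrete indicator no longer appears as a parameter.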