## 5.2 Latent Discrete Parameterization

One way to parameterize a mixture model is with a latent categorical variable indicating which mixture component was responsible for the outcome. For example, consider $$K$$ normal distributions with locations $$\mu_k \in \mathbb{R}$$ and scales $$\sigma_k \in (0,\infty)$$. Now consider mixing them in proportion $$\lambda$$, where $$\lambda_k \geq 0$$ and $$\sum_{k=1}^K \lambda_k = 1$$ (i.e., $$\lambda$$ lies in the unit $$K$$-simplex). For each outcome $$y_n$$ there is a latent variable $$z_n$$ in $$\{ 1,\ldots,K \}$$ with a categorical distribution parameterized by $$\lambda$$,

$z_n \sim \mathsf{categorical}(\lambda).$

The variable $$y_n$$ is distributed according to the parameters of the mixture component $$z_n$$,

$y_n \sim \mathsf{normal}(\mu_{z[n]},\sigma_{z[n]}).$
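The generative process above can be illustrated by simulating from it. The following is a minimal Python sketch rather than Stan code; the function name `simulate_mixture` and its arguments are hypothetical and chosen only for this illustration. It draws each $$z_n$$ from a categorical distribution with probabilities $$\lambda$$, then draws $$y_n$$ from the normal component that $$z_n$$ selects.

```python
import random


def simulate_mixture(lam, mu, sigma, n, seed=0):
    """Draw n pairs (z, y) from the latent discrete mixture model.

    lam   : mixing proportions, nonnegative and summing to 1
    mu    : component locations
    sigma : component scales, all positive
    """
    rng = random.Random(seed)
    # Components are labeled 1..K to match the notation in the text.
    components = list(range(1, len(lam) + 1))
    draws = []
    for _ in range(n):
        # z_n ~ categorical(lambda)
        z = rng.choices(components, weights=lam, k=1)[0]
        # y_n ~ normal(mu[z_n], sigma[z_n]); subtract 1 for 0-based indexing
        y = rng.gauss(mu[z - 1], sigma[z - 1])
        draws.append((z, y))
    return draws


# Example: two components mixed in proportion (0.3, 0.7).
draws = simulate_mixture([0.3, 0.7], [-2.0, 3.0], [0.5, 1.0], n=2000, seed=1)
```

In a simulation the indicators $$z_n$$ are known, whereas in inference only the $$y_n$$ are observed and the $$z_n$$ are latent.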

This model is not directly supported by Stan because it involves the discrete parameters $$z_n$$, but Stan can still sample $$\mu$$ and $$\sigma$$ once the $$z_n$$ are summed (marginalized) out, as described in the next section.
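To give a sense of what summing out involves: for a single outcome $$y_n$$, the marginal density is the sum over components $$k$$ of $$\lambda_k$$ times the normal density of $$y_n$$ with location $$\mu_k$$ and scale $$\sigma_k$$. The Python sketch below computes the log of that sum in a numerically stable way. It is an illustration only, not Stan code, and the helper names `normal_lpdf` and `mixture_lpdf` are hypothetical names defined here.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of normal(mu, sigma) evaluated at y."""
    return (-0.5 * math.log(2.0 * math.pi)
            - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)


def mixture_lpdf(y, lam, mu, sigma):
    """Log of sum_k lam[k] * normal(y | mu[k], sigma[k]).

    The latent indicator z is summed out, so only continuous
    quantities remain. Components with zero weight are skipped
    because they contribute nothing to the sum.
    """
    terms = [math.log(l) + normal_lpdf(y, m, s)
             for l, m, s in zip(lam, mu, sigma) if l > 0]
    # Log-sum-exp: factor out the largest term to avoid underflow.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Log density of y = 0.5 under a two-component mixture.
lp = mixture_lpdf(0.5, [0.3, 0.7], [-2.0, 3.0], [0.5, 1.0])
```

Because the result depends only on $$\lambda$$, $$\mu$$, and $$\sigma$$, no discrete parameter remains to be sampled.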