5.4 Vectorizing Mixtures

There is (currently) no way to vectorize mixture models at the observation level in Stan. This section warns against attempting to vectorize naively, because doing so defines a different model. A proper mixture at the observation level is defined as follows, where we assume that lambda, y[n], mu[1], mu[2], sigma[1], and sigma[2] are all scalars and lambda is between 0 and 1.

for (n in 1:N) {
  target += log_sum_exp(log(lambda)
                          + normal_lpdf(y[n] | mu[1], sigma[1]),
                        log1m(lambda)
                          + normal_lpdf(y[n] | mu[2], sigma[2]));
}

or equivalently

for (n in 1:N)
  target += log_mix(lambda,
                    normal_lpdf(y[n] | mu[1], sigma[1]),
                    normal_lpdf(y[n] | mu[2], sigma[2]));

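This definition assumes that each observation $y_n$ may have arisen from either of the mixture components. The density is
\[
p(y \mid \lambda, \mu, \sigma)
= \prod_{n=1}^N \left( \lambda \, \mathrm{normal}(y_n \mid \mu_1, \sigma_1)
  + (1 - \lambda) \, \mathrm{normal}(y_n \mid \mu_2, \sigma_2) \right).
\]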
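The observation-level density can be checked numerically outside Stan. The following is a minimal Python sketch using only the standard library. The helpers `normal_lpdf` and `log_sum_exp` are hand-written stand-ins named after the Stan functions, not Stan's implementations, and the data and parameter values are made up for illustration. It accumulates the log density one observation at a time, as the loop does, and compares the result with the same product computed directly on the density scale.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of normal(mu, sigma) evaluated at a scalar y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b)) for finite a and b."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def mixture_lpdf(y, lam, mu, sigma):
    """Observation-level mixture: one log_sum_exp term per observation."""
    total = 0.0
    for y_n in y:
        total += log_sum_exp(
            math.log(lam) + normal_lpdf(y_n, mu[0], sigma[0]),
            math.log1p(-lam) + normal_lpdf(y_n, mu[1], sigma[1]),
        )
    return total

def mixture_lpdf_direct(y, lam, mu, sigma):
    """Same density computed without the log-scale trick, for comparison only."""
    total = 0.0
    for y_n in y:
        p1 = math.exp(normal_lpdf(y_n, mu[0], sigma[0]))
        p2 = math.exp(normal_lpdf(y_n, mu[1], sigma[1]))
        total += math.log(lam * p1 + (1.0 - lam) * p2)
    return total

# Illustrative values only (assumed, not from the text).
y = [-1.2, 0.3, 2.5, 3.1]
lam, mu, sigma = 0.3, (0.0, 3.0), (1.0, 0.5)

print(mixture_lpdf(y, lam, mu, sigma))
print(mixture_lpdf_direct(y, lam, mu, sigma))
```

The two printed values should agree up to floating-point error. The log-scale version is the one to prefer in practice, because the direct version can underflow when an observation lies far from both components.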

Contrast the previous model with the following (erroneous) attempt to vectorize the model.

target += log_sum_exp(log(lambda)
                        + normal_lpdf(y | mu[1], sigma[1]),
                      log1m(lambda)
                        + normal_lpdf(y | mu[2], sigma[2]));

or equivalently,

target += log_mix(lambda,
                  normal_lpdf(y | mu[1], sigma[1]),
                  normal_lpdf(y | mu[2], sigma[2]));

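This second definition implies that the entire sequence $y_1, \ldots, y_N$ of observations comes from one component or the other, defining a different density,
\[
p(y \mid \lambda, \mu, \sigma)
= \lambda \prod_{n=1}^N \mathrm{normal}(y_n \mid \mu_1, \sigma_1)
  + (1 - \lambda) \prod_{n=1}^N \mathrm{normal}(y_n \mid \mu_2, \sigma_2).
\]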
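To see that the two definitions really are different models, both log densities can be evaluated on the same data. The following Python sketch uses only the standard library. The function names `observation_level_lpdf` and `sequence_level_lpdf` are hypothetical labels for the correct loop and the naive vectorization, the helpers are hand-written stand-ins for the Stan functions, and the data and parameter values are invented for illustration. The key point it relies on is that a vectorized `normal_lpdf` returns the sum of the log densities over all elements of `y`.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of normal(mu, sigma) evaluated at a scalar y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b)) for finite a and b."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def observation_level_lpdf(y, lam, mu, sigma):
    """Correct mixture: each observation is mixed over both components."""
    return sum(
        log_sum_exp(
            math.log(lam) + normal_lpdf(y_n, mu[0], sigma[0]),
            math.log1p(-lam) + normal_lpdf(y_n, mu[1], sigma[1]),
        )
        for y_n in y
    )

def sequence_level_lpdf(y, lam, mu, sigma):
    """Naive vectorization: the whole sequence is assigned to one component."""
    # A vectorized lpdf sums the log density over every element of y.
    lp1 = sum(normal_lpdf(y_n, mu[0], sigma[0]) for y_n in y)
    lp2 = sum(normal_lpdf(y_n, mu[1], sigma[1]) for y_n in y)
    return log_sum_exp(math.log(lam) + lp1, math.log1p(-lam) + lp2)

# Illustrative values only (assumed): two points near each component mean.
y = [-0.5, 0.2, 2.8, 3.3]
lam, mu, sigma = 0.5, (0.0, 3.0), (1.0, 1.0)

print(observation_level_lpdf(y, lam, mu, sigma))
print(sequence_level_lpdf(y, lam, mu, sigma))

# With a single observation the two definitions coincide.
print(observation_level_lpdf(y[:1], lam, mu, sigma))
print(sequence_level_lpdf(y[:1], lam, mu, sigma))
```

On data that are split between the two components, as here, the sequence-level log density is far lower, because neither component alone explains every point. The two definitions agree only in the degenerate case of a single observation.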