Multivariate Discrete Distributions

The multivariate discrete distributions are over multiple integer values, which are expressed in Stan as arrays.

Multinomial distribution

Probability mass function

If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\theta \in \text{$K$-simplex}\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), \[\begin{equation*} \text{Multinomial}(y|\theta) = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K \theta_k^{y_k}, \end{equation*}\] where the multinomial coefficient is defined by \[\begin{equation*} \binom{N}{y_1,\ldots,y_k} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}\]

Sampling statement

y ~ multinomial(theta)

Increment target log probability density with multinomial_lupmf(y | theta).

Available since 2.0

Stan functions

real multinomial_lpmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size \(K\) given the \(K\)-simplex distribution parameter theta and (implicit) total count N = sum(y)

Available since 2.12

real multinomial_lupmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size \(K\) given the \(K\)-simplex distribution parameter theta and (implicit) total count N = sum(y) dropping constant additive terms

Available since 2.25

array[] int multinomial_rng(vector theta, int N)
Generate a multinomial variate with simplex distribution parameter theta and total count \(N\); may only be used in transformed data and generated quantities blocks

Available since 2.8

Multinomial distribution, logit parameterization

Stan also provides a version of the multinomial probability mass function distribution with the \(\text{$K$-simplex}\) for the event count probabilities per category given on the unconstrained logistic scale.

Probability mass function

If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\text{softmax}(\theta) \in \text{$K$-simplex}\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), \[\begin{equation*} \begin{split} \text{MultinomialLogit}(y \mid \gamma) & = \text{Multinomial}(y \mid \text{softmax}(\gamma)) \\ & = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K [\text{softmax}(\gamma_k)]^{y_k}, \end{split} \end{equation*}\] where the multinomial coefficient is defined by \[\begin{equation*} \binom{N}{y_1,\ldots,y_k} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}\]

Sampling statement

y ~ multinomial_logit(gamma)

Increment target log probability density with multinomial_logit_lupmf(y | gamma).

Available since 2.24

Stan functions

real multinomial_logit_lpmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size \(K\) given the log \(K\)-simplex distribution parameter \(\gamma\) and (implicit) total count N = sum(y)

Available since 2.24

real multinomial_logit_lupmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size \(K\) given the log \(K\)-simplex distribution parameter \(\gamma\) and (implicit) total count N = sum(y) dropping constant additive terms

Available since 2.25

array[] int multinomial_logit_rng(vector gamma, int N)
Generate a variate from a multinomial distribution with probabilities softmax(gamma) and total count N; may only be used in transformed data and generated quantities blocks.

Available since 2.24

Dirichlet-multinomial distribution

Stan also provides the Dirichlet-multinomial distribution, which generalizes the Beta-binomial distribution to more than two categories. As such, it is an overdispersed version of the multinomial distribution.

Probability mass function

If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\alpha \in \mathbb{R}_{+}^K\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), the PMF of the Dirichlet-multinomial distribution is defined as \[\begin{equation*} \text{DirMult}(y|\theta) = \frac{\Gamma(\alpha_0)\Gamma(N+1)}{\Gamma(N+\alpha_0)} \prod_{k=1}^K \frac{\Gamma(y_k + \alpha_k)}{\Gamma(\alpha_k)\Gamma(y_k+1)}, \end{equation*}\] where \(\alpha_0\) is defined as \(\alpha_0 = \sum_{k=1}^K \alpha_k\).

Sampling statement

y ~ dirichlet_multinomial(alpha)

Increment target log probability density with dirichlet_multinomial_lupmf(y | alpha).

Available since 2.34

Stan functions

real dirichlet_multinomial_lpmf(array[] int y | vector alpha)
The log multinomial probability mass function with outcome array y with \(K\) elements given the positive \(K\)-vector distribution parameter alpha and (implicit) total count N = sum(y).

Available since 2.34

real dirichlet_multinomial_lupmf(array[] int y | vector alpha)
The log multinomial probability mass function with outcome array y with \(K\) elements, given the positive \(K\)-vector distribution parameter alpha and (implicit) total count N = sum(y) dropping constant additive terms.

Available since 2.34

array[] int dirichlet_multinomial_rng(vector alpha, int N)
Generate a multinomial variate with positive vector distribution parameter alpha and total count N; may only be used in transformed data and generated quantities blocks. This is equivalent to multinomial_rng(dirichlet_rng(alpha), N).

Available since 2.34
Back to top