Multivariate Discrete Distributions
The multivariate discrete distributions are defined over multiple integer values, which are expressed in Stan as integer arrays.
Multinomial distribution
Probability mass function
If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\theta \in \text{$K$-simplex}\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), \[\begin{equation*} \text{Multinomial}(y|\theta) = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K \theta_k^{y_k}, \end{equation*}\] where the multinomial coefficient is defined by \[\begin{equation*} \binom{N}{y_1,\ldots,y_K} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}\]
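As a quick check of the formula, consider a small illustrative case (the numbers are chosen for this example only): \(K = 2\), \(N = 3\), \(\theta = (0.3, 0.7)\), and \(y = (1, 2)\). Then \[\begin{equation*} \text{Multinomial}(y|\theta) = \frac{3!}{1!\,2!} \cdot 0.3^{1} \cdot 0.7^{2} = 3 \cdot 0.3 \cdot 0.49 = 0.441. \end{equation*}\]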
Sampling statement
y ~ multinomial(theta)
Increment target log probability density with multinomial_lupmf(y | theta).
Stan functions
real multinomial_lpmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size \(K\), given the \(K\)-simplex distribution parameter theta and (implicit) total count N = sum(y).
real multinomial_lupmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size \(K\), given the \(K\)-simplex distribution parameter theta and (implicit) total count N = sum(y), dropping constant additive terms.
array[] int multinomial_rng(vector theta, int N)
Generate a multinomial variate with simplex distribution parameter theta and total count \(N\); may only be used in transformed data and generated quantities blocks.
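The functions above might be combined in a complete program along the following lines. This is a minimal sketch rather than a recommended model: the uniform Dirichlet prior and the names K and y_rep are assumptions made for illustration.

```stan
data {
  int<lower=1> K;                  // number of categories
  array[K] int<lower=0> y;         // observed counts per category
}
parameters {
  simplex[K] theta;                // category probabilities
}
model {
  // illustrative uniform prior over the simplex (an assumption of this sketch)
  theta ~ dirichlet(rep_vector(1.0, K));
  // increments the target with multinomial_lupmf(y | theta)
  y ~ multinomial(theta);
}
generated quantities {
  // posterior predictive draw using the same total count as the data
  array[K] int y_rep = multinomial_rng(theta, sum(y));
}
```

The total count is not passed to the probability mass function; it is implied by sum(y), whereas multinomial_rng needs it as an explicit argument.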
Multinomial distribution, logit parameterization
Stan also provides a version of the multinomial distribution in which the event probabilities per category are given on the unconstrained log-odds (logit) scale. The \(\text{$K$-simplex}\) of probabilities is obtained by applying softmax to the parameter vector.
Probability mass function
If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\gamma \in \mathbb{R}^K\), so that \(\text{softmax}(\gamma) \in \text{$K$-simplex}\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), \[\begin{equation*} \begin{split} \text{MultinomialLogit}(y \mid \gamma) & = \text{Multinomial}(y \mid \text{softmax}(\gamma)) \\ & = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K [\text{softmax}(\gamma)_k]^{y_k}, \end{split} \end{equation*}\] where the multinomial coefficient is defined by \[\begin{equation*} \binom{N}{y_1,\ldots,y_K} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}\]
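For a small illustrative case (values chosen for this example only), take \(K = 2\), \(N = 2\), \(\gamma = (0, \log 2)\), and \(y = (1, 1)\). The softmax of \(\gamma\) is \((1/3, 2/3)\), so \[\begin{equation*} \text{MultinomialLogit}(y \mid \gamma) = \frac{2!}{1!\,1!} \cdot \left(\frac{1}{3}\right)^{1} \cdot \left(\frac{2}{3}\right)^{1} = \frac{4}{9}. \end{equation*}\] Adding the same constant to every element of \(\gamma\) leaves the softmax, and therefore the probability, unchanged.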
Sampling statement
y ~ multinomial_logit(gamma)
Increment target log probability density with multinomial_logit_lupmf(y | gamma).
Stan functions
real multinomial_logit_lpmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size \(K\), given the unconstrained \(K\)-vector distribution parameter gamma (category probabilities are softmax(gamma)) and (implicit) total count N = sum(y).
real multinomial_logit_lupmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size \(K\), given the unconstrained \(K\)-vector distribution parameter gamma (category probabilities are softmax(gamma)) and (implicit) total count N = sum(y), dropping constant additive terms.
array[] int multinomial_logit_rng(vector gamma, int N)
Generate a variate from a multinomial distribution with probabilities softmax(gamma) and total count N; may only be used in transformed data and generated quantities blocks.
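A sketch of the logit parameterization in a full program follows. Because softmax is unchanged when a constant is added to every element of gamma, this example pins the first element to zero. That is one possible identification choice rather than something the distribution requires, and the name gamma_free and the normal prior are likewise assumptions of the sketch.

```stan
data {
  int<lower=2> K;                  // number of categories
  array[K] int<lower=0> y;         // observed counts per category
}
parameters {
  vector[K - 1] gamma_free;        // free log-odds relative to category 1
}
transformed parameters {
  // pin the first element to 0 so gamma is identified (illustrative choice)
  vector[K] gamma = append_row(0.0, gamma_free);
}
model {
  gamma_free ~ normal(0, 2);       // illustrative weakly informative prior
  // increments the target with multinomial_logit_lupmf(y | gamma)
  y ~ multinomial_logit(gamma);
}
generated quantities {
  vector[K] theta = softmax(gamma);  // implied category probabilities
  array[K] int y_rep = multinomial_logit_rng(gamma, sum(y));
}
```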
Dirichlet-multinomial distribution
Stan also provides the Dirichlet-multinomial distribution, which generalizes the Beta-binomial distribution to more than two categories. As such, it is an overdispersed version of the multinomial distribution.
Probability mass function
If \(K \in \mathbb{N}\), \(N \in \mathbb{N}\), and \(\alpha \in \mathbb{R}_{+}^K\), then for \(y \in \mathbb{N}^K\) such that \(\sum_{k=1}^K y_k = N\), the PMF of the Dirichlet-multinomial distribution is defined as \[\begin{equation*} \text{DirMult}(y|\alpha) = \frac{\Gamma(\alpha_0)\Gamma(N+1)}{\Gamma(N+\alpha_0)} \prod_{k=1}^K \frac{\Gamma(y_k + \alpha_k)}{\Gamma(\alpha_k)\Gamma(y_k+1)}, \end{equation*}\] where \(\alpha_0\) is defined as \(\alpha_0 = \sum_{k=1}^K \alpha_k\).
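As an illustrative check (values chosen for this example only), take \(K = 2\), \(N = 2\), \(\alpha = (1, 1)\), and \(y = (1, 1)\), so that \(\alpha_0 = 2\). Then \[\begin{equation*} \frac{\Gamma(2)\,\Gamma(3)}{\Gamma(4)} \cdot \frac{\Gamma(2)}{\Gamma(1)\,\Gamma(2)} \cdot \frac{\Gamma(2)}{\Gamma(1)\,\Gamma(2)} = \frac{1 \cdot 2}{6} \cdot 1 \cdot 1 = \frac{1}{3}. \end{equation*}\] This agrees with the two-category special case: a beta-binomial distribution with both shape parameters equal to 1 is uniform over the three possible outcomes \((0,2)\), \((1,1)\), and \((2,0)\).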
Sampling statement
y ~ dirichlet_multinomial(alpha)
Increment target log probability density with dirichlet_multinomial_lupmf(y | alpha).
Stan functions
real dirichlet_multinomial_lpmf(array[] int y | vector alpha)
The log Dirichlet-multinomial probability mass function with outcome array y with \(K\) elements, given the positive \(K\)-vector distribution parameter alpha and (implicit) total count N = sum(y).
real dirichlet_multinomial_lupmf(array[] int y | vector alpha)
The log Dirichlet-multinomial probability mass function with outcome array y with \(K\) elements, given the positive \(K\)-vector distribution parameter alpha and (implicit) total count N = sum(y), dropping constant additive terms.
array[] int dirichlet_multinomial_rng(vector alpha, int N)
Generate a Dirichlet-multinomial variate with positive vector distribution parameter alpha and total count N; may only be used in transformed data and generated quantities blocks. This is equivalent to multinomial_rng(dirichlet_rng(alpha), N).
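The following sketch shows how the Dirichlet-multinomial functions might be used with several observed count vectors, including the two-stage sampling equivalence noted above. The data layout (M count vectors in a two-dimensional array), the lognormal prior, and the variable names are all assumptions made for illustration.

```stan
data {
  int<lower=1> K;                      // number of categories
  int<lower=1> M;                      // number of observed count vectors
  array[M, K] int<lower=0> y;          // y[m] holds the counts for observation m
}
parameters {
  vector<lower=0>[K] alpha;            // positive concentration parameters
}
model {
  alpha ~ lognormal(0, 1);             // illustrative prior
  for (m in 1:M) {
    // increments the target with dirichlet_multinomial_lupmf(y[m] | alpha)
    y[m] ~ dirichlet_multinomial(alpha);
  }
}
generated quantities {
  int N_rep = sum(y[1]);               // reuse the first observation's total count
  // direct draw from the Dirichlet-multinomial
  array[K] int y_rep = dirichlet_multinomial_rng(alpha, N_rep);
  // two-stage draw with the same distribution
  array[K] int y_rep_two_stage = multinomial_rng(dirichlet_rng(alpha), N_rep);
}
```

The two generated quantities are separate random draws, so they will generally differ in any one iteration while following the same distribution.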