# Multivariate Discrete Distributions

The multivariate discrete distributions are over multiple integer values, which are expressed in Stan as arrays.

## Multinomial distribution

### Probability mass function

If $$K \in \mathbb{N}$$, $$N \in \mathbb{N}$$, and $$\theta \in \text{K-simplex}$$, then for $$y \in \mathbb{N}^K$$ such that $$\sum_{k=1}^K y_k = N$$, $\begin{equation*} \text{Multinomial}(y|\theta) = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K \theta_k^{y_k}, \end{equation*}$ where the multinomial coefficient is defined by $\begin{equation*} \binom{N}{y_1,\ldots,y_k} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}$

### Distribution statement

y ~ multinomial(theta)

Increment target log probability density with multinomial_lupmf(y | theta).

Available since 2.0

### Stan functions

real multinomial_lpmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size $$K$$ given the $$K$$-simplex distribution parameter theta and (implicit) total count N = sum(y)

Available since 2.12

real multinomial_lupmf(array[] int y | vector theta)
The log multinomial probability mass function with outcome array y of size $$K$$ given the $$K$$-simplex distribution parameter theta and (implicit) total count N = sum(y) dropping constant additive terms

Available since 2.25

array[] int multinomial_rng(vector theta, int N)
Generate a multinomial variate with simplex distribution parameter theta and total count $$N$$; may only be used in transformed data and generated quantities blocks

Available since 2.8

## Multinomial distribution, logit parameterization

Stan also provides a version of the multinomial probability mass function distribution with the $$\text{K-simplex}$$ for the event count probabilities per category given on the unconstrained logistic scale.

### Probability mass function

If $$K \in \mathbb{N}$$, $$N \in \mathbb{N}$$, and $$\text{softmax}(\theta) \in \text{K-simplex}$$, then for $$y \in \mathbb{N}^K$$ such that $$\sum_{k=1}^K y_k = N$$, $\begin{equation*} \begin{split} \text{MultinomialLogit}(y \mid \gamma) & = \text{Multinomial}(y \mid \text{softmax}(\gamma)) \\ & = \binom{N}{y_1,\ldots,y_K} \prod_{k=1}^K [\text{softmax}(\gamma_k)]^{y_k}, \end{split} \end{equation*}$ where the multinomial coefficient is defined by $\begin{equation*} \binom{N}{y_1,\ldots,y_k} = \frac{N!}{\prod_{k=1}^K y_k!}. \end{equation*}$

### Distribution statement

y ~ multinomial_logit(gamma)

Increment target log probability density with multinomial_logit_lupmf(y | gamma).

Available since 2.24

### Stan functions

real multinomial_logit_lpmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size $$K$$ given the log $$K$$-simplex distribution parameter $$\gamma$$ and (implicit) total count N = sum(y)

Available since 2.24

real multinomial_logit_lupmf(array[] int y | vector gamma)
The log multinomial probability mass function with outcome array y of size $$K$$ given the log $$K$$-simplex distribution parameter $$\gamma$$ and (implicit) total count N = sum(y) dropping constant additive terms

Available since 2.25

array[] int multinomial_logit_rng(vector gamma, int N)
Generate a variate from a multinomial distribution with probabilities softmax(gamma) and total count N; may only be used in transformed data and generated quantities blocks.

Available since 2.24

## Dirichlet-multinomial distribution

Stan also provides the Dirichlet-multinomial distribution, which generalizes the Beta-binomial distribution to more than two categories. As such, it is an overdispersed version of the multinomial distribution.

### Probability mass function

If $$K \in \mathbb{N}$$, $$N \in \mathbb{N}$$, and $$\alpha \in \mathbb{R}_{+}^K$$, then for $$y \in \mathbb{N}^K$$ such that $$\sum_{k=1}^K y_k = N$$, the PMF of the Dirichlet-multinomial distribution is defined as $\begin{equation*} \text{DirMult}(y|\theta) = \frac{\Gamma(\alpha_0)\Gamma(N+1)}{\Gamma(N+\alpha_0)} \prod_{k=1}^K \frac{\Gamma(y_k + \alpha_k)}{\Gamma(\alpha_k)\Gamma(y_k+1)}, \end{equation*}$ where $$\alpha_0$$ is defined as $$\alpha_0 = \sum_{k=1}^K \alpha_k$$.

### Distribution statement

y ~ dirichlet_multinomial(alpha)

Increment target log probability density with dirichlet_multinomial_lupmf(y | alpha).

Available since 2.34

### Stan functions

real dirichlet_multinomial_lpmf(array[] int y | vector alpha)
The log multinomial probability mass function with outcome array y with $$K$$ elements given the positive $$K$$-vector distribution parameter alpha and (implicit) total count N = sum(y).

Available since 2.34

real dirichlet_multinomial_lupmf(array[] int y | vector alpha)
The log multinomial probability mass function with outcome array y with $$K$$ elements, given the positive $$K$$-vector distribution parameter alpha and (implicit) total count N = sum(y) dropping constant additive terms.

Available since 2.34

array[] int dirichlet_multinomial_rng(vector alpha, int N)
Generate a multinomial variate with positive vector distribution parameter alpha and total count N; may only be used in transformed data and generated quantities blocks. This is equivalent to multinomial_rng(dirichlet_rng(alpha), N).

Available since 2.34