24.6 Estimating event probabilities
Event probabilities involving either parameters or predictions or both may be coded in the generated quantities block. For example, to evaluate \(\textrm{Pr}[\lambda > 5 \mid y]\) in the simple Poisson example with only a rate parameter \(\lambda\), it suffices to define a generated quantity
generated quantities {
  int<lower = 0, upper = 1> lambda_gt_5 = lambda > 5;
  ...
}
The value of the expression lambda > 5 is 1 if the condition is true and 0 otherwise. The posterior mean of this generated quantity is the event probability
\[\begin{eqnarray*}
\mbox{Pr}[\lambda > 5 \mid y]
& = &
\int \textrm{I}(\lambda > 5) \cdot p(\lambda \mid y)
\, \textrm{d}\lambda
\\[4pt]
& \approx &
\frac{1}{M} \sum_{m = 1}^M \textrm{I}(\lambda^{(m)} > 5),
\end{eqnarray*}\]
where each \(\lambda^{(m)} \sim p(\lambda \mid y)\) is a draw from the posterior. In Stan, this estimate is recovered as the posterior mean of the generated quantity lambda_gt_5: the final line of the derivation above is exactly the average of the indicator over the \(M\) posterior draws.
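The averaging step can be sketched outside of Stan as follows. This is a minimal illustration, not the way Stan's interfaces compute summaries: the observed counts are hypothetical, and the draws come from a conjugate gamma posterior (assuming a gamma(1, 1) prior on lambda) as a stand-in for draws produced by Stan's sampler. The helper names `event_probability` and `simulate_posterior_draws` are made up for this example.

```python
import random


def event_probability(indicator_draws):
    """Monte Carlo estimate of an event probability: the mean of 0/1 draws."""
    draws = list(indicator_draws)
    return sum(draws) / len(draws)


def simulate_posterior_draws(y, num_draws, seed=1234):
    """Stand-in for Stan's sampler (an assumption of this sketch).

    With a gamma(1, 1) prior and a Poisson likelihood, the posterior for
    lambda is gamma(1 + sum(y), 1 + N) in shape/rate form.
    """
    rng = random.Random(seed)
    shape = 1 + sum(y)
    rate = 1 + len(y)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(num_draws)]


y = [4, 7, 5, 6, 3]  # hypothetical observed counts
lambda_draws = simulate_posterior_draws(y, num_draws=20000)

# Mirrors the generated quantity: 1 if lambda > 5 in this draw, else 0.
lambda_gt_5 = [int(lam > 5) for lam in lambda_draws]

# Posterior mean of the indicator, i.e. the estimate of Pr[lambda > 5 | y].
pr_lambda_gt_5 = event_probability(lambda_gt_5)
print(f"Estimated Pr[lambda > 5 | y] = {pr_lambda_gt_5:.3f}")
```

The draws here are independent. Draws from Stan's MCMC sampler are typically autocorrelated, so the same average from Stan output carries a larger Monte Carlo error for a given number of draws.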
Event probabilities involving posterior predictive quantities \(\tilde{y}\) work exactly the same way as those for parameters. For example, if \(\tilde{y}_n\) is the prediction for the \(n\)-th unobserved outcome (such as the score of a team in a game or a level of expression of a protein in a cell), then
\[\begin{eqnarray*}
\mbox{Pr}[\tilde{y}_3 > \tilde{y}_7 \mid \tilde{x}, x, y]
& = &
\mathbb{E}\!\left[\textrm{I}(\tilde{y}_3 > \tilde{y}_7) \mid \tilde{x}, x, y\right]
\\[4pt]
& = &
\int \textrm{I}(\tilde{y}_3 > \tilde{y}_7) \cdot p(\tilde{y} \mid \tilde{x}, x, y)
\, \textrm{d}\tilde{y}
\\[4pt]
& \approx &
\frac{1}{M} \sum_{m = 1}^M \textrm{I}(\tilde{y}^{(m)}_3 > \tilde{y}^{(m)}_7),
\end{eqnarray*}\]
where \(\tilde{y}^{(m)} \sim p(\tilde{y} \mid \tilde{x}, x, y)\).
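The predictive estimate can be sketched the same way. The example below is illustrative only: the gamma posteriors for each outcome's rate, the Poisson outcome model, and the helper names `poisson_draw` and `predictive_event_probability` are all assumptions introduced for this sketch, not part of the model in the text. Each row of `y_tilde_draws` plays the role of one joint predictive draw \(\tilde{y}^{(m)}\).

```python
import math
import random


def poisson_draw(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def predictive_event_probability(y_tilde_draws, i, j):
    """Fraction of draws m with y_tilde[m][i] > y_tilde[m][j] (0-based i, j)."""
    hits = sum(1 for draw in y_tilde_draws if draw[i] > draw[j])
    return hits / len(y_tilde_draws)


rng = random.Random(2024)
M, N_TILDE = 10000, 8  # number of draws, number of predicted outcomes

y_tilde_draws = []
for _ in range(M):
    # Stage 1: one draw of each outcome's rate from a hypothetical
    # gamma posterior (shape 20 + 2n, scale 0.25), standing in for Stan.
    rates = [rng.gammavariate(20 + 2 * n, 0.25) for n in range(N_TILDE)]
    # Stage 2: one predictive outcome per rate.
    y_tilde_draws.append([poisson_draw(rng, r) for r in rates])

# y_tilde_3 and y_tilde_7 in the 1-based notation of the text are
# positions 2 and 6 in 0-based Python indexing.
pr_3_gt_7 = predictive_event_probability(y_tilde_draws, 2, 6)
print(f"Estimated Pr[y_tilde_3 > y_tilde_7] = {pr_3_gt_7:.3f}")
```

The comparison is made within each draw \(m\), so any posterior dependence between \(\tilde{y}_3\) and \(\tilde{y}_7\) is carried into the estimate. Comparing separately computed summaries of the two outcomes would not give the same quantity.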