25.6 Estimating event probabilities

Event probabilities involving parameters, predictions, or both can be coded in the generated quantities block. For example, to evaluate \(\textrm{Pr}[\lambda > 5 \mid y]\) in the simple Poisson example with only a rate parameter \(\lambda\), it suffices to define a generated quantity

generated quantities {
  int<lower = 0, upper = 1> lambda_gt_5 = lambda > 5;
  ...

The value of the expression lambda > 5 is 1 if the condition is true and 0 otherwise. The posterior mean of this generated quantity is the event probability \[\begin{eqnarray*} \textrm{Pr}[\lambda > 5 \mid y] & = & \int \textrm{I}(\lambda > 5) \cdot p(\lambda \mid y) \, \textrm{d}\lambda \\[4pt] & \approx & \frac{1}{M} \sum_{m = 1}^M \textrm{I}(\lambda^{(m)} > 5), \end{eqnarray*}\] where each \(\lambda^{(m)} \sim p(\lambda \mid y)\) is a draw from the posterior. In Stan, this is recovered as the posterior mean of the generated quantity lambda_gt_5.

In general, event probabilities may be expressed as expectations of indicator functions. For example, \[\begin{eqnarray*} \textrm{Pr}[\lambda > 5 \mid y] & = & \mathbb{E}[\textrm{I}(\lambda > 5) \mid y] \\[4pt] & = & \int \textrm{I}(\lambda > 5) \cdot p(\lambda \mid y) \, \textrm{d}\lambda \\[4pt] & \approx & \frac{1}{M} \sum_{m = 1}^M \textrm{I}(\lambda^{(m)} > 5). \end{eqnarray*}\] The last line above is the posterior mean of the indicator function as coded in Stan.
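The Monte Carlo estimate above can be checked outside Stan in the special case where the posterior is available in closed form. The following Python sketch is only an illustration and is not part of the Stan program: it assumes a conjugate gamma prior with shape 2 and rate 1 and a small made-up data set y, neither of which comes from the example above. Under that assumption the posterior is gamma with shape \(2 + \sum_n y_n\) and rate \(1 + N\), so posterior draws can be generated directly and the mean of the indicator compared with the exact tail probability. The function names prob_rate_exceeds and exact_gamma_tail are hypothetical helpers introduced for this sketch.

```python
import math
import random


def prob_rate_exceeds(threshold, shape, rate, num_draws, rng):
    """Monte Carlo estimate of Pr[lambda > threshold] for lambda ~ gamma(shape, rate)."""
    hits = 0
    for _ in range(num_draws):
        # random.gammavariate is parameterized by shape and SCALE, so scale = 1 / rate
        lam = rng.gammavariate(shape, 1.0 / rate)
        hits += lam > threshold  # indicator I(lambda^(m) > threshold)
    return hits / num_draws  # mean of the indicator over the M draws


def exact_gamma_tail(threshold, shape, rate):
    """Exact Pr[lambda > threshold] for an INTEGER shape.

    Uses the identity Pr[gamma(k, b) > x] = Pr[Poisson(b * x) <= k - 1].
    """
    mu = rate * threshold
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(shape))


# Hypothetical data and prior, assumed only for this illustration
y = [4, 6, 5, 7, 3]
prior_shape, prior_rate = 2, 1.0

# Conjugate update: gamma prior with Poisson likelihood gives a gamma posterior
post_shape = prior_shape + sum(y)
post_rate = prior_rate + len(y)

rng = random.Random(1234)
p_hat = prob_rate_exceeds(5.0, post_shape, post_rate, 20000, rng)
p_exact = exact_gamma_tail(5.0, post_shape, post_rate)
print(f"Monte Carlo estimate: {p_hat:.3f}, exact: {p_exact:.3f}")
```

With posterior draws from Stan, the same averaging is what the posterior mean of lambda_gt_5 reports; the closed-form comparison is available here only because of the assumed conjugate prior.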

Event probabilities involving posterior predictive quantities \(\tilde{y}\) work exactly the same way as those for parameters. For example, if \(\tilde{y}_n\) is the prediction for the \(n\)-th unobserved outcome (such as the score of a team in a game or a level of expression of a protein in a cell), then \[\begin{eqnarray*} \textrm{Pr}[\tilde{y}_3 > \tilde{y}_7 \mid \tilde{x}, x, y] & = & \mathbb{E}\!\left[\textrm{I}(\tilde{y}_3 > \tilde{y}_7) \mid \tilde{x}, x, y\right] \\[4pt] & = & \int \textrm{I}(\tilde{y}_3 > \tilde{y}_7) \cdot p(\tilde{y} \mid \tilde{x}, x, y) \, \textrm{d}\tilde{y} \\[4pt] & \approx & \frac{1}{M} \sum_{m = 1}^M \textrm{I}(\tilde{y}^{(m)}_3 > \tilde{y}^{(m)}_7), \end{eqnarray*}\] where \(\tilde{y}^{(m)} \sim p(\tilde{y} \mid \tilde{x}, x, y)\).
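The predictive case can be sketched the same way in Python. This example is a minimal illustration under stated assumptions, not the model from this chapter: it assumes a normal linear regression \(\tilde{y}_n \sim \textrm{normal}(\alpha + \beta \cdot \tilde{x}_n, \sigma)\), and it uses randomly generated stand-ins for the posterior draws of \((\alpha, \beta, \sigma)\), which in practice would come from a fitted Stan model. The function name predictive_event_probability and the values in x_tilde are hypothetical. For each posterior draw, one predictive draw of each outcome is simulated and the indicator is averaged over draws.

```python
import random


def predictive_event_probability(draws, x_tilde, i, j, rng):
    """Estimate Pr[y_tilde[i] > y_tilde[j] | x_tilde, x, y] by Monte Carlo.

    draws   -- list of (alpha, beta, sigma) posterior draws
    x_tilde -- predictors for the unobserved outcomes
    i, j    -- zero-based indexes of the two outcomes being compared
    """
    hits = 0
    for alpha, beta, sigma in draws:
        # One posterior predictive draw per outcome, given the m-th parameter draw
        y_i = rng.gauss(alpha + beta * x_tilde[i], sigma)
        y_j = rng.gauss(alpha + beta * x_tilde[j], sigma)
        hits += y_i > y_j  # indicator I(y_tilde_i^(m) > y_tilde_j^(m))
    return hits / len(draws)


rng = random.Random(1234)

# Stand-in posterior draws (ASSUMED for illustration; real draws come from MCMC)
draws = [
    (rng.gauss(1.0, 0.1), rng.gauss(2.0, 0.1), abs(rng.gauss(1.0, 0.05)))
    for _ in range(20000)
]

# Hypothetical predictors for seven unobserved outcomes
x_tilde = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 0.4]

# Outcomes 3 and 7 in the one-based notation of the text are indexes 2 and 6 here
p_hat = predictive_event_probability(draws, x_tilde, 2, 6, rng)
print(f"Estimated Pr[y_tilde_3 > y_tilde_7]: {p_hat:.3f}")
```

Because the predictive draws are generated from each parameter draw in turn, the estimate averages over both the sampling uncertainty in \(\tilde{y}\) and the posterior uncertainty in the parameters, which is what the integral over \(p(\tilde{y} \mid \tilde{x}, x, y)\) expresses.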