31.4 Bagging
When bootstrapping is carried through inference it is known as bootstrap aggregation, or bagging, in the machine-learning literature (Breiman 1996). In the simplest case, this involves bootstrapping the original data, fitting a model to each bootstrapped data set, then averaging the predictions. For instance, rather than using an estimate \(\hat{\sigma}\) from the original data set, bootstrapped data sets \(y^{\textrm{boot}(1)}, \ldots, y^{\textrm{boot}(N)}\) are generated. Each is used to generate an estimate \(\hat{\sigma}^{\textrm{boot}(n)}.\) The final estimate is \[ \hat{\sigma} = \frac{1}{N} \sum_{n = 1}^N \hat{\sigma}^{\textrm{boot}(n)}. \] The same would be done to estimate a predictive quantity \(\tilde{y}\) for as yet unseen data. \[ \hat{\tilde{y}} = \frac{1}{N} \sum_{n = 1}^N \hat{\tilde{y}}^{\textrm{boot}(n)}. \] For discrete parameters, voting is used to select the outcome.
One way of viewing bagging is as a classical attempt to get something like averaging over parameter estimation uncertainty.