33.4 Bagging

This is an old version, view current version.

When bootstrapping is carried through inference it is known as bootstrap aggregation, or bagging, in the machine-learning literature (Breiman 1996). In the simplest case, this involves bootstrapping the original data, fitting a model to each bootstrapped data set, then averaging the predictions. For instance, rather than using an estimate \(\hat{\sigma}\) from the original data set, bootstrapped data sets \(y^{\textrm{boot}(1)}, \ldots, y^{\textrm{boot}(N)}\) are generated. Each is used to generate an estimate \(\hat{\sigma}^{\textrm{boot}(n)}.\) The final estimate is \[ \hat{\sigma} = \frac{1}{N} \sum_{n = 1}^N \hat{\sigma}^{\textrm{boot}(n)}. \] The same would be done to estimate a predictive quantity \(\tilde{y}\) for as yet unseen data. \[ \hat{\tilde{y}} = \frac{1}{N} \sum_{n = 1}^N \hat{\tilde{y}}^{\textrm{boot}(n)}. \] For discrete parameters, voting is used to select the outcome.

One way of viewing bagging is as a classical attempt to get something like averaging over parameter estimation uncertainty.

References

Breiman, Leo. 1996. “Bagging Predictors.” Machine Learning 24 (2): 123–40.