31.1 The bootstrap

31.1.1 Estimators

An estimator is nothing more than a function mapping a data set to one or more numbers, which are called “estimates.” For example, the mean function maps a data set \(y_1, \ldots, y_N\) to a number by \[ \textrm{mean}(y) = \frac{1}{N} \sum_{n=1}^N y_n, \] and hence meets the definition of an estimator. Given the likelihood function \[ p(y \mid \mu) = \prod_{n=1}^N \textrm{normal}(y_n \mid \mu, 1), \] the mean is the maximum likelihood estimator,

\[ \textrm{mean}(y) = \textrm{arg max}_{\mu} \ p(y \mid \mu). \] A Bayesian approach to point estimation would be to add a prior and use the posterior mean or median as an estimator. Alternatively, a penalty function could be added to the likelihood so that optimization produces a penalized maximum likelihood estimate. With any of these approaches, the estimator is just a function from data to a number.
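The claim that the mean maximizes this likelihood can be checked numerically. The following sketch (not from the text; it assumes NumPy and SciPy, and the simulated data values are arbitrary) minimizes the negative log likelihood of the \(\textrm{normal}(\mu, 1)\) model and compares the optimum to the sample mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# simulated data; the loc/scale/size values are arbitrary, for illustration only
y = np.random.default_rng(42).normal(loc=2.0, scale=1.0, size=50)

# negative log likelihood for normal(mu, 1), dropping additive constants
def nll(mu):
    return 0.5 * np.sum((y - mu) ** 2)

# the numerical optimum coincides with mean(y)
mle = minimize_scalar(nll).x
```

Because the negative log likelihood is a quadratic in \(\mu\), the optimizer recovers the sample mean to within numerical tolerance.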

In analyzing estimators, the data set is modeled as a random variable. It is assumed that the observed data is just one of many possible random samples of data that may have been produced. If the data is modeled as a random variable, then the estimator applied to the data is also a random variable. The simulations being done for the bootstrap are attempts to randomly sample replicated data sets and estimate the sampling properties of the estimators using standard Monte Carlo methods.

31.1.2 The bootstrap in pseudocode

The bootstrap works by applying an estimator to replicated data sets. These replicates are created by resampling the original data with replacement. The sample quantiles of the resulting estimates may then be used to estimate standard errors and confidence intervals.

The following pseudocode estimates 95% confidence intervals and standard errors for a generic estimate \(\hat{\theta}\) that is a function of data \(y\).

for (m in 1:M) {
  y_rep[m] <- sample_uniform(y)
  theta_hat[m] <- estimate_theta(y_rep[m])
}
std_error <- sd(theta_hat)
conf_95pct <- [ quantile(theta_hat, 0.025),
                quantile(theta_hat, 0.975) ]

The sample_uniform function works by independently assigning each element of y_rep an element of y drawn uniformly at random, so that y_rep has the same size as y. This produces a sample with replacement. That is, some elements of y may show up more than once in y_rep and some may not appear at all.
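The pseudocode above can be made concrete in a few lines. The following sketch (assuming NumPy; the function name `bootstrap_ci`, the default `M`, and the seed are all choices made here, not part of the text) resamples with replacement, applies a generic estimator, and reports the standard error and 95% interval.

```python
import numpy as np

def bootstrap_ci(y, estimator, M=10000, seed=1234):
    """Bootstrap standard error and 95% interval for estimator(y)."""
    rng = np.random.default_rng(seed)
    N = len(y)
    theta_hat = np.empty(M)
    for m in range(M):
        # resample N elements of y uniformly at random, with replacement;
        # this plays the role of sample_uniform in the pseudocode
        y_rep = rng.choice(y, size=N, replace=True)
        theta_hat[m] = estimator(y_rep)
    std_error = theta_hat.std(ddof=1)
    conf_95pct = np.quantile(theta_hat, [0.025, 0.975])
    return std_error, conf_95pct

# usage: bootstrap the mean of simulated normal data
y = np.random.default_rng(0).normal(loc=1.0, scale=1.0, size=100)
se, (lo, hi) = bootstrap_ci(y, np.mean)
```

Any estimator can be passed in place of `np.mean` (for example `np.median`), which is the sense in which the pseudocode's `estimate_theta` is generic.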