33.3 Error statistics from the bootstrap
Running the model multiple times produces a Monte Carlo sample of estimates, one for each of the alternative data sets subsampled from the original data set. The error distribution is just the distribution of the bootstrap estimates minus the estimate for the original data set.
To estimate standard errors and confidence intervals for maximum likelihood estimates, the Stan program is executed multiple times using optimization (which turns off Jacobian adjustments for constraints and finds maximum likelihood estimates). On the order of one hundred replicates is typically enough to get a good sense of the standard error; more are needed to accurately estimate the boundaries of a 95% confidence interval. On the other hand, given that there is inherent variance due to sampling the original data \(y\), it is usually not worth calculating bootstrap estimates to high precision.
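For concreteness, here is a minimal sketch of how the replicated optimizations might be driven from CmdStanPy. The file name `bootstrap.stan`, the names of the data variables, and the `resample` flag are assumptions standing in for the bootstrap program of the previous section, and the covariate distribution is also assumed.

```python
import numpy as np
from cmdstanpy import CmdStanModel

# Simulate a data set like the one in the running example
# (N = 50, alpha = 1.2, beta = -0.5, sigma = 1.5); drawing the
# covariates from a standard normal is an assumption here.
rng = np.random.default_rng(1234)
N = 50
x = rng.normal(size=N)
y = 1.2 - 0.5 * x + rng.normal(scale=1.5, size=N)

# "bootstrap.stan" stands in for the bootstrap program above, which is
# assumed to resample (x, y) internally when the resample flag is 1.
model = CmdStanModel(stan_file="bootstrap.stan")

M = 100  # number of bootstrap replicates
estimates = np.empty((M, 3))
for m in range(M):
    # Maximum likelihood estimate for one subsampled data set;
    # varying the seed varies the resampling inside the Stan program.
    mle = model.optimize(
        data={"N": N, "x": x, "y": y, "resample": 1},
        seed=1234 + m,
    )
    params = mle.optimized_params_dict
    estimates[m] = [params["alpha"], params["beta"], params["sigma"]]
```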
33.3.1 Standard errors
Here’s the result of calculating standard errors for the linear regression model above with \(N = 50\) data points, \(\alpha = 1.2\), \(\beta = -0.5\), and \(\sigma = 1.5\). With a total of \(M = 100\) bootstrap samples, there are 100 estimates of \(\alpha\), 100 of \(\beta\), and 100 of \(\sigma\). These are then treated like Monte Carlo draws. For example, the sample standard deviation of the draws for \(\alpha\) provides the bootstrap estimate of the standard error of the estimate for \(\alpha\). Here’s what that looks like for the above model with \(M = 100\):
parameter   estimate   std err
---------   --------   -------
alpha          1.359     0.218
beta          -0.610     0.204
sigma          1.537     0.142
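Given an \(M \times 3\) matrix of bootstrap estimates like the one collected in the sketch above, the standard errors in the table are just the column-wise sample standard deviations, which might be computed as follows.

```python
# Bootstrap standard errors are the sample standard deviations of the
# bootstrap estimates (ddof=1 gives the sample variant).
std_errs = estimates.std(axis=0, ddof=1)
for name, se in zip(["alpha", "beta", "sigma"], std_errs):
    print(f"{name:5s}  std err = {se:.3f}")
```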
With the data set fixed, these estimates of the standard error will display some Monte Carlo error. For example, here are the standard error estimates for \(\alpha\) from five more runs holding the data the same, but allowing the subsampling within Stan to vary:
parameter   estimate   std err
---------   --------   -------
alpha          1.359     0.206
alpha          1.359     0.240
alpha          1.359     0.234
alpha          1.359     0.249
alpha          1.359     0.227
Increasing \(M\) will reduce the Monte Carlo error, but doing so is usually not worth the extra computation time because there is so much other uncertainty due to the original data sample \(y\).
33.3.2 Confidence intervals
As usual with Monte Carlo methods, confidence intervals are estimated using quantiles of the draws. That is, if there are \(M = 1000\) estimates of \(\hat{\alpha}\) from different subsamples, the 2.5% and 97.5% quantiles pick out the boundaries of the 95% confidence interval around the estimate for the actual data set \(y\). Getting accurate estimates of these boundary quantiles requires a much larger number of Monte Carlo simulations than estimating a standard error does (roughly twenty times as many as are needed for the median).
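Continuing the sketch above, the interval boundaries can be read off the empirical quantiles of the bootstrap estimates.

```python
# 95% bootstrap confidence intervals from the 2.5% and 97.5% quantiles
# of the bootstrap estimates; accurate tail quantiles need a larger M
# than standard errors do.
lower, upper = np.quantile(estimates, [0.025, 0.975], axis=0)
for name, lo, hi in zip(["alpha", "beta", "sigma"], lower, upper):
    print(f"{name:5s}  95% CI = ({lo:.3f}, {hi:.3f})")
```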