The E_loo()
function computes weighted expectations (means, variances,
quantiles) using the importance weights obtained from the
PSIS smoothing procedure. The expectations estimated by the
E_loo()
function assume that the PSIS approximation is working well.
A small Pareto k estimate is necessary,
but not sufficient, for E_loo()
to give reliable estimates. Additional
diagnostic checks for gauging the reliability of the estimates are in
development and will be added in a future release.
A numeric vector or matrix.
An object returned by psis()
.
Arguments passed to individual methods.
The type of expectation to compute. The options are
"mean"
, "variance"
, "sd"
, and "quantile"
.
For computing quantiles, a vector of probabilities.
Optionally, a vector or matrix (the same dimensions as x
)
of raw (not smoothed) log ratios. If working with log-likelihood values,
the log ratios are the negative of those values. If log_ratios
is
specified we are able to compute more accurate Pareto k
diagnostics specific to E_loo()
.
A named list with the following components:
value
The result of the computation.
For the matrix method, value
is a vector with ncol(x)
elements, with one exception: when type="quantile"
and
multiple values are specified in probs
the value
component of
the returned object is a length(probs)
by ncol(x)
matrix.
For the default/vector method the value
component is scalar, with
one exception: when type="quantile"
and multiple values
are specified in probs
the value
component is a vector with
length(probs)
elements.
pareto_k
Function-specific diagnostic.
For the matrix method it will be a vector of length ncol(x)
containing estimates of the shape parameter \(k\) of the
generalized Pareto distribution. For the default/vector method,
the estimate is a scalar. If log_ratios
is not specified when
calling E_loo()
, the smoothed log-weights are used to estimate
Pareto-k's, which may produce optimistic estimates.
For type="mean"
, type="var"
, and type="sd"
, the returned Pareto-k is
the maximum of the Pareto-k's for the left and right tail of \(hr\) and
the right tail of \(r\), where \(r\) is the importance ratio and
\(h=x\) for type="mean"
and \(h=x^2\) for type="var"
and
type="sd"
. For type="quantile"
, the returned Pareto-k is the Pareto-k
for the right tail of \(r\).
# \donttest{
# Use rstanarm package to quickly fit a model and get both a log-likelihood
# matrix and draws from the posterior predictive distribution
library("rstanarm")
#> Loading required package: Rcpp
#> This is rstanarm version 2.32.1
#> - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
#> - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
#> - For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores())
# data from help("lm")
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
d <- data.frame(
weight = c(ctl, trt),
group = gl(2, 10, 20, labels = c("Ctl","Trt"))
)
fit <- stan_glm(weight ~ group, data = d, refresh = 0)
yrep <- posterior_predict(fit)
dim(yrep)
#> [1] 4000 20
log_ratios <- -1 * log_lik(fit)
dim(log_ratios)
#> [1] 4000 20
r_eff <- relative_eff(exp(-log_ratios), chain_id = rep(1:4, each = 1000))
psis_object <- psis(log_ratios, r_eff = r_eff, cores = 2)
E_loo(yrep, psis_object, type = "mean")
#> Error in is_constant(x_i): could not find function "is_constant"
E_loo(yrep, psis_object, type = "var")
#> Error in is_constant(x_i): could not find function "is_constant"
E_loo(yrep, psis_object, type = "sd")
#> Error in is_constant(x_i): could not find function "is_constant"
E_loo(yrep, psis_object, type = "quantile", probs = 0.5) # median
#> $value
#> [1] 5.143318 4.990419 5.045964 4.893522 5.078653 5.088694 4.995732 5.106827
#> [9] 4.985569 5.044546 4.638359 4.719482 4.693008 4.770725 4.530231 4.727474
#> [17] 4.532588 4.644159 4.692495 4.642926
#>
#> $pareto_k
#> [1] 0.15656938 0.21659000 0.13761689 0.30380728 0.15870293 0.13844575
#> [7] 0.13090888 0.13384083 0.20959799 0.11930881 0.18298672 0.21539893
#> [13] 0.07619255 0.26787598 0.33112990 0.27326698 0.42694305 0.16502194
#> [19] 0.11863410 0.12681907
#>
E_loo(yrep, psis_object, type = "quantile", probs = c(0.1, 0.9))
#> $value
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 4.202947 4.024490 4.069320 3.975519 4.072362 4.100821 3.999408 4.086386
#> [2,] 6.058854 5.971274 6.083626 5.812561 6.065001 6.079247 5.974741 6.080411
#> [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#> [1,] 3.974495 3.985170 3.685336 3.742536 3.720220 3.838895 3.661197 3.758993
#> [2,] 6.000139 6.057105 5.667853 5.731417 5.646776 5.683174 5.429682 5.687527
#> [,17] [,18] [,19] [,20]
#> [1,] 3.652822 3.657359 3.702788 3.663643
#> [2,] 5.394626 5.655873 5.676459 5.635945
#>
#> $pareto_k
#> [1] 0.15656938 0.21659000 0.13761689 0.30380728 0.15870293 0.13844575
#> [7] 0.13090888 0.13384083 0.20959799 0.11930881 0.18298672 0.21539893
#> [13] 0.07619255 0.26787598 0.33112990 0.27326698 0.42694305 0.16502194
#> [19] 0.11863410 0.12681907
#>
# We can get more accurate Pareto k diagnostic if we also provide
# the log_ratios argument
E_loo(yrep, psis_object, type = "mean", log_ratios = log_ratios)
#> Error in is_constant(x_i): could not find function "is_constant"
# }