Estimate the Pareto k value by fitting a Generalized Pareto Distribution to one or two tails of x. The estimate can be used to infer the number of fractional moments, which is useful for convergence diagnostics. For further details see Vehtari et al. (2024).
Usage
pareto_khat(x, ...)
# Default S3 method
pareto_khat(
  x,
  tail = c("both", "right", "left"),
  r_eff = NULL,
  ndraws_tail = NULL,
  verbose = FALSE,
  are_log_weights = FALSE,
  ...
)
# S3 method for class 'rvar'
pareto_khat(x, ...)
Arguments
- x
(multiple options) One of:
  - A matrix of draws for a single variable (iterations x chains). See extract_variable_matrix().
  - An rvar.
- ...
Arguments passed to individual methods (if applicable).
- tail
(string) The tail to diagnose/smooth:
  - "right": diagnose/smooth only the right (upper) tail
  - "left": diagnose/smooth only the left (lower) tail
  - "both": diagnose/smooth both tails and return the maximum k-hat value
The default is "both".
- r_eff
(numeric) Relative effective sample size estimate. If r_eff is NULL, it will be calculated assuming the draws are from MCMC. Default is NULL.
- ndraws_tail
(numeric) Number of draws for the tail. If ndraws_tail is not specified, it will be calculated as ceiling(3 * sqrt(length(x) / r_eff)) if length(x) > 225 and length(x) / 5 otherwise (see Appendix H in Vehtari et al. (2024)).
- verbose
(logical) Should diagnostic messages be printed? If TRUE, messages related to Pareto diagnostics will be printed. Default is FALSE.
- are_log_weights
(logical) Are the draws log weights? Default is FALSE. If TRUE, the computation will take into account that the draws are log weights, and only the right tail will be smoothed.
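The default tail size described for ndraws_tail can be sketched in base R. This is only an illustration of the documented rule: the helper name default_ndraws_tail is hypothetical and not part of the package, and r_eff = 1 is an assumed placeholder rather than the value the function would estimate.

```r
# Hypothetical helper illustrating the documented default for ndraws_tail.
# Not part of the package API; r_eff = 1 is an assumed placeholder.
default_ndraws_tail <- function(n, r_eff = 1) {
  if (n > 225) {
    # Larger samples: tail size grows with the square root of n / r_eff
    ceiling(3 * sqrt(n / r_eff))
  } else {
    # Small samples: use one fifth of the draws
    n / 5
  }
}

n_400  <- default_ndraws_tail(400)                 # 3 * sqrt(400) = 60
n_1000 <- default_ndraws_tail(1000, r_eff = 0.5)   # ceiling(3 * sqrt(2000)) = 135
n_100  <- default_ndraws_tail(100)                 # 100 / 5 = 20
```

A lower r_eff increases the tail size, since correlated draws carry less information per draw.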
Value
If the input is an array, returns a single numeric value. If any of the draws
is non-finite, that is, NA, NaN, Inf, or -Inf, the returned output
will be (numeric) NA. Also, if all draws within any of the chains of a
variable are the same (constant), the returned output will be (numeric) NA
as well. The reason for the latter is that, for constant draws, we cannot
distinguish between variables that are supposed to be constant (e.g., a
diagonal element of a correlation matrix is always 1) and variables that just
happened to be constant because of a failure of convergence or other problems
in the sampling process.
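The two conditions that lead to an NA result can be checked up front. The following base R sketch assumes an iterations x chains matrix as input; the helper khat_would_be_na is hypothetical and only mirrors the rules stated above, not the package's internal code.

```r
# Hypothetical pre-check mirroring the documented NA rules.
# Assumes x is a matrix of draws (iterations x chains).
khat_would_be_na <- function(x) {
  x <- as.matrix(x)
  # Rule 1: any NA, NaN, Inf, or -Inf among the draws
  has_non_finite <- any(!is.finite(x))
  if (has_non_finite) {
    return(TRUE)
  }
  # Rule 2: all draws within at least one chain are identical
  chain_is_constant <- apply(x, 2, function(chain) all(chain == chain[1]))
  any(chain_is_constant)
}

draws_ok <- matrix(seq_len(40) / 7, nrow = 10, ncol = 4)
draws_inf <- draws_ok
draws_inf[1, 1] <- Inf        # one non-finite draw
draws_const <- draws_ok
draws_const[, 2] <- 1         # second chain is constant

na_ok    <- khat_would_be_na(draws_ok)
na_inf   <- khat_would_be_na(draws_inf)
na_const <- khat_would_be_na(draws_const)
```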
If the input is an rvar, returns an array of the same dimensions as the
rvar, where each element is equal to the value that would be returned by
passing the draws array for that element of the rvar to this function.
References
Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao and Jonah Gabry (2024). Pareto Smoothed Importance Sampling. Journal of Machine Learning Research, 25(72):1-58.
See also
pareto_diags for additional related diagnostics, and
pareto_smooth for Pareto smoothed draws.
Other diagnostics:
ess_basic(),
ess_bulk(),
ess_quantile(),
ess_sd(),
ess_tail(),
mcse_mean(),
mcse_quantile(),
mcse_sd(),
pareto_diags(),
rhat(),
rhat_basic(),
rhat_nested(),
rstar()
Examples
mu <- extract_variable_matrix(example_draws(), "mu")
pareto_khat(mu)
#> [1] 0.1883631
d <- as_draws_rvars(example_draws("multi_normal"))
pareto_khat(d$Sigma)
#>            [,1]       [,2]        [,3]
#> [1,] 0.04795008 0.04397814  0.04538642
#> [2,] 0.04397814 0.08793028  0.07088579
#> [3,] 0.04538642 0.07088579 -0.08599429
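As a further sketch, the tail argument can be used to diagnose each tail separately and compare the results with the default "both". The example assumes the posterior package is installed (hence the requireNamespace guard) and reuses only functions shown on this page. No output is shown because the values depend on the installed version.

```r
# Sketch: compare k-hat for each tail of the same variable.
# Guarded so the code is skipped if the posterior package is not installed.
if (requireNamespace("posterior", quietly = TRUE)) {
  mu <- posterior::extract_variable_matrix(posterior::example_draws(), "mu")

  k_right <- posterior::pareto_khat(mu, tail = "right")  # upper tail only
  k_left  <- posterior::pareto_khat(mu, tail = "left")   # lower tail only
  k_both  <- posterior::pareto_khat(mu, tail = "both")   # documented as the maximum over both tails

  print(list(right = k_right, left = k_left, both = k_both))
}
```

According to the description of tail above, the "both" result should correspond to the larger of the two single-tail estimates.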