Smooth the tail draws of x by replacing tail draws by order statistics of a generalized Pareto distribution fit to the tail(s). For further details see Vehtari et al. (2024).
pareto_smooth(x, ...)
# S3 method for rvar
pareto_smooth(x, return_k = FALSE, extra_diags = FALSE, ...)
# S3 method for default
pareto_smooth(
x,
tail = c("both", "right", "left"),
r_eff = NULL,
ndraws_tail = NULL,
return_k = FALSE,
extra_diags = FALSE,
verbose = TRUE,
are_log_weights = FALSE,
...
)
(multiple options) One of:
A matrix of draws for a single variable (iterations x chains). See
extract_variable_matrix()
.
An rvar
.
Arguments passed to individual methods (if applicable).
(logical) Should the Pareto khat be included in
output? If TRUE
, output will be a list containing smoothed
draws and diagnostics, otherwise it will be a numeric of the
smoothed draws. Default is FALSE
.
(logical) Should extra Pareto khat diagnostics
be included in output? If TRUE
, min_ss
, khat_threshold
and
convergence_rate
for the estimated k value will be
returned. Default is FALSE
.
(string) The tail to diagnose/smooth:
"right"
: diagnose/smooth only the right (upper) tail
"left"
: diagnose/smooth only the left (lower) tail
"both"
: diagnose/smooth both tails and return the maximum k-hat value
The default is "both"
.
(numeric) relative effective sample size estimate. If
r_eff
is NULL, it will be calculated assuming the draws are
from MCMC. Default is NULL.
(numeric) number of draws for the tail. If
ndraws_tail
is not specified, it will be calculated as
ceiling(3 * sqrt(length(x) / r_eff)) if length(x) > 225 and
length(x) / 5 otherwise (see Appendix H in Vehtari et
al. (2024)).
(logical) Should diagnostic messages be printed? If
TRUE
, messages related to Pareto diagnostics will be
printed. Default is FALSE
.
(logical) Are the draws log weights? Default is
FALSE
. If TRUE
computation will take into account that the
draws are log weights, and only right tail will be smoothed.
Either a vector x
of smoothed values or a named list
containing the vector x
and a named list diagnostics
containing numeric values:
khat
: estimated Pareto k shape parameter, and optionally
min_ss
: minimum sample size for reliable Pareto smoothed
estimate
khat_threshold
: sample size specific khat threshold for
reliable Pareto smoothed estimates
convergence_rate
: Relative convergence rate for
Pareto smoothed estimates
If any of the draws is non-finite, that is, NA
, NaN
, Inf
, or
-Inf
, Pareto smoothing will not be performed, and the original
draws will be returned and and diagnostics will be NA
(numeric).
Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao and Jonah Gabry (2024). Pareto Smoothed Importance Sampling. Journal of Machine Learning Research, 25(72):1-58. PDF
pareto_khat
for only calculating khat, and
pareto_diags
for additional diagnostics.
mu <- extract_variable_matrix(example_draws(), "mu")
pareto_smooth(mu)
#> Pareto k-hat = 0.2.
#> chain
#> iteration 1 2 3 4
#> 1 2.0058311 2.99038071 1.79436801 6.45897880
#> 2 1.4583161 8.07284940 5.98637117 9.27304630
#> 3 5.8149473 -1.33971154 2.55720201 0.30130121
#> 4 6.8495862 11.42146271 2.79442522 3.69252884
#> 5 1.8051677 10.10811381 0.11457536 5.48027067
#> 6 3.8412425 -10.90481295 1.06361700 2.37951413
#> 7 5.4742729 -7.67893226 3.67385716 12.37474175
#> 8 1.2030617 1.79105196 3.50583836 4.90456023
#> 9 0.1923430 5.35025400 8.86764324 0.87963558
#> 10 7.1729098 0.86185525 8.92963703 3.81374732
#> 11 0.9910314 10.22878590 1.89644594 3.43243605
#> 12 -1.5650982 7.17842051 4.26930915 -5.18091530
#> 13 0.7175292 7.30390277 0.49592463 -4.51701690
#> 14 5.4475057 0.22963730 -0.30118385 7.38133431
#> 15 6.1805031 7.87716434 0.55494635 2.21244092
#> 16 8.1143336 13.71124994 6.10796214 6.09010450
#> 17 7.4109742 -4.01010957 6.13616693 8.63747217
#> 18 5.0653065 9.34967709 0.93917231 10.81154206
#> 19 6.5950099 7.33764114 -0.24854260 2.29580552
#> 20 -6.1230828 -0.41253795 4.01895927 8.03219045
#> 21 10.9923672 4.82695606 5.36039410 6.59446197
#> 22 8.7492153 -0.35580209 0.63877913 4.77379635
#> 23 8.4807964 0.83891018 2.18232401 -1.69043575
#> 24 9.5127659 0.33576205 -0.47154715 8.38276275
#> 25 1.3686009 8.43118203 2.37151659 1.96883059
#> 26 6.3283157 9.59984473 1.99824582 5.52765102
#> 27 7.8038181 0.26594097 3.00600558 6.10060909
#> 28 5.5044978 3.52617924 2.61449547 5.13685962
#> 29 7.1543373 6.63504133 4.05403279 6.26798609
#> 30 7.1543373 4.87313661 4.72705540 6.21822025
#> 31 4.4980414 5.74947888 2.07701271 2.24644098
#> 32 3.9728593 -0.05536469 1.71709968 1.01563071
#> 33 1.2466721 9.69110598 5.67656775 4.37682520
#> 34 6.7154469 1.25631396 7.35210528 -0.14869286
#> 35 2.5934765 4.60053327 3.34758639 6.85536829
#> 36 4.1509232 2.62214416 3.46540535 0.79173490
#> 37 5.5769881 7.08964546 7.56578659 7.31015271
#> 38 6.4795882 3.27130941 1.07576229 0.69180892
#> 39 6.7149149 10.49687754 0.87317333 3.45283097
#> 40 3.0928339 -2.97728532 8.28927887 6.15700804
#> 41 4.6676343 5.77760488 9.19934064 6.94590217
#> 42 0.4341460 3.34594153 7.76809799 7.69845246
#> 43 7.6644860 4.58004814 3.64745811 5.71361755
#> 44 2.7710241 6.25215016 4.02172175 1.09474532
#> 45 -2.5071234 3.03653496 5.59027281 4.14183957
#> 46 3.6239761 6.37787278 3.77182163 4.65610892
#> 47 4.1775368 -1.14178246 6.76726892 4.83530335
#> 48 5.1891338 10.35792448 6.73120973 2.36025147
#> 49 3.7007374 2.26215093 -2.13372165 5.18949994
#> 50 7.9148389 1.94726080 -0.80750891 6.96445721
#> 51 2.5890314 9.05983209 -3.26518724 0.76747482
#> 52 2.4826160 6.45144266 8.99365227 4.92198428
#> 53 0.9075757 4.76285987 9.99482330 4.12501340
#> 54 0.3693646 -2.72735886 4.85299318 4.38195376
#> 55 7.9532159 8.19992395 0.03218316 0.58349237
#> 56 2.3329707 6.41479763 5.44475101 3.08923892
#> 57 4.3324681 6.06940520 6.49611407 5.61329746
#> 58 1.7299121 3.43035241 2.08284445 5.80087096
#> 59 1.4196226 4.86096716 3.42615485 2.68913918
#> 60 6.5778662 6.71519862 3.86646326 4.23484000
#> 61 11.6837230 7.27580476 1.89767270 0.07398856
#> 62 -2.3106878 6.42851118 6.78704544 6.75807164
#> 63 -1.9729536 3.24236645 3.67681516 5.35730452
#> 64 13.6001013 9.42948716 2.80134400 3.36597040
#> 65 -3.6032069 -0.96573696 1.55813764 3.61442662
#> 66 12.8720357 8.69255780 5.07969585 3.35760729
#> 67 0.8886184 3.71978511 5.33168512 3.62716863
#> 68 0.9028057 5.74734105 4.05880826 0.81554158
#> 69 1.6773174 6.77865339 6.43319154 7.19935256
#> 70 7.7329831 8.80754168 6.14945199 -0.19775088
#> 71 9.7869937 0.46539458 6.75459412 1.54498835
#> 72 5.9190529 9.88802789 5.75546387 8.58386979
#> 73 2.7463754 -1.82585563 4.77007413 0.66556551
#> 74 6.4792428 11.19379834 4.44370854 8.24410926
#> 75 2.7683627 2.96836015 6.27212042 0.99399372
#> 76 7.4410372 7.01374501 3.15152477 2.54345797
#> 77 2.6677480 3.66672487 4.95460833 7.21851228
#> 78 5.5287068 1.41078743 5.29437266 5.12487891
#> 79 11.9938289 6.58536629 3.50990665 2.41520814
#> 80 -0.8846060 3.45698706 -0.53300259 3.81524258
#> 81 -0.1012630 3.94222115 -0.59709705 5.15624656
#> 82 5.2762736 5.83074377 -0.66404619 3.74296853
#> 83 -0.0109097 7.50248450 -0.73409233 0.52576589
#> 84 1.7616762 -1.23770609 1.00657412 7.99232314
#> 85 -1.0513068 7.00985107 3.14241272 7.84016569
#> 86 5.3275047 1.24852261 6.64058798 6.18030063
#> 87 6.3287644 1.86987051 4.78821655 5.81531118
#> 88 2.4525260 5.82211954 2.89403878 3.15575203
#> 89 3.7334839 3.37948604 9.12833565 3.22457344
#> 90 2.7919003 5.58737855 0.40214725 5.44679063
#> 91 6.1234455 5.89213877 6.38032817 1.06898827
#> 92 3.6517885 10.64734978 7.47153612 6.69428679
#> 93 5.4253306 3.95788604 8.33547985 5.16954115
#> 94 4.8155242 2.66672890 6.31708788 0.97997759
#> 95 3.5079380 7.59817058 1.33994762 5.68731206
#> 96 6.7751655 0.74274524 7.63106467 3.28055582
#> 97 1.8567982 8.15667906 3.92688491 5.03958552
#> 98 6.1728283 1.51990114 0.15400719 2.72636415
#> 99 1.5485347 8.53166943 3.17470254 0.61142881
#> 100 7.5338964 -1.44854601 3.57555760 7.04723309
d <- as_draws_rvars(example_draws("multi_normal"))
pareto_smooth(d$Sigma)
#> Pareto k-hat = 0.06.
#> Pareto k-hat = 0.04.
#> Pareto k-hat = 0.05.
#> Pareto k-hat = 0.04.
#> Pareto k-hat = 0.1.
#> Pareto k-hat = 0.06.
#> Pareto k-hat = 0.05.
#> Pareto k-hat = 0.06.
#> Pareto k-hat = -0.08.
#> rvar<100,4>[3,3] mean ± sd:
#> [,1] [,2] [,3]
#> [1,] 1.28 ± 0.17 0.53 ± 0.21 -0.40 ± 0.29
#> [2,] 0.53 ± 0.21 3.66 ± 0.45 -2.10 ± 0.49
#> [3,] -0.40 ± 0.29 -2.10 ± 0.49 8.12 ± 0.96