This function suggests an appropriate model size based on a default
decision rule. Note that the decision rules are heuristic
and should be interpreted as guidelines. It is recommended that the user
studies the results via
and makes the final decision based on what is most appropriate for the given problem.
suggest_size(object, stat = "elpd", alpha = 0.32, pct = 0, type = "upper", baseline = NULL, warnings = TRUE, ...)
stat: Statistic used for the decision. Default is 'elpd'. See
alpha: A number indicating the desired coverage of the credible intervals based on which the decision is made. E.g.
pct: A number indicating the relative proportion between baseline model and null model utilities one is willing to sacrifice. See details for more information.
type: Either 'upper' (default) or 'lower', determining whether the decisions are based on the upper or lower credible bounds. See details for more information.
baseline: Either 'ref' or 'best', indicating whether the baseline is the reference model or the best submodel found. Default is 'ref' when the reference model exists, and 'best' otherwise.
warnings: Whether to give warnings if the automatic suggestion fails; mainly for internal use. Default is TRUE, and there is usually no reason to set it to FALSE.
The suggested model size is the smallest model for which either the lower or upper (depending on argument type) credible bound of the submodel utility \(u_k\) with significance level alpha falls above
$$u_{\text{base}} - \text{pct} \cdot (u_{\text{base}} - u_0)$$
Here \(u_{\text{base}}\) denotes the utility of the baseline model and \(u_0\) the null model utility.
The baseline is either the reference model or the best submodel found (see argument baseline).
The lower and upper bounds are defined to contain the submodel utility with
probability 1-alpha (each tail has mass alpha/2).
By default, pct = 0, alpha = 0.32 and type='upper', which means that we select the smallest
model for which the upper credible bound exceeds the baseline model level, that is, which is better than the baseline
model with probability 0.16 (and consequently, worse with probability 0.84). In other words,
the estimated difference between the baseline model and submodel utilities is at most one standard error
away from zero, so the two utilities are considered to be close.
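The rule described above can be illustrated with a small stand-alone sketch. This is not projpred's internal code: the function name suggest_size_sketch, the made-up utilities and standard errors, and the use of a normal approximation (bound = estimate plus or minus qnorm(1 - alpha/2) times the standard error) are all assumptions made for illustration.

```r
# Hypothetical illustration of the decision rule; NOT projpred's implementation.
# Assumption: credible bounds follow a normal approximation u_k +/- z * se_k.
suggest_size_sketch <- function(u, se, u_base, u_0,
                                alpha = 0.32, pct = 0,
                                type = c("upper", "lower")) {
  type <- match.arg(type)
  # u[1], se[1] describe the null model (size 0), u[2], se[2] size 1, and so on.
  z <- qnorm(1 - alpha / 2)              # each tail has mass alpha / 2
  bound <- if (type == "upper") u + z * se else u - z * se
  cutoff <- u_base - pct * (u_base - u_0)
  ok <- which(bound > cutoff)            # sizes whose bound falls above the cutoff
  if (length(ok) == 0) return(NA_integer_)
  as.integer(min(ok) - 1L)               # convert index to model size
}

# Made-up elpd estimates and standard errors for model sizes 0 to 4.
u  <- c(-120, -105, -98, -95.5, -95.2)
se <- c(6, 4, 2.5, 2, 2)

suggest_size_sketch(u, se, u_base = -95, u_0 = -120)
suggest_size_sketch(u, se, u_base = -95, u_0 = -120, type = "lower")
suggest_size_sketch(u, se, u_base = -95, u_0 = -120, pct = 0.2)
```

With alpha = 0.32 the multiplier qnorm(0.84) is close to 1, which is why the default rule amounts to a one-standard-error criterion. Choosing type = "lower" is stricter, and in this toy data no size qualifies unless pct is raised.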
NOTE: Loss statistics like RMSE and MSE are converted to utilities by multiplying them by -1, so a call
suggest_size(object, stat='rmse', type='upper') should be interpreted as finding
the smallest model whose upper credible bound of the negative RMSE exceeds the cutoff level
(or equivalently, whose lower credible bound of RMSE falls below the cutoff level). This is done to make
the interpretation of the argument type the same regardless of argument stat.
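The equivalence stated in the note can be written out in one line. The symbols here are introduced only for illustration and do not appear in the package: \(r_k\) is the RMSE of the submodel of size \(k\), \(c\) is the cutoff on the RMSE scale (so \(-c\) is the cutoff on the utility scale), and \(\operatorname{upper}(\cdot)\) and \(\operatorname{lower}(\cdot)\) denote the bounds of an equal-tailed credible interval. Negating a quantity mirrors its interval, so \(\operatorname{upper}(-r_k) = -\operatorname{lower}(r_k)\), and therefore

$$\operatorname{upper}(-r_k) > -c \iff -\operatorname{lower}(r_k) > -c \iff \operatorname{lower}(r_k) < c$$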
### Usage with stanreg objects
library(rstanarm)  # provides stan_glm()
library(projpred)  # provides cv_varsel() and suggest_size()
fit <- stan_glm(y ~ x, binomial())
vs <- cv_varsel(fit)
suggest_size(vs)