Perform cross-validation for the projective variable selection for a generalized linear model.

cv_varsel(fit, method = NULL, cv_method = NULL, ns = NULL,
  nc = NULL, nspred = NULL, ncpred = NULL, relax = NULL,
  nv_max = NULL, intercept = NULL, penalty = NULL, verbose = T,
  nloo = NULL, K = NULL, lambda_min_ratio = 1e-05, nlambda = 150,
  thresh = 1e-06, regul = 1e-04, validate_search = T, seed = NULL,



fit: Same as in varsel.


method: Same as in varsel.


cv_method: The cross-validation method, either 'LOO' or 'kfold'. Default is 'LOO'.


ns: Number of samples used for selection. Ignored if nc is provided or if method='L1'.


nc: Number of clusters used for selection. Default is 1. Ignored if method='L1' (L1 search always uses one cluster).


nspred: Number of samples used for prediction (after selection). Ignored if ncpred is given.


ncpred: Number of clusters used for prediction (after selection). Default is 5.
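To make the difference between working with individual draws and working with clusters concrete, the sketch below groups simulated posterior draws with base R's kmeans and represents each cluster by its center and its share of the draws. It is only an illustration under stated assumptions: the draws are simulated, and this page does not describe how cv_varsel forms its clusters, so k-means on the raw draws is an assumption.

```r
# Illustrative only: simulated "posterior draws" (S draws of 3 coefficients).
set.seed(1)
S <- 400
draws <- cbind(rnorm(S, 0.5, 0.2), rnorm(S, -1, 0.3), rnorm(S, 0, 0.1))

nc <- 5  # number of clusters, playing the role of ncpred = 5

# Group the draws into nc clusters (k-means is an assumption, see above).
cl <- kmeans(draws, centers = nc, nstart = 10, iter.max = 50)

# Each cluster is represented by its center, weighted by its share of draws.
centers <- cl$centers
weights <- cl$size / S
```

Fewer clusters mean fewer projections to compute, at the cost of a coarser summary of the posterior.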


relax: Same as in varsel.


nv_max: Same as in varsel.


intercept: Same as in varsel.


penalty: Same as in varsel.


verbose: Whether to print out some information during the validation. Default is TRUE.


nloo: Number of observations used to compute the LOO validation (anything between 1 and the total number of observations). Smaller values lead to faster computation but higher uncertainty (larger error bars) in the accuracy estimate. By default all observations are used, but for faster experimentation this can be set to a small value such as 100. Only applicable if cv_method = 'LOO'.
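The trade-off between speed and uncertainty can be illustrated with a few lines of base R. The sketch below treats simulated numbers as pointwise LOO contributions and compares the standard error of their mean when all observations are used with the one obtained from a subsample of 100. The data, the helper subsample_estimate, and the use of simple random sampling are assumptions made for illustration; this page does not describe how cv_varsel chooses the subsample.

```r
# Illustrative only: pretend these are pointwise LOO contributions
# (e.g. log predictive densities) for n observations.
set.seed(1)
n <- 1000
pointwise <- rnorm(n, mean = -1.2, sd = 0.8)

# Mean and standard error of the estimate from a simple random subsample.
# Simple random sampling is an assumption, not the package's documented scheme.
subsample_estimate <- function(x, nloo) {
  idx <- sample(length(x), nloo)
  c(mean = mean(x[idx]), se = sd(x[idx]) / sqrt(nloo))
}

full  <- subsample_estimate(pointwise, n)    # all observations
small <- subsample_estimate(pointwise, 100)  # nloo = 100
```

Because the standard error scales roughly with 1 / sqrt(nloo), using a tenth of the observations widens the error bars by about a factor of three.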


K: Number of folds in the k-fold cross-validation. Only applicable if cv_method = 'kfold' and k_fold = NULL.
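As a reminder of what K controls, the following base R sketch assigns observations to K folds of nearly equal size. It is a generic illustration of k-fold splitting, not the fold assignment used inside cv_varsel, and the values of n and K are arbitrary.

```r
# Illustrative only: assign n observations to K folds of near-equal size.
set.seed(1)
n <- 23
K <- 5

# Repeat the fold labels 1..K up to length n, then shuffle them.
folds <- sample(rep(seq_len(K), length.out = n))

# For fold k, a model would be fitted on train_idx and evaluated on test_idx.
for (k in seq_len(K)) {
  test_idx  <- which(folds == k)
  train_idx <- which(folds != k)
  # ... fit on train_idx, predict for test_idx ...
}

table(folds)  # fold sizes differ by at most one
```

Each observation is held out exactly once, so larger K means more refits on larger training sets.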


lambda_min_ratio: Same as in varsel.


nlambda: Same as in varsel.


thresh: Same as in varsel.


regul: Amount of regularization in the projection. Regularization is usually unnecessary, but for some models the projection can be ill-behaved, and a small amount of regularization then avoids numerical problems.
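The kind of numerical problem that a small amount of regularization avoids can be shown with an ordinary least-squares toy problem. The sketch below adds a ridge term of size 1e-4 to a singular system; it illustrates the general idea only, and how regul enters the projection in the package is not specified on this page. For simplicity the intercept is penalized as well.

```r
# Illustrative only: a least-squares problem with two identical columns,
# so the normal equations are singular without regularization.
set.seed(1)
n <- 100
x1 <- rnorm(n)
X <- cbind(1, x1, x1)          # third column duplicates the second
y <- 1 + 2 * x1 + rnorm(n, sd = 0.1)

regul <- 1e-4                  # same magnitude as the default in the usage
A <- crossprod(X)              # t(X) %*% X, singular here
A_reg <- A + regul * diag(ncol(X))

rcond(A)      # reciprocal condition number of the singular system
rcond(A_reg)  # small, but large enough for solve() to succeed

# Ridge-type solution of the regularized normal equations.
beta_reg <- solve(A_reg, crossprod(X, y))
```

The regularized solution splits the slope between the two identical columns, which leaves the fitted values essentially unchanged.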


validate_search: Whether to cross-validate the selection process as well, that is, whether to perform the selection separately for each fold. Default is TRUE, and we strongly recommend not setting this to FALSE, because doing so is known to bias the accuracy estimates for the selected submodels. However, setting this to FALSE can sometimes be useful: comparing the results with those obtained with TRUE gives an idea of how strongly the feature selection is (over)fitted to the data (the difference corresponds to the search degrees of freedom, or the effective number of parameters introduced by the selection process).
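The bias from not cross-validating the search can be reproduced in a small base R experiment. In the sketch below the response is pure noise and the "search" simply picks the predictor most correlated with it; the error is then estimated by k-fold cross-validation with the search either repeated inside every fold or run once on all data. This is a generic illustration with hypothetical helpers (select_best, cv_mse), not the procedure implemented in cv_varsel.

```r
# Illustrative only: the response is pure noise, so no predictor is
# truly relevant and an honest error estimate should be near var(y).
set.seed(1)
n <- 60; p <- 200; K <- 5
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
folds <- sample(rep(seq_len(K), length.out = n))

# "Search": pick the single predictor most correlated with y.
select_best <- function(X, y) which.max(abs(cor(X, y)))

# K-fold mean squared error; the search is either repeated inside
# every fold (validate = TRUE) or done once on all data (FALSE).
cv_mse <- function(validate) {
  j_full <- select_best(X, y)
  sq_err <- numeric(n)
  for (k in seq_len(K)) {
    test <- folds == k
    j <- if (validate) {
      select_best(X[!test, , drop = FALSE], y[!test])
    } else {
      j_full
    }
    fit <- lm.fit(cbind(1, X[!test, j]), y[!test])
    pred <- drop(cbind(1, X[test, j]) %*% fit$coefficients)
    sq_err[test] <- (y[test] - pred)^2
  }
  mean(sq_err)
}

mse_validated   <- cv_mse(TRUE)   # search cross-validated
mse_unvalidated <- cv_mse(FALSE)  # search done once on all data
```

In experiments of this kind the second estimate is typically the smaller one, even though nothing can be predicted; the gap between the two reflects how strongly the search has adapted to the data.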


seed: Random seed used in the subsampling LOO. A fixed seed is used by default.


...: Additional arguments to be passed to the get_refmodel function.


An object of type cvsel that contains information about the feature selection. The fields are not meant to be accessed directly by the user but via the helper functions (see the vignettes, or type ?projpred for the main functions in the package).


### Usage with stanreg objects
fit <- stan_glm(y~x, binomial())
cvs <- cv_varsel(fit)