Extract, set, and expand preference criteria for a “cv” object for iteratively fitted models
set_pref_iter() modifies a model or cross-validated model (usually an iteratively fitted model, IFM)
by altering its preferences with respect to iterations,
without re-fitting the model or re-running the cross-validation, respectively.
This is a generic function and its action varies depending on the class of the object being modified.
See the “Details” section below.
extract_pref_iter() extracts the information on preferred iterations from a “cv” object.
expand_pref_iter() converts a “cv” object with multiple preferred iterations to a “cv” object
having several (identical) models, but different preferred iterations.
Usage
set_pref_iter(x, iter, ...)
# S3 method for model
set_pref_iter(x, ...)
# S3 method for model_fm_xgb
set_pref_iter(x, iter, verbose = TRUE, warn = TRUE, ...)
# S3 method for model_fm_glmnet
set_pref_iter(x, iter, lambda = NULL, verbose = TRUE, warn = TRUE, ...)
# S3 method for cv
set_pref_iter(
x,
iter,
which = label.cv(x),
label = label.cv(x),
keep_all = TRUE,
...
)
extract_pref_iter(x, ...)
expand_pref_iter(x, iter = NULL, which = label.cv(x))

Arguments
- x
A cross-validated model, of class “cv”, usually based on an iteratively fitted model (IFM, see ifm). In set_pref_iter(), x can also be a model or a fitted model.
- iter
Usually an integer value. If x is a “cv”, it can also be a preference criterion (see crit_iter) -- multiple criteria are not allowed. If x is a “model” and has cross-validation information, iter can be omitted; the preferred iteration from cross-validation will then be selected.
Arguments passed to methods.
- verbose
Logical: Show information on the modification of arguments in the model-generating call?
- warn
Logical: Whether to issue a warning if the required information on preferred iterations is not available.
- lambda
Vector lambda (of decreasing numeric values) to pass to glmnet().
- which
Character, integer or logical vector specifying the cross-validated models to be modified or expanded.
- label
label of the output object.
- keep_all
Logical: If TRUE, all preference criteria are kept; if FALSE, only the selected one. keep_all is relevant only if there is more than one preference criterion.
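To illustrate the two forms that iter can take when x is a “cv” object, here is a sketch only, reusing the cvm object created in the Examples below; crit_last() is one of the criteria described under crit_iter:

```r
# Sketch, assuming cvm is the cross-validated model from the Examples:
# iter given as a plain integer -- select that iteration directly
set_pref_iter(cvm, iter = 35)
# iter given as a preference criterion -- select the iteration it picks
set_pref_iter(cvm, iter = crit_last())
```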
Details
What set_pref_iter() does varies between classes:
If x is a fitted model, iter is a required argument and must be a positive integer. The pref_iter component in the fitted object is set equal to iter, and the call is adjusted. The adjustment of the call consists in setting pref_iter and making further changes that vary between model classes: if the actual model is an “fm_xgb”, nrounds and early_stopping_rounds are adjusted; in the case of an “fm_glmnet”, the value of gamma is changed. These changes are such that predict() will return predictions from the iter-th iteration, and execution of the adjusted call generates a model that shares this behavior and stops the fitting process immediately after that iteration.
set_pref_iter() is in some sense improper, because the object is not identical to what would result from executing the call stored in that model. The resulting predictions would, however, be the same.
If x is an object of class “model”, its call element is adjusted, exactly as described above for fitted models. If x has a model fit attached to it (i.e., if has_fit(x) returns TRUE), this fitted model is adjusted, too. If x is a model but not an IFM, it is returned unchanged.
If x is a cross-validated model (class “cv”), the information about preferred iterations (stored as part of the component extras) is modified, and the “model” is adapted, too.
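The fitted-model case can be sketched as follows (a sketch only: simuldat() and fm_xgb() are assumed to be used as in the Examples below, and the comments restate the behavior described above rather than verified output):

```r
# Sketch only: fit an "fm_xgb" model, then fix its preferred iteration
d <- simuldat(n = 5000)
fit <- fm_xgb(Y ~ ., data = d, nrounds = 200)
fit30 <- set_pref_iter(fit, iter = 30)
# The pref_iter component is now 30 and the stored call has been adjusted:
# predict(fit30, ...) returns predictions from the 30th iteration, and
# re-executing the adjusted call stops the fitting right after it.
```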
There is currently no method set_pref_iter.multimodel().
Examples
d <- simuldat(n = 5000)
m <- model("fm_xgb", Y ~ ., d, nrounds = 200, class = "fm_xgb",
label = "xgb")
cvm <- cv(m, iter = c(crit_min(), crit_last(), crit_overfit(.5)),
nfold = .3)
print(cvm, what = "call")
#> --- A “cv” object containing 1 validated model ---
#>
#> Validation procedure: Simple Hold-out Validation
#> Number of obs in data: 5000
#> Number of test sets: 1
#> Size of test set: 1500
#> Size of training set: 3500
#>
#> Model:
#>
#> ‘xgb’:
#> metric: rmse
#> call: fm_xgb(formula = Y ~ ., data = data, nrounds = 200)
#>
#> Preferred iterations:
#> model ‘xgb’: min (iter=67), last (iter=77),
#> overfit0.5 (iter=35)
extract_pref_iter(cvm)
#> Preferred iterations:
#> model ‘xgb’: min (iter=67), last (iter=77),
#> overfit0.5 (iter=35)
cv_performance(cvm)
#> --- Performance table ---
#> Metric: rmse
#> train_rmse test_rmse iteration time_cv
#> xgb 0.5323 1.4749 67 0.378
plot(evaluation_log(cvm))
# set_pref_iter
cvm_last <- set_pref_iter(cvm, crit_last(), label = "xgb_last")
cv_performance(c(cvm, cvm_last))
#> --- Performance table ---
#> Metric: rmse
#> train_rmse test_rmse iteration time_cv
#> xgb 0.53230 1.4749 67 0.378
#> xgb_last 0.46186 1.4763 77 0.378
# expand_pref_iter
cvm_expanded <- expand_pref_iter(cvm)
print(cvm_expanded, what = c("call"))
#> --- A “cv” object containing 3 validated models ---
#>
#> Validation procedure: Simple Hold-out Validation
#> Number of obs in data: 5000
#> Number of test sets: 1
#> Size of test set: 1500
#> Size of training set: 3500
#>
#> Models:
#>
#> ‘xgb_min’:
#> metric: rmse
#> call: fm_xgb(formula = Y ~ ., data = data, nrounds = 200)
#>
#> ‘xgb_last’:
#> metric: rmse
#> call: fm_xgb(formula = Y ~ ., data = data, nrounds = 200)
#>
#> ‘xgb_overfit0.5’:
#> metric: rmse
#> call: fm_xgb(formula = Y ~ ., data = data, nrounds = 200)
#>
#> Preferred iterations:
#> model ‘xgb_min’: min (iter=67)
#> model ‘xgb_last’: last (iter=77)
#> model ‘xgb_overfit0.5’: overfit0.5 (iter=35)
cv_performance(cvm_expanded)
#> --- Performance table ---
#> Metric: rmse
#> train_rmse test_rmse iteration time_cv
#> xgb_min 0.53230 1.4749 67 0.378
#> xgb_last 0.46186 1.4763 77 0.378
#> xgb_overfit0.5 0.75636 1.5010 35 0.378
plot(evaluation_log(cvm_expanded))