Generate and cross-validate models resulting from adding or removing variables and stepwise procedures
step_extend() combines all models resulting from adding one variable to a base model into a multimodel and subjects it to cv().
step_forward() applies step_extend() repeatedly, selecting the best model with respect to test error at each step, thus performing a forward selection of variables.
step_reduce() combines all models resulting from removing one variable from a full model into a multimodel and subjects it to cv().
step_backward() applies step_reduce() repeatedly, selecting the best model with respect to test error at each step, thus performing a backward elimination of variables.
best_subset() combines submodels of the full model in a multimodel and subjects it to cv(). The desired range of model sizes (number of effects) to include is specified in the parameter nvars.
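To make the forward procedure concrete, the following base R sketch mimics the idea of extending a base model by one variable at a time and keeping the candidate with the lowest cross-validated test error. It is an illustration only and does not reproduce the package's implementation: the helper cv_rmse(), the random fold assignment, and the choice to continue until every variable has been added are assumptions made for this example.

```r
# Conceptual sketch in base R; NOT the package's implementation.
set.seed(1)
response   <- "Sepal.Length"
candidates <- setdiff(names(iris), response)

# Assumed fold layout: 10 folds, assigned at random
folds <- sample(rep(1:10, length.out = nrow(iris)))

# Hypothetical helper: k-fold cross-validated test RMSE of an lm() fit
cv_rmse <- function(formula, data, folds) {
  y <- data[[all.vars(formula)[1]]]          # response values
  sq_err <- numeric(nrow(data))
  for (k in unique(folds)) {
    test <- folds == k
    fit  <- lm(formula, data = data[!test, ])
    sq_err[test] <- (y[test] - predict(fit, newdata = data[test, ]))^2
  }
  sqrt(mean(sq_err))
}

# Start from the intercept-only base model
selected  <- character(0)
remaining <- candidates
path <- data.frame(
  added     = "base",
  test_rmse = cv_rmse(reformulate("1", response), iris, folds)
)

while (length(remaining) > 0) {
  # "Extend" step: one candidate model per remaining variable
  errs <- sapply(remaining, function(v) {
    cv_rmse(reformulate(c(selected, v), response), iris, folds)
  })
  # Keep the candidate with the lowest test error
  best      <- names(which.min(errs))
  selected  <- c(selected, best)
  remaining <- setdiff(remaining, best)
  path <- rbind(path, data.frame(added = paste0("+", best), test_rmse = min(errs)))
}

print(path)
```

A backward elimination can be sketched the same way by starting from all candidates and dropping, at each step, the variable whose removal gives the lowest test error.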
Usage
step_extend(x, ...)

# S3 method for model
step_extend(
  x,
  formula1 = null_formula(x),
  formula2 = formula(x),
  steps = 1L,
  include_full = FALSE,
  include_base = FALSE,
  cv = TRUE,
  ...
)

# S3 method for default
step_extend(x, ...)

step_forward(x, ...)

# S3 method for model
step_forward(
  x,
  formula1 = null_formula(x),
  formula2 = formula(x),
  max_step = 10,
  include_base = TRUE,
  include_full = FALSE,
  nfold = getOption("cv_nfold"),
  folds = NULL,
  verbose = getOption("cv_verbose"),
  ...
)

# S3 method for default
step_forward(x, ...)

step_reduce(x, ...)

# S3 method for model
step_reduce(
  x,
  formula1 = null_formula(x),
  formula2 = formula(x),
  steps = 1L,
  include_full = FALSE,
  include_base = FALSE,
  cv = TRUE,
  ...
)

# S3 method for default
step_reduce(x, ...)

step_backward(x, ...)

# S3 method for model
step_backward(
  x,
  formula1 = null_formula(x),
  formula2 = formula(x),
  max_step = 10,
  include_full = TRUE,
  include_base = FALSE,
  nfold = getOption("cv_nfold"),
  folds = NULL,
  verbose = getOption("cv_verbose"),
  ...
)

# S3 method for default
step_backward(x, ...)

best_subset(x, ...)

# S3 method for model
best_subset(
  x,
  formula1 = null_formula(x),
  formula2 = formula(x),
  nvars = 1:5,
  include_base = any(nvars == 0),
  include_full = FALSE,
  cv = TRUE,
  ...
)

# S3 method for default
best_subset(x, ...)
Arguments
- x
Object of class “model” or a fitted model.
- ...
Dots go to cv() in step_extend() and step_reduce() (provided cv=TRUE), and to tune() in step_forward() and step_backward().
- formula1, formula2
Two nested model formulas defining the range of models to be considered. The larger of the two is taken as the full model, the simpler as the base model. See the “Details” section.
- steps
(step_extend, step_reduce) Integer: Number of variables to add/remove. Default: 1.
- include_full
Logical: Whether to include the full model in the output.
- include_base
Logical: Whether to include the base model in the output.
- cv
(step_extend, step_reduce, best_subset) Logical: Run cv or just return the multimodel?
- max_step
(step_forward, step_backward) Integer: Maximal number of steps.
- nfold, folds
Passed to make_folds.
- verbose
Logical: Output information on execution progress in console?
- nvars
(best_subset) Integer vector defining the number of variables.
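As a rough illustration of how nvars determines the number of candidate models in a best subset search, the base R sketch below enumerates every variable subset of the requested sizes. It only builds the formulas and is not the package's code; the variable names are taken from the iris example, and the use of combn() and reformulate() is an assumption made for this sketch.

```r
# Conceptual sketch in base R; NOT the package's implementation.
response   <- "Sepal.Length"
candidates <- c("Sepal.Width", "Petal.Length", "Petal.Width", "Species")
nvars      <- 2:3   # model sizes to include

# All variable subsets of each requested size
subsets <- unlist(
  lapply(nvars, function(k) combn(candidates, k, simplify = FALSE)),
  recursive = FALSE
)

# One model formula per subset
formulas <- lapply(subsets, reformulate, response = response)

# Number of candidate models: choose(4, 2) + choose(4, 3)
n_models <- length(formulas)
print(n_models)
```

With four candidate variables and nvars = 2:3 this gives 6 + 4 = 10 formulas, which matches the number of rows in the best_subset() performance table in the Examples. The count grows quickly with the number of variables, so a narrow nvars range keeps the search manageable.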
Value
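All of these functions return an object of class “cv”. With cv=FALSE, step_extend(), step_reduce() and best_subset() return the multimodel without cross-validating it.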
Details
formula1 and formula2 must be nested model formulas, i.e. one of the two formulas must include all terms present in the other.
They define the range of models to be considered: The larger of the two defines the full model, the other is taken as the base model.
By default, formula1 and formula2 are used to update the original model formula.
Enclose a formula in I() to replace the model's formula.
This distinction is relevant whenever you specify a formula including a dot.
See the “Details” section and examples in ?update.model.
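The nesting requirement and the role of the dot can be illustrated with base R alone. In the sketch below, is_nested() is a hypothetical helper (not part of the package) that compares the term labels of two formulas, and update() is the standard formula method from the stats package, in which a dot stands for the corresponding part of the old formula. How the package treats formulas wrapped in I() is not shown here.

```r
# Conceptual sketch in base R; is_nested() is a hypothetical helper.
is_nested <- function(small, big, data) {
  # Passing `data` lets terms() expand a dot on the right-hand side
  terms_small <- attr(terms(small, data = data), "term.labels")
  terms_big   <- attr(terms(big,   data = data), "term.labels")
  all(terms_small %in% terms_big)
}

base_f <- Sepal.Length ~ 1   # base model: intercept only
full_f <- Sepal.Length ~ .   # full model: all other columns of iris

nested_ok  <- is_nested(base_f, full_f, iris)
nested_bad <- is_nested(Sepal.Length ~ Petal.Length + Species,
                        Sepal.Length ~ Petal.Length + Petal.Width, iris)

# Updating: the dots keep the old response and old terms, then Species is dropped
updated <- update(Sepal.Length ~ Sepal.Width + Species, . ~ . - Species)
print(updated)
```

Here nested_ok is TRUE because the intercept-only model contributes no terms, while nested_bad is FALSE because Species is missing from the second formula.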
See also
multimodel(), update.model(), null_formula(), cv(), tune()
Examples
mod <- model(lm(Sepal.Length ~ ., iris),
label = "sepLen")
# Add variables to base model
oneVarModels <- step_extend(mod)
cv_performance(oneVarModels)
#> --- Performance table ---
#> Metric: rmse
#>               formula                     train_rmse test_rmse time_cv
#> +Sepal.Width  Sepal.Length ~ Sepal.Width     0.81906   0.82587   0.016
#> +Petal.Length Sepal.Length ~ Petal.Length    0.40397   0.40608   0.009
#> +Petal.Width  Sepal.Length ~ Petal.Width     0.47445   0.47636   0.009
#> +Species      Sepal.Length ~ Species         0.50881   0.51965   0.013
# Forward selection with step_forward()
cv_fwd <- step_forward(mod)
cv_performance(cv_fwd)
#> --- Performance table ---
#> Metric: rmse
#>               formula                                                           train_rmse test_rmse time_cv
#> base          Sepal.Length ~ 1                                                     0.82493   0.82184   0.008
#> +Petal.Length Sepal.Length ~ Petal.Length                                          0.40397   0.39934   0.009
#> +Sepal.Width  Sepal.Length ~ Petal.Length + Sepal.Width                            0.32950   0.32767   0.010
#> +Species      Sepal.Length ~ Petal.Length + Sepal.Width + Species                  0.30461   0.30610   0.016
#> +Petal.Width  Sepal.Length ~ Petal.Length + Sepal.Width + Species + Petal.Width    0.30003   0.30266   0.019
# Remove variables from full model
mod |> step_reduce() |> cv_performance()
#> --- Performance table ---
#> Metric: rmse
#>               formula                                                 train_rmse test_rmse time_cv
#> -Sepal.Width  Sepal.Length ~ Petal.Length + Petal.Width + Species        0.33294   0.34157   0.015
#> -Petal.Length Sepal.Length ~ Sepal.Width + Petal.Width + Species         0.42566   0.43186   0.015
#> -Petal.Width  Sepal.Length ~ Sepal.Width + Petal.Length + Species        0.30451   0.31367   0.015
#> -Species      Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width    0.30977   0.31694   0.011
mod |> step_backward() |> cv_performance()
#> --- Performance table ---
#> Metric: rmse
#>               formula                                                           train_rmse test_rmse time_cv
#> full          Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species    0.29985   0.30990   0.016
#> -Petal.Width  Sepal.Length ~ Sepal.Width + Petal.Length + Species                  0.30441   0.31375   0.015
#> -Species      Sepal.Length ~ Sepal.Width + Petal.Length                            0.32955   0.33093   0.009
#> -Sepal.Width  Sepal.Length ~ Petal.Length                                          0.40403   0.39967   0.009
#> -Petal.Length Sepal.Length ~ 1                                                     0.82512   0.82060   0.007
# best subset
mod |> best_subset(nvars = 2:3) |> cv_performance()
#> --- Performance table ---
#> Metric: rmse
#>                                       formula                                                 train_rmse test_rmse time_cv
#> +Sepal.Width+Petal.Length             Sepal.Length ~ Sepal.Width + Petal.Length                  0.32947   0.33340   0.010
#> +Sepal.Width+Petal.Width              Sepal.Length ~ Sepal.Width + Petal.Width                   0.44595   0.44860   0.010
#> +Sepal.Width+Species                  Sepal.Length ~ Sepal.Width + Species                       0.43110   0.43481   0.015
#> +Petal.Length+Petal.Width             Sepal.Length ~ Petal.Length + Petal.Width                  0.39830   0.40975   0.010
#> +Petal.Length+Species                 Sepal.Length ~ Petal.Length + Species                      0.33272   0.34273   0.014
#> +Petal.Width+Species                  Sepal.Length ~ Petal.Width + Species                       0.47344   0.48899   0.014
#> +Sepal.Width+Petal.Length+Petal.Width Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width    0.30957   0.31965   0.011
#> +Sepal.Width+Petal.Length+Species     Sepal.Length ~ Sepal.Width + Petal.Length + Species        0.30441   0.31458   0.015
#> +Sepal.Width+Petal.Width+Species      Sepal.Length ~ Sepal.Width + Petal.Width + Species         0.42568   0.43661   0.015
#> +Petal.Length+Petal.Width+Species     Sepal.Length ~ Petal.Length + Petal.Width + Species        0.33257   0.34553   0.015