The reduce.existing.formula function performs quality checks on a formula that has already been constructed and automatically removes impractical variables from it. This method uses natural language processing techniques to deconstruct the components of a formula.

reduce.existing.formula(
  the.initial.formula,
  dat,
  max.input.categories = 20,
  max.outcome.categories.to.search = 4,
  force.main.effects = TRUE,
  order.as = "as.specified",
  include.backtick = "as.needed",
  format.as = "formula",
  envir = .GlobalEnv
)

Arguments

the.initial.formula

An object of class "formula" or "character" that states the inputs and output in the form y ~ x1 + x2.

dat

Data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model.

max.input.categories

Limits the maximum number of variables that will be employed in the formula. The default is 20, but users can change it as needed.

max.outcome.categories.to.search

A numeric value. The create.formula function includes a feature that identifies input variables exhibiting a lack of contrast. When reduce = TRUE, these variables are automatically excluded from the resulting formula. This search may be expanded to subsets of the outcome when the number of unique measured values of the outcome is no greater than max.outcome.categories.to.search. In this case, each subset of the outcome will be separately examined, and any inputs that exhibit a lack of contrast within at least one subset will be excluded.
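The search described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper find.lack.of.contrast and the toy data are hypothetical, and "lack of contrast" is taken here to mean fewer than two distinct non-missing values.

```r
# Hypothetical helper, not part of the package: returns the names of inputs
# that show fewer than two distinct non-missing values, either overall or
# within at least one subset of the outcome.
find.lack.of.contrast <- function(dat, outcome.name,
                                  max.outcome.categories.to.search = 4) {
  input.names <- setdiff(names(dat), outcome.name)
  no.contrast <- function(x) length(unique(x[!is.na(x)])) < 2

  # Inputs that are constant across the whole data set.
  flagged <- input.names[vapply(dat[input.names], no.contrast, logical(1))]

  # Expand the search to outcome subsets only when the outcome
  # has few enough unique values.
  n.outcome.values <- length(unique(dat[[outcome.name]]))
  if (n.outcome.values <= max.outcome.categories.to.search) {
    subsets <- split(dat[input.names], dat[[outcome.name]], drop = TRUE)
    for (subset.dat in subsets) {
      constant.here <- vapply(subset.dat, no.contrast, logical(1))
      flagged <- union(flagged, input.names[constant.here])
    }
  }
  flagged
}

toy.dat <- data.frame(
  y  = c("a", "a", "b", "b"),
  x1 = c(1, 2, 3, 4),  # varies within every outcome subset
  x2 = c(5, 5, 6, 7),  # constant when y == "a"
  x3 = c(9, 9, 9, 9)   # constant everywhere
)
flagged.inputs <- find.lack.of.contrast(dat = toy.dat, outcome.name = "y")
```

In this toy data, x3 is flagged by the overall check, while x2 is flagged only because the outcome has few enough categories for the subset search to run.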

force.main.effects

A logical value. When TRUE, the intent is that any term included as an interaction (of multiple variables) must also be listed individually as a main effect.
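As a rough illustration of this rule, the base R sketch below adds any main effect that an interaction term implies but the formula omits. The helper add.missing.main.effects is hypothetical and is not how the package implements force.main.effects; it also assumes variable names that do not themselves contain a colon.

```r
# Hypothetical helper, not the package's implementation: adds a main effect
# for every variable that appears in an interaction term without one.
add.missing.main.effects <- function(the.formula) {
  term.labels <- attr(terms(the.formula), "term.labels")

  # Interaction terms are labelled with ":" between the variable names.
  interaction.terms <- grep(":", term.labels, fixed = TRUE, value = TRUE)
  needed <- unique(unlist(strsplit(interaction.terms, ":", fixed = TRUE)))

  missing.effects <- setdiff(needed, term.labels)
  if (length(missing.effects) == 0) {
    return(the.formula)
  }
  update(the.formula,
         paste(". ~ . +", paste(missing.effects, collapse = " + ")))
}

# x2 and x3 appear only inside the interaction, so both are added.
expanded.formula <- add.missing.main.effects(y ~ x1 + x2:x3)
```

Note that a term written as x2 * x3 already expands to both main effects plus the interaction, so the rule matters mainly for terms written with the colon operator.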

order.as

Controls the order in which the variables appear in the resulting formula. The default, "as.specified", keeps them in the order given.

include.backtick

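Controls whether backticks are added around variable names. With the default, "as.needed", backticks are added only to names that require them to be valid in a formula, such as names containing spaces.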
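The idea behind adding backticks only where needed can be sketched in base R. The helper add.backtick.as.needed is hypothetical and only approximates the behavior described above; it treats a name as needing backticks when make.names would have to alter it.

```r
# Hypothetical helper, not part of the package: wraps a name in backticks
# only when it is not already a syntactically valid R name.
add.backtick.as.needed <- function(variable.names) {
  needs.backtick <- variable.names != make.names(variable.names)
  ifelse(needs.backtick, paste0("`", variable.names, "`"), variable.names)
}

input.names <- add.backtick.as.needed(c("Age", "Age Group", "Income Group"))

# Names with spaces can now be parsed as part of a formula.
quoted.formula <- as.formula(paste("Income ~", paste(input.names, collapse = " + ")))
```

Without the backticks, a name such as Age Group would cause a parse error when the character string is converted to a formula.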

format.as

The data type of the output. If not set as "formula", then a character vector will be returned.

envir

The environment to search. The default is the global environment (.GlobalEnv).

Examples

data('snack.dat')
the.initial.formula <- 'Income ~ .'
reduce.existing.formula(the.initial.formula = the.initial.formula, dat = snack.dat, max.input.categories = 30)$formula
#> Warning: NAs introduced by coercion
#> Income ~ Age + Gender + Region + Persona + Product + Awareness +
#>     BP_For_Me_0_10 + BP_Fits_Budget_0_10 + BP_Tastes_Great_0_10 +
#>     BP_Good_To_Share_0_10 + BP_Like_Logo_0_10 + BP_Special_Occasions_0_10 +
#>     BP_Everyday_Snack_0_10 + BP_Healthy_0_10 + BP_Delicious_0_10 +
#>     BP_Right_Amount_0_10 + BP_Relaxing_0_10 + Consideration +
#>     Consumption + Satisfaction + Advocacy + `Age Group` + `Income Group`
#> <environment: 0x0000000017029080>