The reduce.existing.formula function makes the quality checks and automatic removal of impractical variables available when a formula has already been constructed. This method uses natural language processing techniques to deconstruct the components of the formula.
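To make the idea of deconstructing a formula concrete, here is a minimal sketch that uses only base R tools (`all.vars`, `terms`). It is not the package's internal implementation; the formula `y ~ x1 + x2 + x1:x2` and the object names are assumptions chosen for illustration.

```r
# Illustrative only: base R tools, not the internals of reduce.existing.formula.
the.formula <- as.formula("y ~ x1 + x2 + x1:x2")

# Left-hand side: the outcome variable.
outcome.name <- all.vars(the.formula[[2]])

# Right-hand side: the distinct input variables.
input.names <- all.vars(the.formula[[3]])

# Individual terms, including interactions.
term.labels <- attr(terms(the.formula), "term.labels")

print(outcome.name)
print(input.names)
print(term.labels)
```

Once a formula is split into an outcome, inputs, and terms in this way, each input can be checked against the data before the formula is rebuilt.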
reduce.existing.formula(the.initial.formula, dat, max.input.categories = 20, max.outcome.categories.to.search = 4, force.main.effects = TRUE, order.as = "as.specified", include.backtick = "as.needed", format.as = "formula", envir = .GlobalEnv)
the.initial.formula | An object of class "formula" or "character" that states the inputs and outcome in the form y ~ x1 + x2. |
---|---|
dat | Data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. |
max.input.categories | Limits the maximum number of variables that will be employed in the formula. The default is 20, but users can change it as needed. |
max.outcome.categories.to.search | A numeric value. The create.formula function includes a feature that identifies input variables exhibiting a lack of contrast. When reduce = TRUE, these variables are automatically excluded from the resulting formula. This search may be expanded to subsets of the outcome when the number of unique measured values of the outcome is no greater than max.outcome.categories.to.search. In this case, each subset of the outcome will be separately examined, and any inputs that exhibit a lack of contrast within at least one subset will be excluded. |
force.main.effects | A logical value. When TRUE, the intent is that any term included as an interaction (of multiple variables) must also be listed individually as a main effect. |
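order.as | Controls the order of the terms in the resulting formula, which can be rearranged into ascending or descending order. The default is "as.specified". |
include.backtick | Controls whether backticks are added around variable names so that they form valid terms in the formula. The default is "as.needed". |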
format.as | The data type of the output. If not set as "formula", then a character vector will be returned. |
envir | The environment in which to search. The default is the global environment (.GlobalEnv). |
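The lack-of-contrast check described for max.outcome.categories.to.search can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the package's code: the toy data frame `toy.dat`, its column names, and the helper `lacks.contrast` are hypothetical, and "lack of contrast" is assumed here to mean fewer than two distinct non-missing values.

```r
# Hypothetical toy data; names and values are illustrative only.
toy.dat <- data.frame(
  y  = c(1, 1, 1, 0, 0, 0),
  x1 = c(2.1, 3.4, 1.9, 4.2, 3.3, 2.8),       # varies everywhere
  x2 = rep("same", 6),                         # one value overall
  x3 = c("a", "a", "a", "a", "b", "b"),        # one value when y == 1
  stringsAsFactors = FALSE
)
inputs <- c("x1", "x2", "x3")
max.outcome.categories.to.search <- 4

# Assumed definition: fewer than two distinct non-missing values.
lacks.contrast <- function(x) length(unique(x[!is.na(x)])) < 2

# Check each input over the whole data set.
overall <- vapply(toy.dat[inputs], lacks.contrast, logical(1))

# Check within subsets of the outcome only when the outcome has few unique values.
search.subsets <- length(unique(toy.dat$y)) <= max.outcome.categories.to.search
within.subset <- vapply(inputs, function(v) {
  any(tapply(toy.dat[[v]], toy.dat$y, lacks.contrast))
}, logical(1))

# Exclude inputs that fail either check, then rebuild the formula.
excluded <- inputs[overall | (search.subsets & within.subset)]
kept <- setdiff(inputs, excluded)
reduced.formula <- reformulate(kept, response = "y")

print(excluded)
print(reduced.formula)
```

In this toy case `x2` is constant overall and `x3` is constant within the subset where `y` equals 1, so only `x1` remains in the rebuilt formula.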
data('snack.dat')
the.initial.formula <- 'Income ~ .'
reduce.existing.formula(the.initial.formula = the.initial.formula, dat = snack.dat, max.input.categories = 30)$formula
#> Warning: NAs introduced by coercion
#> Income ~ Age + Gender + Region + Persona + Product + Awareness +
#> BP_For_Me_0_10 + BP_Fits_Budget_0_10 + BP_Tastes_Great_0_10 +
#> BP_Good_To_Share_0_10 + BP_Like_Logo_0_10 + BP_Special_Occasions_0_10 +
#> BP_Everyday_Snack_0_10 + BP_Healthy_0_10 + BP_Delicious_0_10 +
#> BP_Right_Amount_0_10 + BP_Relaxing_0_10 + Consideration +
#> Consumption + Satisfaction + Advocacy + `Age Group` + `Income Group`
#> <environment: 0x0000000017029080>
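The intent behind force.main.effects and include.backtick can be illustrated with a small base R sketch. It does not call the package and may differ from its actual behavior: the term labels and the helper `add.backtick` are hypothetical, and names are assumed to need backticks whenever `make.names` would alter them, as with `Age Group` in the output above.

```r
# Illustrative only: base R, not the internals of reduce.existing.formula.

# Main effects: every variable used in an interaction should also appear alone.
term.labels <- c("x1", "x1:x2")
interaction.terms <- term.labels[grepl(":", term.labels, fixed = TRUE)]
interaction.vars <- unique(unlist(strsplit(interaction.terms, ":", fixed = TRUE)))
missing.main <- setdiff(interaction.vars, term.labels)
all.terms <- c(term.labels, missing.main)
expanded.formula <- reformulate(all.terms, response = "y")

# Backticks: wrap only the names that are not syntactically valid in R.
add.backtick <- function(x) {
  ifelse(x == make.names(x), x, paste0("`", x, "`"))
}
quoted.names <- add.backtick(c("Age", "Age Group"))

print(expanded.formula)
print(quoted.names)
```

Here `x2` appears only inside the interaction `x1:x2`, so it is appended as a main effect, and only the name containing a space is wrapped in backticks.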