Theory, Examples & Exercises
Section: Ordination analysis
Variable selection is a procedure for choosing a subset of explanatory variables from all variables available for constrained ordination (RDA, CCA, db-RDA). The goal is to reduce the number of explanatory variables entering the analysis while keeping the variation they explain as high as possible. Variable selection is suitable mostly for observational studies, where many (often highly intercorrelated) environmental variables are recorded and reducing their number simplifies the story; it is usually not useful for experimental studies with a balanced design of treatment application.
The standard method is forward selection, which adds explanatory variables one by one. Backward selection, in contrast, starts from the full model (with all variables) and removes, one at a time, the variable whose removal decreases the total explained variation the least. A combination of both approaches is stepwise (forward-backward) selection, in which the analysis checks at every step whether any of the already included variables can be removed to improve the model.
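To make the "one by one" idea concrete, here is a minimal sketch of the greedy core of forward selection for an RDA-like linear model, written in plain Python without any packages. It is not the vegan implementation and it runs no permutation tests; it only reports the order in which variables would be added, together with the cumulative R2 and the adjusted R2 (Ezekiel's formula) after each addition. The function names `forward_path` and `adjusted_r2`, the variable names `x1` and `x2`, and the toy data are all hypothetical and serve only as illustration.

```python
# Sketch only: greedy forward selection on a linear (RDA-like) model.
# Assumption: R2 is the share of the total sum of squares of the centred
# response matrix explained by the selected explanatory variables.

def centre(column):
    mean = sum(column) / len(column)
    return [v - mean for v in column]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def adjusted_r2(r2, n, m):
    # Ezekiel's adjustment for n samples and m explanatory variables.
    if n - m - 1 <= 0:
        return float("nan")  # not defined when the model is saturated
    return 1 - (1 - r2) * (n - 1) / (n - m - 1)

def forward_path(X, Y, tol=1e-12):
    """X, Y: dicts mapping a name to a list with one value per sample.
    Returns [(variable, cumulative R2, adjusted R2), ...] in selection order."""
    n = len(next(iter(X.values())))
    Yc = [centre(col) for col in Y.values()]
    total_ss = sum(dot(y, y) for y in Yc)
    # Residual of each candidate after removing what selected variables explain.
    remaining = {name: centre(col) for name, col in X.items()}
    path, r2 = [], 0.0
    while remaining:
        gains = {}
        for name, r in remaining.items():
            rr = dot(r, r)
            if rr > tol:  # skip candidates collinear with those already selected
                gains[name] = sum(dot(y, r) ** 2 for y in Yc) / rr / total_ss
        if not gains:
            break
        best = max(gains, key=gains.get)  # largest increase in explained variation
        r2 += gains[best]
        r_best = remaining.pop(best)
        norm = dot(r_best, r_best) ** 0.5
        q = [v / norm for v in r_best]
        # Orthogonalise the remaining candidates against the newly added variable.
        remaining = {
            name: [v - dot(r, q) * w for v, w in zip(r, q)]
            for name, r in remaining.items()
        }
        path.append((best, r2, adjusted_r2(r2, n, len(path) + 1)))
    return path

# Toy data (hypothetical): two explanatory variables, two response columns.
X = {"x2": [1, -1, 0, -1, 1], "x1": [1, 2, 3, 4, 5]}
Y = {"sp1": [2, 4, 6, 8, 10], "sp2": [2, -3, -4, -1, 6]}
path = forward_path(X, Y)
for name, r2, r2_adj in path:
    print(name, round(r2, 3), round(r2_adj, 3))
```

The sketch deliberately returns the whole selection path rather than a final subset, so that any stopping rule (a significance test or an adjusted R2 threshold) can be applied to it afterwards.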
The simplified sequence of steps in the case of forward selection is the following:
The significance of the variables is one possible stopping rule: once the best candidate variable is not significant, the selection stops. An alternative stopping rule is reaching the adjusted R2 of the global model (Blanchet et al. 2008): first, calculate the adjusted variation explained by all explanatory variables (the global model); if, during forward selection, the adjusted variation explained by the selected variables reaches the R2adj of the global model (within a given precision threshold), the selection stops (available in function
library (vegan) and