Theory, Examples & Exercises
Redundancy analysis (RDA) and transformation-based RDA (tb-RDA) are linear constrained ordination methods implicitly based on Euclidean (RDA) or Hellinger, chord or other (tb-RDA) distances. The calculation (detailed below) can be described as a set of multiple linear regressions, in which the abundances of each species in the species composition matrix are regressed separately against one or several environmental variables. As a result, the variation in species composition is decomposed into variation related to the environmental variables (represented by constrained, or canonical, axes) and variation not related to them (unconstrained axes). The number of constrained axes is equal to or lower than the number of quantitative explanatory variables; a qualitative (categorical) variable contributes as many constrained axes as it has categories minus one. Each canonical axis is a linear combination of all explanatory variables.
The algorithm of RDA can be summarised as follows (Fig. 1 and Fig. 2). Two matrices need to be available: the matrix of species composition (samples x species) and the matrix of environmental variables (samples x env. variables; for simplicity, the illustration below contains only one env. variable).
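The regression step can be sketched in a few lines of plain Python. This is only a minimal illustration under simplifying assumptions, not the implementation used by any ordination program: it handles a single quantitative environmental variable, and the data and the function name `rda_one_variable` are made up for the example. With one explanatory variable the matrix of fitted values has rank one, so the single constrained eigenvalue equals the summed variance of the fitted values.

```python
from statistics import fmean


def center(values):
    """Subtract the mean from every value."""
    m = fmean(values)
    return [v - m for v in values]


def rda_one_variable(species, env):
    """Decompose species variance with one environmental variable.

    species: list of samples, each a list of species abundances
    env:     one environmental value per sample
    Returns (constrained, unconstrained, total) variance.
    """
    n = len(species)
    x = center(env)
    sxx = sum(v * v for v in x)
    constrained = unconstrained = total = 0.0
    # One simple linear regression per species (column of the matrix).
    for column in zip(*species):
        y = center(column)
        slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
        fitted = [slope * xi for xi in x]
        residual = [yi - fi for yi, fi in zip(y, fitted)]
        constrained += sum(f * f for f in fitted) / (n - 1)
        unconstrained += sum(r * r for r in residual) / (n - 1)
        total += sum(yi * yi for yi in y) / (n - 1)
    return constrained, unconstrained, total


# Hypothetical data: 5 samples x 3 species, one environmental gradient.
species = [[1, 0, 3], [2, 1, 2], [4, 1, 1], [5, 3, 0], [7, 2, 1]]
env = [1.0, 2.0, 3.0, 4.0, 5.0]

constrained, unconstrained, total = rda_one_variable(species, env)
print("explained proportion:", constrained / total)
```

With several explanatory variables the per-species regressions become multiple regressions, and the constrained and unconstrained axes are obtained by running PCA on the matrices of fitted values and residuals, respectively.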
In unconstrained ordination, we are mostly interested in the configuration of samples and species in the ordination diagram, the relative importance of individual ordination axes (measured by their eigenvalues) and the ecological interpretation of these axes. In constrained ordination, we are more interested in the effect of environmental variables on species composition: how much variation these variables explain (see the section Explained variation), whether this variation is significant (Monte Carlo permutation test), which of the available environmental variables are important for explaining variation in the studied community (Forward selection), and how to partition the variation explained by different variables or different sets of variables (Variation partitioning).
Canonical correspondence analysis (CCA) is a unimodal constrained ordination method related to correspondence analysis (CA), with an algorithm derived from redundancy analysis (RDA). The RDA algorithm is modified in two ways: instead of the raw species composition data, the set of regressions is done on the chi-square transformed matrix (the matrix of contributions to the chi-square statistic, the same one that CA is calculated from), and weighted multiple regression is used instead of ordinary multiple regression, with row sums (i.e. the sums of species abundances in individual samples) as weights. The requirement for input data is the same as for correspondence analysis: the data must be non-negative integers or presences-absences.
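The two modifications can be illustrated with a small plain-Python sketch. It is a simplified illustration, not the code of any particular program: it assumes a single quantitative environmental variable, no empty rows or columns, and hypothetical data; the function name `cca_one_variable` is invented for the example. The chi-square transformed matrix is computed as (p_ij - p_i+ p_+j) / sqrt(p_i+ p_+j), where p are abundances divided by the grand total, and the weighted regression is carried out by centring the environmental variable on its weighted mean and multiplying it by the square root of the row weights.

```python
from math import sqrt


def cca_one_variable(abund, env):
    """Return (constrained inertia, total inertia) for one env. variable."""
    grand = sum(sum(row) for row in abund)
    p = [[v / grand for v in row] for row in abund]
    row_w = [sum(row) for row in p]          # row weights p_i+
    col_w = [sum(col) for col in zip(*p)]    # column weights p_+j

    # Chi-square transformed matrix (contributions to chi-square).
    qbar = [
        [(p[i][j] - row_w[i] * col_w[j]) / sqrt(row_w[i] * col_w[j])
         for j in range(len(col_w))]
        for i in range(len(row_w))
    ]
    total_inertia = sum(q * q for row in qbar for q in row)

    # Weighted regression: centre env by its weighted mean (weights sum
    # to 1), then scale each value by the square root of its row weight.
    wmean = sum(w * x for w, x in zip(row_w, env))
    xw = [sqrt(w) * (x - wmean) for w, x in zip(row_w, env)]
    sxx = sum(v * v for v in xw)

    # Sum of squares of the fitted values, accumulated species by species.
    constrained = 0.0
    for column in zip(*qbar):
        sxy = sum(a * b for a, b in zip(xw, column))
        constrained += sxy * sxy / sxx
    return constrained, total_inertia


# Hypothetical data: 5 samples x 3 species, one environmental gradient.
abund = [[10, 2, 0], [8, 4, 1], [5, 5, 3], [2, 6, 7], [0, 3, 9]]
env = [1.0, 2.0, 3.0, 4.0, 5.0]

cca_constrained, cca_total = cca_one_variable(abund, env)
print("explained proportion of inertia:", cca_constrained / cca_total)
```

As in the RDA case, one explanatory variable yields a single constrained axis, so the constrained inertia computed here corresponds to the eigenvalue of that axis.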
Note that CCA calculates two sets of sample scores: LC scores and WA scores. LC scores are linear combinations of the columns in the environmental matrix, while WA scores are weighted averages of the species scores. Default plotting of ordination diagrams differs between programs; e.g. in R (package vegan), the samples in CCA ordination plots are plotted using WA scores, while in CANOCO 5 they are plotted using LC scores. Each scoring method has its proponents and opponents. Some (e.g. ter Braak, one of the two CANOCO 5 authors) argue that LC scores are more meaningful, since they are not influenced by species composition; others (e.g. McCune, author of PC-ORD) argue that WA scores are better, because they are more robust to noise in the species composition data. The difference between the two is rather obvious in the ordination diagram when the explanatory (environmental) variables are factors with several levels, or quantitative variables with evenly spaced values (Fig. 3). Remember to report which scores you have chosen to display, whether LC or WA.
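The arithmetic behind the two kinds of scores can be sketched for a single axis as follows. This is an illustration only: the species scores are invented numbers (in a real analysis they come out of the CCA itself), the environmental variable is a hypothetical two-level factor coded 0/1, and the LC scores are computed here as fitted values of a weighted regression of the WA scores on that variable, with row sums as weights.

```python
def wa_scores(abund, species_scores):
    """Weighted averages of species scores, one value per sample."""
    return [
        sum(y * u for y, u in zip(row, species_scores)) / sum(row)
        for row in abund
    ]


def lc_scores(wa, env, weights):
    """Fitted values of a weighted simple regression of wa on env."""
    wsum = sum(weights)
    mx = sum(w * x for w, x in zip(weights, env)) / wsum
    my = sum(w * y for w, y in zip(weights, wa)) / wsum
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, env, wa))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, env))
    slope = sxy / sxx
    return [my + slope * (x - mx) for x in env]


# Hypothetical data: 6 samples x 4 species, samples in two groups.
abund = [
    [5, 3, 1, 0], [4, 4, 0, 1], [6, 1, 2, 0],
    [0, 2, 4, 3], [1, 0, 3, 5], [0, 1, 5, 2],
]
group = [0, 0, 0, 1, 1, 1]                  # two-level factor coded 0/1
species_scores = [-1.2, -0.4, 0.5, 1.3]     # made-up scores on one axis

wa = wa_scores(abund, species_scores)
lc = lc_scores(wa, group, [sum(row) for row in abund])

print("WA:", [round(v, 3) for v in wa])
print("LC:", [round(v, 3) for v in lc])
```

Because the only explanatory variable is a factor with two levels, the LC scores can take only two distinct values (all samples of one group share the same score), while the WA scores still differ from sample to sample.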
Distance-based RDA (db-RDA) is RDA applied to the matrix of sample scores calculated by principal coordinate analysis (PCoA). The raw species data are first converted into a dissimilarity matrix using a selected dissimilarity measure, and this matrix is submitted to PCoA. The matrix of site scores on all PCoA ordination axes is then used in RDA, together with the explanatory variables, in place of the raw species data. The benefit of db-RDA is that any distance measure can be applied to the data (i.e. not only Euclidean as in RDA, Hellinger or a few others as in tb-RDA, or chi-square as in CCA). Care must be taken to avoid negative eigenvalues in PCoA, because the corresponding axes would be omitted from the analysis. The solution is to use only metric (Euclidean) dissimilarities, to apply a transformation that turns a non-metric dissimilarity into a metric one (e.g. the square root transformation applied to Bray-Curtis dissimilarity), or to use one of the available corrections. Since species information is lost during the calculation of the dissimilarity matrix, species scores can be added to the final ordination diagram as weighted averages of the scores of the sites in which the species occur.
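The problem with negative eigenvalues, and the effect of the square root transformation, can be demonstrated on a deliberately tiny example. The sketch below is an illustration under stated assumptions: the three samples are hypothetical, and the helper `pcoa_eigenvalues_3` (an invented name) works only for exactly three samples, where the two non-zero eigenvalues of the Gower-centred matrix can be obtained from a quadratic equation without any linear algebra library.

```python
from math import sqrt


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def pcoa_eigenvalues_3(d):
    """Two non-zero PCoA eigenvalues of a 3 x 3 dissimilarity matrix."""
    n = 3
    # PCoA works on -0.5 * squared dissimilarities, double-centred.
    a = [[-0.5 * d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]
    grand = sum(row) / n
    g = [[a[i][j] - row[i] - row[j] + grand for j in range(n)]
         for i in range(n)]
    # The centred matrix always has one zero eigenvalue; the other two
    # solve lambda^2 - trace * lambda + minors = 0.
    trace = g[0][0] + g[1][1] + g[2][2]
    minors = (g[0][0] * g[1][1] - g[0][1] ** 2
              + g[0][0] * g[2][2] - g[0][2] ** 2
              + g[1][1] * g[2][2] - g[1][2] ** 2)
    root = sqrt(max(trace ** 2 - 4 * minors, 0.0))
    return (trace + root) / 2, (trace - root) / 2


# Hypothetical samples: two with no shared species, one containing both.
samples = [[1, 0], [0, 1], [1, 1]]

d_raw = [[bray_curtis(a, b) for b in samples] for a in samples]
d_sqrt = [[sqrt(v) for v in row] for row in d_raw]

eig_raw = pcoa_eigenvalues_3(d_raw)
eig_sqrt = pcoa_eigenvalues_3(d_sqrt)
print("Bray-Curtis:", eig_raw)
print("sqrt(Bray-Curtis):", eig_sqrt)
```

In this example the raw Bray-Curtis dissimilarities violate the triangle inequality (1/3 + 1/3 < 1), which shows up as a negative second eigenvalue; after the square root transformation both eigenvalues are positive.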