Theory, Examples & Exercises
This simple example shows correspondence analysis of the Danube meadow dataset, collected by Heinz Ellenberg and used in a number of methodological studies.
First, import the data, load the vegan library, and calculate CA using the function cca from vegan:
danube.spe <- read.delim ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/danube.spe.txt', row.names = 1)
library (vegan)
CA <- cca (danube.spe)
CA
Call: cca(X = danube.spe)

              Inertia Rank
Total           3.043
Unconstrained   3.043   24
Inertia is mean squared contingency coefficient

Eigenvalues for unconstrained axes:
   CA1    CA2    CA3    CA4    CA5    CA6    CA7    CA8
0.5718 0.4944 0.2950 0.2472 0.2057 0.1764 0.1528 0.1418
(Showed only 8 of all 24 unconstrained eigenvalues)
The total inertia (heterogeneity of the dataset) is 3.043, and the first axis captures 18.8% of the total variation in species composition (0.5718 / 3.043 = 0.1879, where 0.5718 is the eigenvalue of the first axis CA1, and 3.043 is the total inertia).
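The same arithmetic can be repeated for all printed axes. The following is a minimal sketch in base R that uses only the eigenvalues and total inertia copied from the output above; the variable names (eig, total_inertia, pct, cum_pct) are illustrative and not part of vegan.

```r
# Eigenvalues of the first eight CA axes and the total inertia,
# copied by hand from the printed cca output (not extracted from the object)
eig <- c(CA1 = 0.5718, CA2 = 0.4944, CA3 = 0.2950, CA4 = 0.2472,
         CA5 = 0.2057, CA6 = 0.1764, CA7 = 0.1528, CA8 = 0.1418)
total_inertia <- 3.043

# Percentage of total variation captured by each axis
pct <- 100 * eig / total_inertia

# Cumulative percentage across the first eight axes
cum_pct <- cumsum(pct)

print(round(pct, 1))
print(round(cum_pct, 1))
```

Only 8 of the 24 eigenvalues are printed, so the cumulative percentage stays below 100.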
The ordination diagram reveals the pattern of samples and species in the ordination space.
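A minimal sketch for drawing such a diagram, assuming vegan's ordiplot function with text labels (the plotting call used for the original figure is not shown in this text, so the settings here are a guess, and the object name ca_danube is illustrative):

```r
library(vegan)

# Danube meadow data, read from the same URL as in the example above
danube.spe <- read.delim(
  "https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/danube.spe.txt",
  row.names = 1
)

# Unconstrained correspondence analysis
ca_danube <- cca(danube.spe)

# Plot samples and species on the first two axes;
# type = "text" labels points so that individual samples can be identified
ordiplot(ca_danube, type = "text")
```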
It is evident that sample 19 is quite different from the rest of the data, and correspondence analysis greatly exaggerates this difference. This is where detrending of the ordination axes comes in as an option (see further; for DCA on the Danube meadow dataset, see Exercise 3 below).
To decide whether compositional data are homogeneous or heterogeneous (and thus better suited to linear or unimodal ordination methods, respectively), we may first calculate detrended correspondence analysis (DCA) and check the length of the first ordination axis (in S.D. units).
vltava.spe <- read.delim ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/vltava-spe.txt', row.names = 1)
DCA <- decorana (log1p (vltava.spe))
DCA
Call: decorana(veg = log1p(vltava.spe))

Detrended correspondence analysis with 26 segments.
Rescaling of axes with 4 iterations.

                  DCA1   DCA2   DCA3   DCA4
Eigenvalues     0.5338 0.3956 0.2488 0.2457
Decorana values 0.5533 0.3677 0.2410 0.1961
Axis lengths    4.5446 3.5426 2.9208 2.9726
The length of the first axis is 4.5 S.D. units, which means that (according to Lepš & Šmilauer 2003) unimodal ordination methods are preferable.
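The decision itself can be written down as a small helper. This is only a sketch: the function suggest_method is hypothetical, and the cutoffs of 3 and 4 S.D. units are the commonly quoted rule of thumb attributed to Lepš & Šmilauer (2003); they are an assumption here, since the text above does not state them.

```r
# Hypothetical helper: suggests an ordination family from the length of the
# first DCA axis (in S.D. units). The default cutoffs (3 and 4) are an assumed
# rule of thumb, not values taken from the text above.
suggest_method <- function(axis_length, lower = 3, upper = 4) {
  stopifnot(is.numeric(axis_length), length(axis_length) == 1,
            lower < upper)
  if (axis_length > upper) {
    "unimodal"   # heterogeneous data, long gradient
  } else if (axis_length < lower) {
    "linear"     # homogeneous data, short gradient
  } else {
    "either"     # intermediate length, both families may be reasonable
  }
}

# Length of DCA1 copied from the printed decorana output
suggest_method(4.5446)
```

For the vltava dataset, the first axis is longer than the upper cutoff, so the helper agrees with the conclusion above.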
I won't draw the ordination diagram of the result here, since this example focuses only on deciding whether the vltava dataset is homogeneous or heterogeneous. However, note that the section Ordination diagrams is devoted to drawing ordination diagrams using this dataset.
You may be surprised that printing the decorana results does not show any total inertia, although other software (e.g. CANOCO) reports it, together with the percentage of variance explained by particular axes. The reason is that DCA does not support the concept of total inertia (also, it produces only four axes, i.e. four eigenvalues). However, you can get the total inertia by applying correspondence analysis (CA) to your data:
cca (log1p (vltava.spe))
Call: cca(X = log1p(vltava.spe))

              Inertia Rank
Total           7.372
Unconstrained   7.372   96
Inertia is mean squared contingency coefficient

Eigenvalues for unconstrained axes:
   CA1    CA2    CA3    CA4    CA5    CA6    CA7    CA8
0.5533 0.4594 0.4131 0.3083 0.2951 0.2576 0.2147 0.2032
(Showed only 8 of all 96 unconstrained eigenvalues)
As you can see, the total inertia is 7.372, and if needed, the variation captured by particular axes can be calculated as eigenvalue/total inertia (e.g., for the first axis, 0.553/7.372*100 = 7.50%).
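Instead of retyping the printed numbers, the percentages can also be taken directly from the fitted object. A minimal sketch, assuming vegan's eigenvals accessor and the tot.chi component of the cca result (the object name ca_vltava is illustrative):

```r
library(vegan)

# Vltava data, read from the same URL as in the example above
vltava.spe <- read.delim(
  "https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/vltava-spe.txt",
  row.names = 1
)

# Correspondence analysis on log-transformed data, as above
ca_vltava <- cca(log1p(vltava.spe))

# All eigenvalues and the total inertia stored in the result
eig <- eigenvals(ca_vltava)
total_inertia <- ca_vltava$tot.chi

# Percentage of total inertia captured by each axis
pct <- 100 * as.numeric(eig) / total_inertia
names(pct) <- names(eig)

# Show the first four axes
round(head(pct, 4), 2)
```

Because all eigenvalues are used here (not only the eight printed ones), the percentages sum to 100.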