This example uses simulated ecological data structured by two ecological gradients of unequal length. You can imagine these gradients as, for example, elevation (the longer one) and moisture (the shorter one), and the samples as forest communities sampled evenly from low to high elevation and, at each elevation, from wet to dry habitats (wet habitats being those close to the river, dry habitats those far from the river on a ridge). One dataset (simul.short) is relatively homogeneous, with both gradients rather short (low species turnover), while the other (simul.long) is relatively heterogeneous, with long gradients. You can imagine the homogeneous dataset as sampling done within a relatively narrow range of elevation and moisture, e.g. from 200 to 400 m in elevation and from fairly close to the river to slightly farther from it; the forest communities will differ, but not by much, and individual samples will still share many species. The heterogeneous dataset, in contrast, can be imagined as spanning a broad range of elevations (from lowlands to high altitudes) and a broad range of moisture (periodically flooded forest beside the river vs forest on the driest ridges far from the river).
Each dataset contains 70 samples placed evenly along both virtual ecological gradients (the distance between neighbouring samples is 200 units in the simul.short dataset and 1000 units in the simul.long dataset). Hence, the samples form a regular grid in ecological space (note the difference in gradient lengths between the left and the right figure):
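The sampling layout described above can be sketched in a few lines of base R. This is only an illustration: the 10 x 7 split of the 70 samples and the object names (grad1, grad2, samples) are assumptions, not taken from the original simulation.

```r
# Hypothetical layout: 10 positions along the longer gradient and 7 along the
# shorter one (10 x 7 = 70 samples). The 10 x 7 split is an assumption.
step  <- 200                                  # spacing used for simul.short
grad1 <- seq(0, by = step, length.out = 10)   # longer gradient (e.g. elevation)
grad2 <- seq(0, by = step, length.out = 7)    # shorter gradient (e.g. moisture)

# All combinations of positions: one row per sample, grad1 varies fastest
samples <- expand.grid(grad1 = grad1, grad2 = grad2)

# Plot the samples as a regular grid in ecological space
plot(samples$grad1, samples$grad2, asp = 1, pch = 16,
     xlab = "Gradient 1", ylab = "Gradient 2")
```

Setting step to 1000 gives the corresponding layout for the heterogeneous dataset.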
I used both matrices to calculate PCA, CA, DCA, tb-PCA (with Hellinger transformation), and PCoA and NMDS, both with Bray-Curtis distances among samples (for PCoA, I square-rooted the Bray-Curtis distances to make them metric, see details here). The data are presences-absences, so there was no need to transform them. Then I drew ordination diagrams with individual samples connected to form a grid. This visualization of simulated data in ordination space, which looks like crumpled grid paper, was inspired by the studies of Bruce McCune (McCune 1994 and 1997).
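A minimal sketch of these analyses with the vegan package follows. It does not use the original simul.short or simul.long data; instead it builds a small stand-in presence-absence matrix (spe) in which each species occurs within a fixed distance of its optimum. The grid dimensions, the tolerance value and the helper draw_grid are all assumptions made for illustration.

```r
library(vegan)

# Stand-in presence-absence data (hypothetical, NOT the simul.* datasets):
# a species is present in every sample within `tol` units of its optimum.
xy  <- expand.grid(g1 = seq(0, 1800, by = 200), g2 = seq(0, 1200, by = 200))
opt <- expand.grid(g1 = seq(-400, 2200, by = 200), g2 = seq(-400, 1600, by = 200))
tol <- 700
n   <- nrow(xy)
d   <- as.matrix(dist(rbind(xy, opt)))[seq_len(n), -seq_len(n)]
spe <- (d <= tol) * 1                  # samples in rows, species in columns

# Ordinations named in the text
pca   <- rda(spe)                                   # PCA
ca    <- cca(spe)                                   # CA
dca   <- decorana(spe)                              # DCA
tbpca <- rda(decostand(spe, method = "hellinger"))  # tb-PCA
bc    <- vegdist(spe, method = "bray")              # Bray-Curtis distances
pcoa  <- cmdscale(sqrt(bc), k = 2, eig = TRUE)      # PCoA on sqrt(Bray-Curtis)
nmds  <- metaMDS(bc, k = 2, trace = FALSE)          # NMDS on Bray-Curtis

# Plot sample scores and connect neighbours so that the samples form a grid.
# Samples are ordered with g1 varying fastest, so each column of `idx`
# is one line along g1 and each row is one line along g2.
draw_grid <- function(sc, n1 = 10, n2 = 7, main = "") {
  plot(sc, asp = 1, pch = 16, main = main, xlab = "Axis 1", ylab = "Axis 2")
  idx <- matrix(seq_len(n1 * n2), nrow = n1)
  for (j in seq_len(n2)) lines(sc[idx[, j], ])
  for (i in seq_len(n1)) lines(sc[idx[i, ], ])
}

op <- par(mfrow = c(1, 3))
draw_grid(scores(pca, display = "sites", choices = 1:2), main = "PCA")
draw_grid(pcoa$points, main = "PCoA")
draw_grid(scores(nmds, display = "sites"), main = "NMDS")
par(op)
```

The same draw_grid call can be applied to the site scores of the other ordinations; how strongly the grid is distorted will depend on the stand-in data, so the plots should not be expected to reproduce the figures on this page.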
For the dataset simul.short.spe, the result looks like this:
and for the dataset simul.long.spe like this (see footnote 1):
This example uses simulated ecological data structured by a single ecological gradient (you can imagine it as forest sampled along elevation). The question here is: as the length of the gradient increases from short (e.g. sampling forest at low elevation only) to long (sampling forest from low to high elevation), what will the ordination of community samples look like when done by PCA, tb-PCA, CA, DCA, or NMDS?
A set of simulated datasets was created with increasing length of the simulated gradient. The position of each community sample along the simulated gradient is visualized by colour (using the rainbow palette), with red samples at the beginning and violet samples at the end. The relative length of the given simulated gradient is visualized on the hypothetical schema of the species response curves. The tb-PCA method is based on Hellinger-transformed species composition data.
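The idea of sampling an ever longer part of one gradient can be sketched in base R as follows. This is an assumed setup, not the simulation used for the figures: the Gaussian response curves, the number of species, the tolerance and the function simulate_gradient are all hypothetical.

```r
# Hypothetical one-gradient simulation with Gaussian species response curves.
# All parameter values below are assumptions chosen for illustration.
set.seed(1)
n_spe <- 100
opt   <- runif(n_spe, 0, 5000)   # species optima along the gradient
tol   <- 300                     # common tolerance (width of each curve)

# Sample `n_samples` communities evenly from position 0 to `len`
simulate_gradient <- function(len, n_samples = 50) {
  pos   <- seq(0, len, length.out = n_samples)
  abund <- outer(pos, opt, function(x, u) exp(-(x - u)^2 / (2 * tol^2)))
  list(pos = pos, abund = abund)
}
short <- simulate_gradient(500)    # short gradient
long  <- simulate_gradient(5000)   # long gradient

# Colour samples by position: red at the start, close to violet at the end
cols <- rainbow(50, end = 0.8)

# PCA (here via prcomp) of the short and the long gradient
pca_short <- prcomp(short$abund)
pca_long  <- prcomp(long$abund)
op <- par(mfrow = c(1, 2))
plot(pca_short$x[, 1:2], col = cols, pch = 16, main = "Short gradient")
plot(pca_long$x[, 1:2],  col = cols, pch = 16, main = "Long gradient")
par(op)
```

With a setup like this, one would expect the samples from the long gradient to bend into an arch or horseshoe in the PCA diagram, while the short gradient stays closer to a line.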
The figures below show the pattern of ordination diagrams for individual ordination methods (PCA, left, and Hellinger-transformed tb-PCA, right) for short, short-medium, medium-long and long gradients.
The videos below show what happens to the shape of the ordination diagram as the length of the gradient increases.
1) Note on the metaMDS function: its output may differ between versions of vegan. Since vegan 2.0-0, metaMDS uses monoMDS as its default engine function, which gives slightly different results than the original engine function, isoMDS from the MASS package. Previously, the NMDS diagram of simulated data along the long gradient looked somewhat like a ball. To see this, you may add the argument engine = "isoMDS" to the metaMDS call.
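As a small illustration of the engine argument, the sketch below runs metaMDS with both engines. It uses the dune dataset shipped with vegan purely as a stand-in, since the simulated datasets are not reproduced here; the object names are assumptions.

```r
library(vegan)

data(dune)     # example data from vegan, used only as a stand-in
set.seed(42)   # NMDS uses random starts, so fix the seed for repeatability

# Default engine (monoMDS since vegan 2.0-0)
nmds_mono <- metaMDS(dune, distance = "bray", trace = FALSE)

# Original engine: isoMDS from the MASS package
nmds_iso <- metaMDS(dune, distance = "bray", engine = "isoMDS", trace = FALSE)

# Compare the two configurations side by side
op <- par(mfrow = c(1, 2))
plot(scores(nmds_mono, display = "sites"), asp = 1, pch = 16, main = "monoMDS")
plot(scores(nmds_iso,  display = "sites"), asp = 1, pch = 16, main = "isoMDS")
par(op)
```

On a small real dataset like this the two engines may give very similar diagrams; the ball-like shape mentioned above concerns the simulated long-gradient data.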