# Analysis of community ecology data in R

David Zelený




# Ordination analysis

## PCoA & NMDS (distance-based unconstrained ordination)

### Theory

#### Principal Coordinates Analysis (PCoA)

This method is also known as metric multidimensional scaling (MDS). While PCA preserves Euclidean distances among samples and CA preserves chi-square distances, PCoA provides a Euclidean representation of a set of objects whose relationships are measured by any dissimilarity index. Like PCA and CA, PCoA returns a set of orthogonal axes whose importance is measured by eigenvalues. In particular, PCoA calculated on Euclidean distances among samples yields the same results as PCA calculated on the covariance matrix of the same dataset (if scaling 1 is used), and PCoA on chi-square distances yields results similar to CA (but not identical, because CA applies weights in the calculation).

If a non-metric (non-Euclidean) dissimilarity index is used, PCoA may produce axes with negative eigenvalues, which cannot be plotted. The solution is either to convert the non-metric dissimilarity index to a metric one (e.g. Bray-Curtis dissimilarity is non-metric, but becomes metric after square-root transformation) or to apply a specific correction (Lingoes or Cailliez).

Since the PCoA algorithm is based on the matrix of dissimilarities between samples, species scores are not calculated; however, species can be projected onto the ordination diagram as weighted means or correlations, in the same way as supplementary environmental variables.
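The two claims above (PCoA on Euclidean distances matches PCA, and square-root transformation of Bray-Curtis removes negative eigenvalues) can be checked with a minimal sketch in base R. The community matrix `comm` is simulated and purely hypothetical, and Bray-Curtis is computed by hand only to keep the example self-contained; with real data one would more typically use `vegdist()` from the vegan package. Whether the untransformed Bray-Curtis matrix actually produces negative eigenvalues depends on the data, so the code only counts them.

```r
set.seed(1)
# Hypothetical toy community matrix: 12 samples x 8 species (simulated counts)
comm <- matrix(rpois(96, lambda = 5), nrow = 12,
               dimnames = list(paste0("site", 1:12), paste0("sp", 1:8)))
n <- nrow(comm)

# 1) PCoA on Euclidean distances vs. PCA on the covariance matrix
pcoa_euc <- cmdscale(dist(comm), k = 2, eig = TRUE)
pca <- prcomp(comm, center = TRUE, scale. = FALSE)
# Sample scores agree up to the (arbitrary) sign of each axis
max_diff <- max(abs(abs(pcoa_euc$points) - abs(pca$x[, 1:2])))
# PCoA eigenvalues equal (n - 1) times the PCA variances
eig_match <- all.equal(pcoa_euc$eig[1:2], (n - 1) * pca$sdev[1:2]^2)

# 2) Bray-Curtis dissimilarity: sum|x_i - x_j| / (sum x_i + sum x_j)
bray <- as.dist(as.matrix(dist(comm, method = "manhattan")) /
                outer(rowSums(comm), rowSums(comm), "+"))

# Eigenvalues of PCoA on raw and on square-root transformed Bray-Curtis
eig_raw  <- cmdscale(bray, k = 2, eig = TRUE)$eig
eig_sqrt <- cmdscale(sqrt(bray), k = 2, eig = TRUE)$eig

# Alternative: Cailliez correction (add = TRUE in stats::cmdscale)
pcoa_cai <- cmdscale(bray, k = 2, eig = TRUE, add = TRUE)

# Count clearly negative eigenvalues (tolerance guards against rounding error)
tol <- 1e-8
c(raw = sum(eig_raw < -tol), sqrt = sum(eig_sqrt < -tol))
pcoa_cai$ac  # additive constant used by the Cailliez correction
```

The Lingoes correction is not available in `cmdscale()`; vegan's `wcmdscale()` offers it through its `add` argument.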

#### Non-metric Multidimensional Scaling (NMDS)

NMDS is a non-metric alternative to PCoA. It can use any dissimilarity measure among samples, and its main focus is on projecting the relative positions of sample points into a low-dimensional ordination space (two or three axes). The method is distance-based, not eigenvalue-based: it does not attempt to maximize the variance preserved by particular ordination axes, and the resulting projection can therefore be rotated in any direction.

The algorithm goes like this (simplified):

1. Specify the number of dimensions m you want to use (into which you want to scale down the distribution of samples in multidimensional space - that's why it's called scaling).
2. Construct an initial configuration of all samples in m dimensions as a starting point for the iterative process. The result of the whole procedure may depend on this step, so it is rather crucial - the initial configuration can be generated at random, but a better way is to help it a bit, e.g. by using a PCoA ordination as the starting position.
3. An iterative procedure reshuffles the objects in the given number of dimensions so that the distances among objects in the ordination space reflect their compositional dissimilarities as well as possible. The fit between the two is expressed as the so-called stress value - the lower the stress value, the better.
4. The algorithm stops when a new iteration cannot lower the stress value - the solution has been reached.
5. After the algorithm has finished, the final solution is rotated using PCA to ease its interpretation (that's why the final ordination diagram has ordination axes, even though the original algorithm doesn't produce any).
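The steps above can be traced in a minimal sketch using `isoMDS()` from the MASS package (which ships with R) on a simulated, purely hypothetical community matrix `comm`. This is an illustration of the listed steps under those assumptions, not necessarily the workflow used elsewhere on this site; in practice vegan's `metaMDS()` is commonly used instead, and it additionally tries several random starts.

```r
library(MASS)  # provides isoMDS(), Kruskal's non-metric MDS

set.seed(1)
# Hypothetical toy community matrix: 15 samples x 8 species (simulated counts)
comm <- matrix(rpois(120, lambda = 5), nrow = 15,
               dimnames = list(paste0("site", 1:15), paste0("sp", 1:8)))

# Bray-Curtis dissimilarity computed by hand to stay self-contained
bray <- as.dist(as.matrix(dist(comm, method = "manhattan")) /
                outer(rowSums(comm), rowSums(comm), "+"))

# Step 1: choose the number of dimensions
m <- 2

# Step 2: initial configuration taken from a PCoA ordination
init <- cmdscale(bray, k = m)

# Steps 3-4: iterate until the stress can no longer be lowered
# (or maxit is reached)
nmds <- isoMDS(bray, y = init, k = m, maxit = 200, trace = FALSE)
nmds$stress  # final stress, reported by isoMDS() in percent

# Step 5: rotate the final configuration with PCA, so that the first
# axis carries the largest variance of the sample scores
rotated <- prcomp(nmds$points)$x

# The rotation leaves the distances among samples unchanged
all.equal(as.vector(dist(rotated)), as.vector(dist(nmds$points)))
```

Because the rotation only changes the orientation of the configuration, the stress of the rotated solution is the same as that of the unrotated one.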