# Analysis of community ecology data in R

David Zelený

Section: Numerical classification

## Cluster analysis (hierarchical agglomerative classification)

### R functions

• `hclust` - calculates hierarchical cluster analysis. The main arguments are `d` (the distance matrix, required) and `method` (the agglomerative algorithm, `complete` by default), one of `ward.D`, `ward.D2`, `single`, `complete`, `average` (= UPGMA), `mcquitty` (= WPGMA), `median` (= WPGMC) or `centroid` (= UPGMC). Has its own `plot` method.
• `rect.hclust` - divides the dendrogram into a given number of groups (argument `k`) and draws rectangles around the samples in these groups (argument `border` specifies the colour of the rectangles).
• `cutree` - cuts the tree (dendrogram) into a given number of clusters (argument `k`) or at a given height, i.e. level of dissimilarity (argument `h`). Returns a vector with the assignment of samples to groups.
• `agnes` (library `cluster`) - contains six agglomerative algorithms, some not included in `hclust`. Has its own `plot` method.
• library `dendextend` - contains several functions that improve the representation of the dendrogram (e.g. plotting a dendrogram with branches of different colours).
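
The base R functions listed above can be combined as in the following minimal sketch. It assumes the built-in `USArrests` dataset as a stand-in for a sample × species matrix, Euclidean distance, the UPGMA algorithm and four groups; none of these choices comes from the text above, so replace them with whatever suits your data.

```r
# Stand-in data (an assumption): standardized built-in USArrests dataset
dat <- scale(USArrests)

# Euclidean distance matrix between samples
d <- dist(dat)

# Hierarchical agglomerative clustering with the UPGMA algorithm
hc <- hclust(d, method = "average")

# Draw the dendrogram and mark four groups with red rectangles
plot(hc, hang = -1, cex = 0.6)
rect.hclust(hc, k = 4, border = "red")

# Assignment of samples to the same four groups
groups <- cutree(hc, k = 4)
table(groups)
```

`cutree` returns a named integer vector, so `groups` can be used directly, for example as a grouping factor in further analyses.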

#### Note about Ward's hierarchical clustering algorithm

Murtagh & Legendre (2014) have shown that what the literature refers to as Ward's clustering algorithm is in fact two slightly different methods, only one of which is identical to the algorithm originally described by Ward. Both `hclust` and `agnes` accept `method = 'ward'`, but the argument means something different in each. While `hclust` implements both Ward algorithms (the genuine one, named `ward.D2`, and the second one, named `ward.D`), `agnes` implements only the genuine one. For historical reasons, `method = 'ward'` in `hclust` calls the `ward.D` algorithm rather than `ward.D2`. This means that `hclust` and `agnes`, if both are set to `method = 'ward'`, return slightly different results. To calculate the “genuine” Ward algorithm in both functions, set `method = 'ward.D2'` in `hclust` and `method = 'ward'` in `agnes` (which offers no other option for the Ward algorithm anyway).
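
The comparison can be checked with a short sketch like the one below. The dataset (`USArrests`), the Euclidean distance and the number of groups `k = 4` are illustrative assumptions, and the `cluster` package has to be installed. The two Ward variants of `hclust` may or may not give different groups on a particular dataset, so the last table is only something to inspect, not a guaranteed difference.

```r
library(cluster)  # provides agnes()

# Stand-in data (an assumption): standardized built-in USArrests dataset
dat <- scale(USArrests)
d <- dist(dat)

hc_d  <- hclust(d, method = "ward.D")   # the variant that method = "ward" refers to in hclust
hc_d2 <- hclust(d, method = "ward.D2")  # the genuine Ward algorithm
ag    <- agnes(d, method = "ward")      # agnes implements only the genuine algorithm

# Cut all three trees into the same number of groups
k <- 4
g_d  <- cutree(hc_d, k = k)
g_d2 <- cutree(hc_d2, k = k)
g_ag <- cutree(as.hclust(ag), k = k)

# Cross-tabulate the groupings: ward.D2 against agnes, then against ward.D
table(g_d2, g_ag)
table(g_d2, g_d)
```

If the first table has a single non-zero cell in each row and column, `ward.D2` in `hclust` and `ward` in `agnes` split the samples into the same groups.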