David Zelený


# K-means (non-hierarchical classification)

K-means is a non-hierarchical clustering algorithm, based on Euclidean distances among samples and using an iterative algorithm to find the solution. It minimizes the total error sum of squares (TESS), the same objective function as in Ward's algorithm. The number of clusters (k) must be defined a priori by the user. Distances other than Euclidean can also be used, but they first need to be converted into metric distances and submitted to PCoA. For example, the Bray-Curtis distance is not metric, but square-rooted Bray-Curtis distances are; one may therefore calculate them, submit them to PCoA, and then use all PCoA axes as the input matrix for K-means instead of the raw data.
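The Bray-Curtis-to-PCoA workflow described above can be sketched as follows. This is a minimal illustration using simulated data; the object names (`spe`, `bray`, `pcoa`) are hypothetical, and `vegdist` comes from the `vegan` package (assumed to be installed).

```r
library (vegan)  # for vegdist (Bray-Curtis distance)

# Simulated community matrix for illustration: 20 samples x 10 species
spe <- matrix (rpois (200, lambda = 2), nrow = 20)

bray <- vegdist (spe, method = 'bray')  # Bray-Curtis distances (not metric)
bray.sqrt <- sqrt (bray)                # square-rooted Bray-Curtis is metric

# PCoA on the metric distances, keeping all axes
pcoa <- cmdscale (bray.sqrt, k = nrow (spe) - 1, eig = TRUE)

# K-means on the PCoA axes instead of the raw data
cluster.kmeans <- kmeans (pcoa$points, centers = 5)
cluster.kmeans$cluster
```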
The K-means algorithm, similarly to other iterative methods (like NMDS), can get trapped in a local minimum, so it may be useful to repeat the analysis many times and choose the solution with the lowest overall TESS.

```r
cluster.kmeans <- kmeans (dis, centers = 5)
cluster.kmeans$cluster
```

```
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 4  4  4  4  5  5  1  2  4  4  5  4  4  4  5  2  1  2  4  1  4  5  2  2  4  3
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
 2  2  2  5  5  1  1  1  1  1  2  4  4  4  2  2  2  5  4  4  4  1  5  1  1  2
53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
 2  3  5  3  3  3  3  3  3  3  3  5  3  3  3  3  3  2  2  2  2  2  3  4  4  5
79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
 3  3  3  5  3  3  3  5  4  4  4  5  2  1  2  5  2  1  1
```
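Repeating the analysis many times and keeping the best solution does not need to be done by hand: the `nstart` argument of `kmeans` runs the algorithm from that many random starts and returns the run with the lowest total within-cluster sum of squares (i.e. the lowest TESS). A minimal sketch on simulated data (`dat` is a hypothetical name, not from the original text):

```r
# Simulated data for illustration: 50 samples x 4 variables
dat <- matrix (rnorm (200), nrow = 50)

# 100 random starts; kmeans keeps the solution with the lowest TESS
cluster.kmeans <- kmeans (dat, centers = 5, nstart = 100)
cluster.kmeans$tot.withinss  # overall TESS of the best of the 100 runs
```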