en:classification

Differences

This shows you the differences between two versions of the page.

en:classification [2019/03/19 00:21]
David Zelený [Is classification producing “objective” results?]
en:classification [2019/03/19 00:23]
David Zelený [Unsupervised vs supervised classification]
Line 27: Line 27:
  
 ===== Unsupervised vs supervised classification =====
-We can use classification methods in two alternative modes: unsupervised and supervised. Unsupervised methods searches for main gradients in the species composition, main discontinuities or homogeneous groups of samples, and returns the result that is dependent only on chosen method and internal structure of the dataset. In contrast, unsupervised classification methods uses external criteria to classify the dataset – you can supply them with information about how to process the classification, and it will apply it on the existing dataset. In the case of unsupervised classification, one is able to modify the results by subjective choices (like clustering algorithm, distance metric, cut-off threshold for forming the groups), but the main results is dependent on the dataset and the assignment of samples into groups may change even with slight changes of the dataset (e.g. by adding more samples). In contrast, supervised methods are simply reproducing the classification criteria supplied externally, and assignment of the sample to the group will remain the same despite changes in the structure of the dataset. Examples of unsupervised methods are TWINSPAN or cluster analysis, supervised methods (not discussed in detail on this website) include artificial neural networks (ANN), classification and regression trees (CART), random forests, COCKTAIL (logical formulas, designed for veg. data). Some methods, like K-means clustering, can run in either unsupervised or supervised mode – in the unsupervised mode the method first searches for the centroids of the predefined number of groups and assigns individual samples to these groups, while in the supervised mode the centroids are defined by user and the method just assigns the samples into these predefined groups.
+We can use classification methods in two alternative modes: unsupervised and supervised. **Unsupervised** methods search for main gradients in the species composition, main discontinuities or homogeneous groups of samples, and return a result that depends only on the chosen method and the internal structure of the dataset. In contrast, **supervised** classification methods use external criteria to classify the dataset – you supply them with rules for how to carry out the classification, and they apply these rules to the existing dataset. In unsupervised classification, one can modify the results by subjective choices (such as the clustering algorithm, the distance metric, or the cut-off threshold for forming the groups), but the main results depend on the internal structure of the dataset, and the assignment of samples to groups may change even with slight changes to the dataset (e.g. adding more samples). Supervised methods, by contrast, simply reproduce the externally supplied classification criteria, so the assignment of a sample to a group remains the same despite changes in the structure of the dataset.
+
+Examples of unsupervised methods are TWINSPAN and cluster analysis; supervised methods (not discussed in detail on this website) include artificial neural networks (ANN), classification and regression trees (CART), random forests and COCKTAIL (logical formulas, designed for vegetation data). Some methods, like K-means clustering, can run in either unsupervised or supervised mode – in the unsupervised mode the method first searches for the centroids of a predefined number of groups and then assigns individual samples to these groups, while in the supervised mode the centroids are defined by the user and the method just assigns the samples to these predefined groups.
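
The two modes of K-means described above can be illustrated with a minimal sketch in Python. It is not taken from this website: the function names (''nearest'', ''assign'', ''kmeans''), the one-dimensional toy data and the choice of the first ''k'' samples as starting centroids are all assumptions made for the illustration, and the unsupervised part is plain Lloyd's algorithm rather than any particular package's implementation.

```python
import math

def nearest(sample, centroids):
    """Index of the centroid closest to `sample` (Euclidean distance)."""
    distances = [math.dist(sample, c) for c in centroids]
    return distances.index(min(distances))

def assign(samples, centroids):
    """Supervised mode: centroids are fixed by the user, samples are only assigned."""
    return [nearest(s, centroids) for s in samples]

def kmeans(samples, k, max_iter=100):
    """Unsupervised mode: centroids are estimated from the data (Lloyd's algorithm).

    Assumption: the first k samples are used as starting centroids so that
    the result is reproducible.
    """
    centroids = [list(s) for s in samples[:k]]
    for _ in range(max_iter):
        labels = assign(samples, centroids)
        new_centroids = []
        for g in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == g]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[g])  # empty group keeps its centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return assign(samples, centroids), centroids

# Hypothetical data: one variable per sample (e.g. the cover of a single species).
samples = [(0.0,), (10.0,), (1.0,), (9.0,), (5.2,)]
extended = samples + [(20.0,), (21.0,)]   # the same dataset with two samples added
user_centroids = [(0.0,), (10.0,)]        # externally supplied group definitions

unsup_before, _ = kmeans(samples, k=2)
unsup_after, _ = kmeans(extended, k=2)
sup_before = assign(samples, user_centroids)
sup_after = assign(extended, user_centroids)

# Group of the intermediate sample (5.2) before and after adding samples:
print(unsup_before[4], unsup_after[4])  # 1 0 -> the unsupervised assignment changed
print(sup_before[4], sup_after[4])      # 1 1 -> the supervised assignment is unchanged
```

In this toy dataset, adding two samples shifts the estimated centroids, so the intermediate sample moves to the other group in the unsupervised mode. With user-defined centroids, its assignment does not depend on which other samples are present.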
  
  
en/classification.txt · Last modified: 2019/03/19 08:27 by David Zelený