en:data_preparation_examples

Differences

This shows you the differences between two versions of the page.

en:data_preparation_examples [2017/03/01 13:20]
David Zelený
en:data_preparation_examples [2019/01/21 00:36]
David Zelený
Line 8: Line 8:
 As an example of how to detect missing values in a matrix, let's use the [[en:data:danube|Danube meadow dataset]] with Ellenberg indicator values for individual species (a dataset of species attributes, with species in rows and tabulated Ellenberg indicator values in columns):
 <code rsplus>
-danube.ell <- read.delim ('http://www.davidzeleny.net/anadat-r/data-download/danube.ell.txt', row.names = 1)
+danube.ell <- read.delim ('https://raw.githubusercontent.com/zdealveindy/anadat-r/master/data/danube.ell.txt', row.names = 1)
 </code>
Line 17: Line 17:
 </code>
 <code>
-  Light Temp Cont Moist React Nutr
- Min. :4.000 Min. :4.000 Min. :2.000 Min. : 2.000 Min. :3.000 Min. :1.000
- 1st Qu.:6.250 1st Qu.:5.000 1st Qu.:3.000 1st Qu.: 4.000 1st Qu.:6.500 1st Qu.:4.000
- Median :7.000 Median :5.000 Median :3.000 Median : 5.000 Median :7.000 Median :5.000
- Mean :6.889 Mean :5.435 Mean :3.901 Mean : 5.524 Mean :6.851 Mean :4.938
- 3rd Qu.:7.000 3rd Qu.:6.000 3rd Qu.:5.000 3rd Qu.: 6.000 3rd Qu.:7.000 3rd Qu.:6.000
- Max. :8.000 Max. :6.000 Max. :7.000 Max. :10.000 Max. :8.000 Max. :9.000
- NA's :4 NA's :48 NA's :23 NA's :10 NA's :47 NA's :13
+     Light            Temp            Cont           Moist            React            Nutr
+ Min.   :4.000   Min.   :4.000   Min.   :2.000   Min.   : 2.000   Min.   :3.000   Min.   :1.000
+ 1st Qu.:6.250   1st Qu.:5.000   1st Qu.:3.000   1st Qu.: 4.000   1st Qu.:6.500   1st Qu.:4.000
+ Median :7.000   Median :5.000   Median :3.000   Median : 5.000   Median :7.000   Median :5.000
+ Mean   :6.889   Mean   :5.435   Mean   :3.901   Mean   : 5.524   Mean   :6.851   Mean   :4.938
+ 3rd Qu.:7.000   3rd Qu.:6.000   3rd Qu.:5.000   3rd Qu.: 6.000   3rd Qu.:7.000   3rd Qu.:6.000
+ Max.   :8.000   Max.   :6.000   Max.   :7.000   Max.   :10.000   Max.   :8.000   Max.   :9.000
+ NA's   :4       NA's   :48      NA's   :23      NA's   :10       NA's   :47      NA's   :13
 </code>
 The bottom row in the output of ''summary'' shows the number of missing values in each variable. But which values are missing?
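One possible way to answer that question is sketched below. It is not taken from either revision of the page. The data frame ''ell'' and its values are hypothetical, invented here as a small stand-in for ''danube.ell'' so that the example runs without downloading anything. The functions used (''is.na'', ''colSums'', ''rowSums'', ''which'' with ''arr.ind = TRUE'') are all base R.

<code rsplus>
# Hypothetical toy data standing in for danube.ell (values invented for illustration)
ell <- data.frame(
  Light = c(7, NA, 6),
  Moist = c(5, 4, NA),
  row.names = c("sp1", "sp2", "sp3")
)

# Logical matrix with TRUE wherever a value is missing
missing <- is.na(ell)

# Number of missing values per variable (the same counts as the NA's row of summary)
colSums(missing)

# Row and column index of every missing value
which(missing, arr.ind = TRUE)

# Names of the species (rows) that have at least one missing value
rownames(ell)[rowSums(missing) > 0]
</code>

Assuming ''danube.ell'' has been read in as a data frame with species names as row names, replacing ''ell'' with ''danube.ell'' should give the same kind of overview for the real dataset.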
en/data_preparation_examples.txt · Last modified: 2019/01/21 00:36 by David Zelený