en:forward_sel_examples

Differences

This shows you the differences between two versions of the page.

en:forward_sel_examples [2019/04/06 08:39]
David Zelený
en:forward_sel_examples [2019/04/06 08:46]
David Zelený [Example 2: CCA with forward selection on data from Carpathian wetlands]
Line 568: Line 568:
 ==== Example 2: CCA with forward selection on data from Carpathian wetlands ====
 
-This example is using the same data as the Example 1 above, namely composition of vascular plants in [[en:data:wetlands|Carpathian wetlands]] and chemical variables measured in fen water running through them. While in the Example 1 we analysed the data using tb-RDA (RDA applied on data after Hellinger transformation), here we do this using CCA (note: no Hellinger transformation is done). Since the function ''forward.sel'' from ''adespatial'' package is using only RDA algorithm, we cannot use it here; instead, we will do the calculation using ''ordiR2step'' from ''vegan'', which also implements variable selection with double stopping criterie: the selection is finished if the new variable to be selected is not significant at certan alpha level, or if the adjusted R<sup>2</sup> explained by the model with this variable exceeds the adjusted R<sup>2</sup> of the global model.
+This example is using the same data as the Example 1 above, namely composition of vascular plants in [[en:data:wetlands|Carpathian wetlands]] and chemical variables measured in fen water running through them. While in Example 1 we analysed the data using tb-RDA (RDA applied on data after Hellinger transformation), here we do this using CCA (note: no Hellinger transformation is done). Since the function ''forward.sel'' from ''adespatial'' package is using only RDA algorithm, we cannot use it here; instead, we will do the calculation using ''ordiR2step'' from ''vegan'', which also implements variable selection with double stopping criteria: the selection is finished if the new variable to be selected is not significant at certain alpha level, or if the adjusted R<sup>2</sup> explained by the model with this variable exceeds the adjusted R<sup>2</sup> of the global model.
 
 <code rsplus>
Line 616: Line 616:
 sel.osR2.cca$anova
 </code>
+<file>
                   R2.adj Df    AIC      F Pr(>F)
 + Ca            0.082170  1 385.90 7.1738  0.001 ***
Line 625: Line 626:
 + Corg          0.136553  1 387.17 1.3959  0.027 *
 <All variables> 0.142700
-</code>
+</file>
  
-The first five variables (Ca, conductivity, Si, NH3 and NO3) are the same as in tb-RDA analysis above, the last two differ. If we increase the number of permutations to 49,000 (to reduce the lowest P-value we can obtain - but this takes quite some time!) and adjust the P-values with Holm's correction for multiple testing issue, only the first four variables will be selected (compared to the first five in the case of tb-RDA):
+The first five variables (Ca, conductivity, Si, NH3 and NO3) are the same as in tb-RDA analysis above, the last two differ. If we increase the number of permutations to 49,999 (to reduce the lowest P-value we can obtain - but this takes quite some time!) and adjust the P-values with Holm's correction for multiple testing issue, only the first four variables will be selected (compared to the first five in the case of tb-RDA):
 <code rsplus>
 sel.osR2.cca <- ordiR2step (cca.0, scope = formula (cca.all), direction = 'forward', R2scope = adjRsq.cca, permutations = 49999, trace = F)
Line 635: Line 636:
 </code>
 <file>
-                  R2.adj Df    AIC      F  Pr(>F)    
+                  R2.adj Df    AIC      F  Pr(>F)
 + Ca            0.082144  1 385.90 7.1738 0.00028 ***
 + conduct       0.095628  1 385.81 2.0268 0.00156 **
Line 647: Line 648:
 </file>
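
The reasoning behind the larger number of permutations and the Holm's correction can be sketched in base R. This is only an illustration: the raw P-values below are hypothetical (they are not taken from the wetlands analysis), ''min.p'' is a helper name introduced here, and we assume that base R ''p.adjust'' is an acceptable way to apply the correction.

```r
# Lowest P-value a permutation test can return is 1 / (number of permutations + 1)
min.p <- function (n.perm) 1 / (n.perm + 1)
min.p (999)     # floor with 999 permutations
min.p (49999)   # floor with 49,999 permutations

# Hypothetical raw P-values of five variables, in the order of their selection
p.raw <- c (0.0002, 0.001, 0.004, 0.03, 0.04)

# Holm's correction: the smallest P-value is multiplied by the number of tests,
# the next one by (number of tests - 1), and so on, keeping the values non-decreasing
p.holm <- p.adjust (p.raw, method = 'holm')
p.holm

# Variables that remain significant at alpha = 0.05 after the correction
selected <- p.holm <= 0.05
selected
```

With these made-up values only the first three variables pass the corrected threshold. The point is that the raw P-values have to be able to get small enough to survive the multiplication, which is why the floor given by the number of permutations matters.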
  
-You may have noticed that the use of ''ordiR2step'' function with CCA takes considerably longer than when applied on RDA (or tb-RDA as above). This is because in the case of CCA, [[en:expl_var#adjusted_r2|adjusted R<sup>2</sup> needs to be calculated using permutation method]] introduced by [[en:references|Peres-Neto et al. (2006)]], while in the case of RDA the adjusted R<sup>2</sup> can be calculated analytically by Ezekiel's formula. This also means that the resulting adjusted R<sup>2</sup> values will slighly change between calculations.
+You may have noticed that the use of ''ordiR2step'' function with CCA takes considerably longer than when applied on RDA (or tb-RDA as above). This is because in the case of CCA, [[en:expl_var#adjusted_r2|adjusted R2 needs to be calculated using the permutation method]] introduced by [[en:references|Peres-Neto et al. (2006)]], while in the case of RDA the adjusted R<sup>2</sup> can be calculated analytically by Ezekiel's formula. This also means that the resulting adjusted R<sup>2</sup> values will slightly change between calculations, because the adjusted value calculation is using mean of R2 explained by randomized environmental variables (by default 1000 randomizations is used; this number can be increased by including the argument ''R2permutations'' in the ''ordiR2step'' function with higher number).
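
Why the permutation-based adjusted R<sup>2</sup> changes between runs can be shown on a toy example. The sketch below is not CCA and does not use ''vegan''; it swaps in an ordinary linear regression on simulated data (all object names are hypothetical) and assumes the adjustment has the form 1 - (1 - R2) / (1 - mean R2 of randomized explanatory variables), so that the result can be compared with Ezekiel's analytical value.

```r
set.seed (1)                               # toy data, not the wetlands dataset
n <- 50
env <- matrix (rnorm (n * 3), ncol = 3)    # three simulated explanatory variables
y <- env[, 1] + rnorm (n)                  # response related to the first one only

# Ordinary R2 of the regression of y on all columns of x
r2 <- function (y, x) summary (lm (y ~ x))$r.squared

# Permutation-based adjustment (assumed form, see the text above)
adj.r2.perm <- function (y, x, n.perm = 1000) {
  # R2 explained by explanatory variables with randomly reshuffled rows
  r2.rand <- replicate (n.perm, r2 (y, x[sample (nrow (x)), , drop = FALSE]))
  1 - (1 - r2 (y, x)) / (1 - mean (r2.rand))
}

a1 <- adj.r2.perm (y, env)
a2 <- adj.r2.perm (y, env)
c (a1, a2)                                      # two runs: close, but not identical

ezekiel <- summary (lm (y ~ env))$adj.r.squared # analytical value for comparison
ezekiel
```

Increasing ''n.perm'' in this sketch makes the two runs agree more closely, at the cost of longer calculation, which mirrors the role of ''R2permutations'' described above.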
  
  
  
  
en/forward_sel_examples.txt · Last modified: 2019/04/06 08:46 by David Zelený