en:forward_sel_examples

Differences

This shows you the differences between two versions of the page.

en:forward_sel_examples [2019/04/06 08:39]
David Zelený
en:forward_sel_examples [2019/04/06 08:43]
David Zelený [Example 2: CCA with forward selection on data from Carpathian wetlands]
Line 568: Line 568:
 ==== Example 2: CCA with forward selection on data from Carpathian wetlands ====
 
-This example is using the same data as the Example 1 above, namely composition of vascular plants in [[en:data:wetlands|Carpathian wetlands]] and chemical variables measured in fen water running through them. While in the Example 1 we analysed the data using tb-RDA (RDA applied on data after Hellinger transformation), here we do this using CCA (note: no Hellinger transformation is done). Since the function ''forward.sel'' from ''adespatial'' package is using only RDA algorithm, we cannot use it here; instead, we will do the calculation using ''ordiR2step'' from ''vegan'', which also implements variable selection with double stopping criterie: the selection is finished if the new variable to be selected is not significant at certan alpha level, or if the adjusted R<sup>2</sup> explained by the model with this variable exceeds the adjusted R<sup>2</sup> of the global model.
+This example is using the same data as the Example 1 above, namely composition of vascular plants in [[en:data:wetlands|Carpathian wetlands]] and chemical variables measured in fen water running through them. While in Example 1 we analysed the data using tb-RDA (RDA applied on data after Hellinger transformation), here we do this using CCA (note: no Hellinger transformation is done). Since the function ''forward.sel'' from ''adespatial'' package is using only RDA algorithm, we cannot use it here; instead, we will do the calculation using ''ordiR2step'' from ''vegan'', which also implements variable selection with double stopping criteria: the selection is finished if the new variable to be selected is not significant at certain alpha level, or if the adjusted R<sup>2</sup> explained by the model with this variable exceeds the adjusted R<sup>2</sup> of the global model.
 
 <code rsplus>
Line 616: Line 616:
 sel.osR2.cca$anova
 </code>
+<file>
                   R2.adj Df    AIC      F Pr(>F)
 + Ca            0.082170  1 385.90 7.1738  0.001 ***
Line 625: Line 626:
 + Corg          0.136553  1 387.17 1.3959  0.027 *
 <All variables> 0.142700
-</code>
+</file>
 
 The first five variables (Ca, conductivity, Si, NH3 and NO3) are the same as in tb-RDA analysis above, the last two differ. If we increase the number of permutations to 49,000 (to reduce the lowest P-value we can obtain - but this takes quite some time!) and adjust the P-values with Holm's correction for multiple testing issue, only the first four variables will be selected (compared to the first five in the case of tb-RDA):
Line 635: Line 636:
 </code>
 <file>
-                  R2.adj Df    AIC      F  Pr(>F)    
+                  R2.adj Df    AIC      F  Pr(>F)
 + Ca            0.082144  1 385.90 7.1738 0.00028 ***
 + conduct       0.095628  1 385.81 2.0268 0.00156 **
Line 647: Line 648:
 </file>
 
-You may have noticed that the use of ''ordiR2step'' function with CCA takes considerably longer than when applied on RDA (or tb-RDA as above). This is because in the case of CCA, [[en:expl_var#adjusted_r2|adjusted R<sup>2</sup> needs to be calculated using permutation method]] introduced by [[en:references|Peres-Neto et al. (2006)]], while in the case of RDA the adjusted R<sup>2</sup> can be calculated analytically by Ezekiel's formula. This also means that the resulting adjusted R<sup>2</sup> values will slighly change between calculations.
+You may have noticed that the use of ''ordiR2step'' function with CCA takes considerably longer than when applied on RDA (or tb-RDA as above). This is because in the case of CCA, [[en:expl_var#adjusted_r2|adjusted R2 needs to be calculated using the permutation method]] introduced by [[en:references|Peres-Neto et al. (2006)]], while in the case of RDA the adjusted R<sup>2</sup> can be calculated analytically by Ezekiel's formula. This also means that the resulting adjusted R<sup>2</sup> values will slightly change between calculations.
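
For reference, the two ways of computing adjusted R<sup>2</sup> contrasted in the paragraph above can be written out as follows. This is a sketch in standard notation and is not part of either revision of the page: the symbols n (number of samples), m (number of explanatory variables in the model) and the bar over the permuted R<sup>2</sup> (its mean across permutations of the data) are notation assumed here for illustration.

<code latex>
% Ezekiel's formula: analytical, applicable to RDA and tb-RDA
R^2_{\mathrm{adj}} = 1 - \frac{n - 1}{n - m - 1}\,\bigl(1 - R^2\bigr)

% Permutation-based form of Peres-Neto et al. (2006): used for CCA
R^2_{\mathrm{adj}} = 1 - \frac{1}{1 - \overline{R^2_{\mathrm{perm}}}}\,\bigl(1 - R^2\bigr)
</code>

Because the mean permuted R<sup>2</sup> is estimated from a finite number of random permutations, it differs a little from run to run, which is consistent with the adjusted R<sup>2</sup> values for CCA changing slightly between calculations.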
  
  
  
  
en/forward_sel_examples.txt · Last modified: 2019/04/06 08:46 by David Zelený