This shows you the differences between two versions of the page.

en:suppl_vars [2019/03/16 06:20] David Zelený [Multiple testing issue and available corrections]
en:suppl_vars [2020/04/09 08:42] David Zelený

Line 49:

<imgcaption multiple-testing|Multiple testing issue. I generated two normally distributed random variables and tested the significance of their regression with a parametric F-test. I repeated this 100 times, each time with newly generated random variables. Significant regressions (P < 0.05) are displayed with a red regression line. Of the 100 analyses, four are significant at the 0.05 level (4%, close to the 5% expected by chance).>{{ :obrazky:multiple-testing-issue.jpg?direct |}}</imgcaption>
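The experiment in the caption can be sketched in a few lines. This is only an illustration under stated assumptions: to stay dependency-free it swaps the parametric F-test for a permutation test of the correlation (in a simple regression, testing the slope and testing the correlation are equivalent), and the sample size of 20, the 199 permutations, the seed and all function names are hypothetical choices, not taken from the original analysis.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p(x, y, n_perm, rng):
    """Two-sided permutation P-value for the relationship between x and y."""
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    # The observed value counts as one of the permutations.
    return (hits + 1) / (n_perm + 1)


def count_false_positives(n_analyses=100, n_obs=20, n_perm=199,
                          alpha=0.05, seed=1):
    """Count 'significant' results among regressions of pure noise on pure noise."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_analyses):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        if permutation_p(x, y, n_perm, rng) < alpha:
            count += 1
    return count


n_significant = count_false_positives()
print(n_significant, "of 100 regressions on random data have P < 0.05")
```

The exact count depends on the seed, but over many runs it should hover around five per hundred analyses, even though no real relationship exists in any of them.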

- The solution is either to avoid doing multiple tests or to apply one of the correction methods. Perhaps the best known is the Bonferroni correction, which is, however, very conservative: you simply multiply each resulting P-value by the overall number of tests //m// done in the analysis (P<sub>adj</sub> = P * m). It becomes detrimental when the number of tests is high, since it reduces the power of the test. Less conservative are the Holm and false discovery rate (FDR) corrections. More about the multiple testing issue can be found in my [[https://davidzeleny.net/blog/2019/03/15/about-p-values-and-multiple-testing-issue/|blog post]].

+ The solution is either to avoid doing multiple tests or to apply one of the correction methods. Perhaps the best known is the Bonferroni correction, which is, however, very conservative: you simply multiply each resulting P-value by the overall number of tests //m// done in the analysis (P<sub>adj</sub> = P * m). It becomes detrimental when the number of tests is high, since it reduces the power of the test. Less conservative are the Holm and false discovery rate (FDR) corrections. More about the multiple testing issue can be found in my [[https://davidzeleny.net/blog/2019/03/15/about-p-values-and-multiple-testing-issue/|blog post]] and also in the book //Statistics done wrong// by Alex Reinhart, chapter //[[https://www.statisticsdonewrong.com/p-value.html|The p value and the base rate fallacy]]//.
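To make the three corrections concrete, here is a minimal sketch in plain Python. The function names and the example P-values are hypothetical, adjusted values are capped at 1, and the FDR variant shown is assumed to be the Benjamini-Hochberg procedure, since the text does not say which FDR method is meant.

```python
def bonferroni(p_values):
    """Multiply every P-value by the number of tests m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down: the k-th smallest P-value is multiplied by (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):  # rank 0 is the smallest P-value
        running_max = max(running_max, p_values[i] * (m - rank))
        adjusted[i] = min(1.0, running_max)
    return adjusted


def fdr_bh(p_values):
    """Benjamini-Hochberg FDR: the k-th smallest P-value is multiplied by m / k."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):  # start from the largest P-value
        rank = m - k
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p_raw = [0.001, 0.01, 0.02, 0.04, 0.3]  # made-up P-values for illustration
print("Bonferroni:", bonferroni(p_raw))
print("Holm:      ", holm(p_raw))
print("FDR (BH):  ", fdr_bh(p_raw))
```

For these made-up values, Bonferroni leaves only the first result below 0.05, Holm leaves two, and Benjamini-Hochberg leaves three, which shows the ordering from most to least conservative described above.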

In the example above, where nine random and real supplementary variables are related to unconstrained ordination axes, applying the multiple testing correction (here Bonferroni, <imgref envfit_adj>) makes all results for the random variables non-significant (for the real variables, one more result becomes non-significant compared to the uncorrected results). Because the test here is permutational, the minimum P-value depends on the number of permutations; if there are many supplementary variables (and thus many tests), it may be necessary to increase the number of permutations to lower the minimum P-value that can be calculated. For example, if the number of permutations is set to 199 (e.g. because of the calculation time), the minimum P-value that can be reached is P<sub>min</sub> = 1/(199+1) = 0.005; if there are ten variables and the correction for multiple testing is done by Bonferroni (P-value * number of tests), the best possible corrected P-value is 0.005*10 = 0.05, which means that we would be unable to reject the null hypothesis at P < 0.05.
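The arithmetic above can be checked with a short sketch. The helper names are hypothetical, and exact fractions are used instead of floating point so that the boundary case (a corrected P-value of exactly 0.05) is compared without rounding error.

```python
from fractions import Fraction
import math


def min_p_value(n_perm):
    """Smallest P-value a permutation test with n_perm permutations can return."""
    return Fraction(1, n_perm + 1)


def best_bonferroni_p(n_perm, n_tests):
    """Smallest Bonferroni-corrected P-value reachable, capped at 1."""
    return min(Fraction(1), min_p_value(n_perm) * n_tests)


def min_permutations(n_tests, alpha=Fraction(1, 20)):
    """Fewest permutations for which the corrected P-value can drop below alpha.

    We need n_tests / (n_perm + 1) < alpha, i.e. n_perm + 1 > n_tests / alpha.
    """
    return math.floor(n_tests / alpha)


print(float(min_p_value(199)))            # 0.005
print(float(best_bonferroni_p(199, 10)))  # 0.05, not below 0.05
print(min_permutations(10))               # 200
```

With ten tests and a threshold of 0.05, 199 permutations are therefore one short: at least 200 are needed before any Bonferroni-corrected result can fall below the threshold.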

en/suppl_vars.txt · Last modified: 2020/04/09 08:42 by David Zelený