Classification based hypothesis testing in neuroscience: Below-chance level classification rates and overlooked statistical properties of linear parametric classifiers
Hum Brain Mapp 37: 1842-55. doi: 10.1002/hbm.23140. Epub 2016 Mar 26.
Type of Publication: Journal Articles 2001 - 2017
Multivariate pattern analysis (MVPA) has recently become a popular tool for data analysis. Often, classification accuracy as quantified by correct classification rate (CCR) is used to illustrate the size of the effect under investigation. However, we show that in low sample size (LSS), low effect size (LES) data, which is typical in neuroscience, the distribution of CCRs from cross-validation of linear MVPA is asymmetric and can show classification rates considerably below what would be expected from chance classification. Conversely, the mode of the distribution in these cases is above expected chance levels, leading to a spuriously high number of above-chance CCRs. This unexpected distribution has strong implications when using MVPA for hypothesis testing. Our analyses warrant the conclusion that CCRs do not reliably reflect the size of the effect under investigation. Moreover, the skewness of the null distribution precludes the use of many standard parametric tests to assess significance of CCRs. We propose that MVPA results should be reported in terms of P values, which are estimated using randomization tests. Also, our results show that cross-validation procedures using a low number of folds, e.g., twofold, are generally more sensitive, even though the average CCRs are often considerably lower than those obtained using a higher number of folds.
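The randomization-test recommendation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a two-class problem, a nearest-class-mean rule as a stand-in for a linear classifier, stratified k-fold cross-validation, and simulated Gaussian data. The function names `cv_ccr` and `permutation_p_value`, the sample sizes, and the effect size are all hypothetical choices made for illustration.

```python
import numpy as np


def cv_ccr(X, y, n_folds, rng):
    """Cross-validated correct classification rate (CCR) for labels 0/1.

    Uses a nearest-class-mean rule, which is a simple linear classifier
    (an assumption of this sketch, not the classifier from the paper).
    """
    n = len(y)
    # Stratified fold assignment: spread each class evenly over the folds.
    fold_id = np.empty(n, dtype=int)
    for label in (0, 1):
        idx = rng.permutation(np.flatnonzero(y == label))
        fold_id[idx] = np.arange(len(idx)) % n_folds

    correct = 0
    for fold in range(n_folds):
        test = fold_id == fold
        train = ~test
        m0 = X[train & (y == 0)].mean(axis=0)
        m1 = X[train & (y == 1)].mean(axis=0)
        # Linear decision rule: sign of the projection onto the mean difference.
        w = m1 - m0
        b = -0.5 * w @ (m0 + m1)
        pred = (X[test] @ w + b > 0).astype(int)
        correct += np.sum(pred == y[test])
    return correct / n


def permutation_p_value(X, y, n_folds=2, n_perm=1000, seed=0):
    """Estimate a P value for the observed CCR by shuffling the labels."""
    rng = np.random.default_rng(seed)
    observed = cv_ccr(X, y, n_folds, rng)
    # Null distribution: repeat the full cross-validation on permuted labels.
    null = np.array(
        [cv_ccr(X, rng.permutation(y), n_folds, rng) for _ in range(n_perm)]
    )
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value, null


# Hypothetical low-sample-size, low-effect-size data set.
rng_data = np.random.default_rng(42)
n_per_class, n_features = 20, 5
y_demo = np.repeat([0, 1], n_per_class)
X_demo = rng_data.normal(size=(2 * n_per_class, n_features))
X_demo[y_demo == 1] += 0.3  # small, made-up effect

ccr, p_value, null_ccrs = permutation_p_value(X_demo, y_demo, n_folds=2, n_perm=500)
print(f"CCR = {ccr:.3f}, permutation P = {p_value:.3f}")
print(f"null CCR mean = {null_ccrs.mean():.3f}, min = {null_ccrs.min():.3f}")
```

Because the null distribution is built by rerunning the whole cross-validation on permuted labels, the P value makes no assumption of symmetry around the nominal chance level, which is the property the abstract says parametric tests lack. Changing `n_folds` in the sketch allows a rough comparison of twofold against higher-fold schemes on simulated data.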