IMSI, Publication Details: Efficient Calculation of Empirical P-values for Association Testing of Binary Classifications

Efficient Calculation of Empirical P-values for Association Testing of Binary Classifications

Kostis Zagganas, Thanasis Vergoulis, Spiros Skiadopoulos, Theodore Dalamagas

SSDBM 2020

2020

Conference/Workshop

Contact persons: Konstantinos Zagganas, Thanasis Vergoulis, Theodore Dalamagas
Relevant research project: ELIXIR-GR

Abstract. Investigating whether two different classifications of a population are associated is an interesting problem in many scientific fields. For this reason, various statistical tests that reveal this type of association have been developed, the most popular being Fisher’s exact test. However, it has recently been shown that in some cases this test fails to produce accurate results. An alternative approach, known as randomization tests, was introduced to alleviate this issue; however, such tests are computationally intensive. In this paper, we introduce two novel indexing approaches that exploit frequently occurring patterns in classifications to avoid performing redundant computations during the analysis. We conduct a comprehensive set of experiments using real datasets and application scenarios to show that our approaches always outperform the state of the art, with one approach being faster by an order of magnitude.
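
For readers unfamiliar with randomization tests, the following is a minimal sketch of how an empirical p-value for the association of two binary classifications can be computed by naive permutation. It is not the indexing approach of the paper, which this page does not describe. The function names, the choice of overlap count as the test statistic, and the add-one correction are all assumptions made for illustration.

```python
import random


def overlap(a, b):
    """Count the items that both binary classifications label as 1."""
    return sum(x & y for x, y in zip(a, b))


def empirical_p_value(a, b, n_permutations=1000, seed=0):
    """Naive randomization test (illustrative, not the paper's method).

    a, b: equal-length lists of 0/1 labels over the same population.
    Returns the fraction of random relabelings of b whose overlap with a
    is at least as large as the observed overlap, with an add-one
    correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = overlap(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # random relabeling that keeps the number of 1s
        if overlap(a, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical population of 100 items with identical classifications.
    a = [1] * 10 + [0] * 90
    b = [1] * 10 + [0] * 90
    print(empirical_p_value(a, b))
```

Every permutation recomputes the overlap from scratch, which illustrates why such tests are computationally intensive when many classifications must be compared.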