TKDE, DOI: 10.1109/TKDE.2019.2941206 (early access)
Abstract. As the rate at which scientific work is published continues to increase, so does the need to discern high-impact publications. In recent years, several approaches have sought to rank publications based on their expected citation-based impact. Despite this attention, the research area has not been studied systematically. Past literature often fails to distinguish between short-term impact (the current popularity of an article) and long-term impact (the overall influence of an article). Moreover, the evaluation methodologies applied vary widely and are inconsistent. In this work, we aim to fill these gaps by studying impact-based ranking both theoretically and experimentally. First, we provide explicit definitions for short-term and long-term impact, and introduce the associated ranking problems. Then, we identify and classify the most important ideas employed by state-of-the-art methods. After studying the evaluation methodologies used in the literature, we propose a specific benchmark framework that helps to better differentiate the effectiveness of methods across impact aspects. Using this framework, we investigate (1) the practical difference between ranking by short-term and by long-term impact, and (2) the effectiveness and efficiency of ranking methods in different settings. To avoid reporting discipline-dependent results, we perform our experiments on four datasets from different scientific disciplines.