Coverage and diversity aware top-k query for spatio-temporal posts

Paras Mehta, Dimitrios Skoutas, Dimitris Sacharidis, Agnès Voisard
Proceedings of the 24th ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2016: 37:1-37:10
Abstract. Large amounts of user-generated content are posted daily on the Web, including textual, spatial and temporal information. Exploiting this content to detect, analyze and monitor events and topics that have a potentially large span in space and time requires efficient retrieval and ranking based on criteria including all three dimensions. In this paper, we introduce a novel type of spatial-temporal-keyword query that combines keyword search with the task of maximizing the spatio-temporal coverage and diversity of the returned top-k results. We first describe a baseline algorithm based on related search results diversification problems. Then, we develop an efficient approach which exploits a hybrid spatial-temporal-keyword index to drastically reduce query execution time. To that end, we extend two state-of-the-art indices for top-k spatio-textual queries and describe how our proposed approach can be applied on top of them. We evaluate the efficiency of our algorithms by conducting experiments on two large, real-world datasets containing geotagged tweets and photos.
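The abstract does not spell out the query's objective function or the baseline algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a generic greedy, MMR-style heuristic of the kind used in search result diversification: posts are first filtered by keyword, then picked one at a time by a score that mixes coverage (how close a post is, on average, to all matching posts) with diversity (how far it is from the posts already picked). The names `Post`, `st_distance`, `greedy_top_k`, the weights `w` and `lam`, and the normalization constants `max_space` and `max_time` are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Post:
    """A hypothetical spatio-temporal post: location, timestamp, keywords."""
    pid: int
    x: float
    y: float
    t: float
    keywords: frozenset


def st_distance(a, b, max_space, max_time, w=0.5):
    """Normalized spatio-temporal distance in [0, 1] (assumed definition).

    w weighs the spatial part against the temporal part.
    """
    space = min(hypot(a.x - b.x, a.y - b.y) / max_space, 1.0)
    time = min(abs(a.t - b.t) / max_time, 1.0)
    return w * space + (1 - w) * time


def greedy_top_k(posts, query_keywords, k, max_space, max_time, lam=0.5):
    """Greedily select up to k keyword-matching posts.

    lam trades off coverage (lam) against diversity (1 - lam).
    """
    query = frozenset(query_keywords)
    # Keyword filter: a post matches if it contains every query keyword.
    candidates = [p for p in posts if query <= p.keywords]
    if not candidates:
        return []

    def dist(a, b):
        return st_distance(a, b, max_space, max_time)

    # Coverage of a post: mean proximity to all matching posts.
    coverage = {
        p.pid: sum(1 - dist(p, c) for c in candidates) / len(candidates)
        for p in candidates
    }

    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(p):
            # Diversity: distance to the nearest already-selected post
            # (1.0 when nothing has been selected yet).
            diversity = min((dist(p, s) for s in selected), default=1.0)
            return lam * coverage[p.pid] + (1 - lam) * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

This sketch scans all matching posts and costs time quadratic in their number, which is the kind of overhead an index-based approach such as the one announced in the abstract is meant to avoid.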