Boosting toponym interlinking by paying attention to both machine and deep learning
Proceedings of the Sixth International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data. June 2020. Article No.: 4. Pages 1–5
2020
Conference/Workshop
- Contact persons: Giorgos Giannopoulos, Konstantinos Alexis, Vassilis Kaffes
- Relevant research project: LinkGeoML
Abstract.
Toponym interlinking is the problem of identifying the same spatio-textual entities across two or more data sources, based exclusively on their names. It is a significant task in geospatial data management and integration, with applications in fields such as geomarketing, cadastre, and navigation. Previous works have assessed the effectiveness of unsupervised string similarity functions, while more recent ones have deployed similarity-based Machine Learning techniques and language model-based Deep Learning techniques, achieving significantly higher interlinking accuracy. In this paper, we demonstrate the suitability of Attention-based neural networks for the problem and show that each of these approaches has merit, proposing a hybrid scheme that achieves the highest accuracy reported on toponym interlinking on the widely used Geonames dataset.
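To make the similarity-based baseline concrete, the following is a minimal sketch of how a pair of toponyms could be turned into string similarity scores and matched with a threshold. It is an illustration only, not the method of the paper: the function names (`levenshtein`, `similarity_features`, `is_match`), the choice of measures, and the 0.8 threshold are all assumptions. In a supervised setting, scores like these would more plausibly serve as input features to a classifier than be thresholded directly.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (classic dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity_features(name_a: str, name_b: str) -> dict:
    """Hypothetical feature set: three similarity scores in [0, 1]."""
    a, b = name_a.casefold().strip(), name_b.casefold().strip()
    longest = max(len(a), len(b)) or 1  # avoid division by zero
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return {
        "edit_sim": 1.0 - levenshtein(a, b) / longest,
        "ratio": SequenceMatcher(None, a, b).ratio(),
        "token_jaccard": len(tokens_a & tokens_b) / len(union) if union else 1.0,
    }


def is_match(name_a: str, name_b: str, threshold: float = 0.8) -> bool:
    """Unsupervised decision: threshold on one score (0.8 is an assumed value)."""
    return similarity_features(name_a, name_b)["edit_sim"] >= threshold
```

For example, `is_match("Athens", "Athen")` returns `True` under these assumptions, since one edit over six characters gives a normalized similarity of about 0.83, while `is_match("Athens", "Paris")` returns `False`.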