Toulouse: Learning Join Order Optimization Policies for Rule-based Data Engines

Antonios Karvelas, Yannis Foufoulas, Alkis Simitsis, Yannis Ioannidis
Abstract. In recent times, several research works have explored the idea of leveraging machine learning techniques to improve or even replace core components of traditional database architectures, such as the query optimizer and selectivity and cardinality cost estimators. These efforts often rely on existing cost-based optimizers and cost models to avoid a cold start, and build on top of the optimizer's decisions. In this paper, we investigate whether learning could also be beneficial in rule-based optimizers, for known and unknown workloads alike. As a proof of concept, we use MonetDB, an open-source, column-store analytics data engine, and explore whether a learning model based on Graph Neural Networks that is trained on a cost-based engine, such as PostgreSQL, could improve the decisions of MonetDB's optimizer. Our experimental results reveal deficiencies in MonetDB's query execution plans, especially for queries with long chains of join operators, and point to potential opportunities for exploiting learning techniques.