User-Defined Functions in Modern Data Engines

Yannis Foufoulas, Alkis Simitsis
IEEE ICDE 2023
2023
Conference/Workshop
Abstract. Modern data management applications involve complex processing tasks over large volumes of data. Although this falls naturally within the scope of relational databases, many such tasks cannot be expressed in SQL and require additional expressive power achieved via user-defined functions (UDFs). However, efficient processing of UDFs in data engines hinges on dealing with the impedance mismatch between UDF execution and SQL processing. In recent years, the problem of efficient UDF execution in modern data engines has gained significant traction. In this tutorial, we present recent advancements in this area, involving a broad scope of solutions ranging from algebraic, cost-based optimization to low-level, physical query optimization, compilation, and execution. We also describe limitations and open issues, and discuss promising future research directions.