QFusor: A UDF Optimizer Plugin for SQL Databases

K. Chasialis, T. Palaiologou, Y. Foufoulas, A. Simitsis, Y. Ioannidis
Abstract. Modern data applications in areas such as text mining, document analysis, and data science involve complex algorithms and logic that cannot be expressed in SQL. Therefore, SQL databases employ user-defined functions (UDFs) to extend their supported functionality. However, this comes at a significant performance cost, as UDFs routinely become the bottleneck in query execution. To deal with this problem, we present QFusor, an optimizer plugin for UDF queries in relational databases. QFusor minimizes the performance overheads introduced by the impedance mismatch between the UDF and SQL execution environments by employing techniques such as vectorization, parallelization, tracing JIT compilation, and operator fusion for various types of UDFs (scalar, aggregate, and table UDFs) and relational operators. QFusor follows a pluggable, engine-agnostic design and can work with several popular SQL databases, offering a significant boost in their UDF query performance.
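
To make the idea of operator fusion for scalar UDFs concrete, the following is a minimal Python sketch. It is not QFusor's implementation: the UDFs `lower` and `strip_punct` and the helpers `run_unfused`, `fuse`, and `run_fused` are hypothetical names introduced only for illustration. The unfused path makes one pass over the column per UDF and materializes an intermediate column each time, while the fused path composes the UDFs and applies them in a single pass.

```python
from typing import Callable, List, Sequence

ScalarUDF = Callable[[str], str]


def lower(s: str) -> str:
    """Hypothetical scalar UDF: lowercase a string."""
    return s.lower()


def strip_punct(s: str) -> str:
    """Hypothetical scalar UDF: keep only alphanumerics and whitespace."""
    return "".join(ch for ch in s if ch.isalnum() or ch.isspace())


def run_unfused(column: Sequence[str], udfs: Sequence[ScalarUDF]) -> List[str]:
    """Apply each UDF in its own pass, materializing an intermediate column."""
    result = list(column)
    for udf in udfs:
        result = [udf(value) for value in result]
    return result


def fuse(*udfs: ScalarUDF) -> ScalarUDF:
    """Compose several scalar UDFs into one function (applied left to right)."""
    def fused(value: str) -> str:
        for udf in udfs:
            value = udf(value)
        return value
    return fused


def run_fused(column: Sequence[str], udfs: Sequence[ScalarUDF]) -> List[str]:
    """Apply the composed UDF in a single pass, with no intermediate columns."""
    fused = fuse(*udfs)
    return [fused(value) for value in column]


rows = ["Hello, World!", "UDFs: slow?"]
unfused_result = run_unfused(rows, [lower, strip_punct])
fused_result = run_fused(rows, [lower, strip_punct])
```

Both paths return the same values; the difference lies only in how many passes and intermediate columns are needed. In a real engine the savings would also come from fewer crossings between the SQL and UDF execution environments, which this sketch does not model.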