We introduce a framework for automatically choosing data structures for efficient query processing. Our contributions are twofold. First, we introduce a novel low-level intermediate language that can express the algorithms behind various query processing paradigms such as classical joins, groupjoin, and in-database machine learning engines. This language is designed around the notion of dictionaries and allows for a more fine-grained choice of their low-level implementation. Second, the cost model for alternative implementations is automatically inferred by combining machine learning and program reasoning. The dictionary cost model is learned using a regression model trained over the profiling data of dictionary operations on a given architecture. Program reasoning helps to infer the expected cost of the whole query by combining the learned dictionary cost estimates.
Our experimental results show the effectiveness of the trained cost model on microbenchmarks. Furthermore, we show that the code generated by our framework outperforms or is competitive with state-of-the-art analytical query and in-database machine learning engines.
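The cost-model idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two candidate dictionary implementations (a hash-based `dict` and a sorted list probed with `bisect`), the feature `log2(n)`, the plain least-squares fit, and every function name below are assumptions chosen for illustration. The "program reasoning" step is reduced here to multiplying a predicted per-lookup cost by the number of probes.

```python
import bisect
import math
import time


def profile_lookup(build, lookup, sizes, repeats=2000):
    """Measure the average seconds per lookup for each dictionary size."""
    samples = []
    for n in sizes:
        structure = build(n)
        start = time.perf_counter()
        for i in range(repeats):
            lookup(structure, i % n)
        elapsed = time.perf_counter() - start
        samples.append((n, elapsed / repeats))
    return samples


def fit_linear(samples):
    """Ordinary least squares for cost ~ a + b * log2(n) (assumed model shape)."""
    xs = [math.log2(n) for n, _ in samples]
    ys = [cost for _, cost in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    a = mean_y - b * mean_x
    return a, b


def predict(model, n):
    """Predicted cost of one lookup in a dictionary holding n keys."""
    a, b = model
    return a + b * math.log2(n)


# Two hypothetical candidate implementations of the same dictionary interface.
def build_hash(n):
    return {k: k for k in range(n)}


def lookup_hash(d, k):
    return d[k]


def build_sorted(n):
    return list(range(n))


def lookup_sorted(keys, k):
    return keys[bisect.bisect_left(keys, k)]


def query_cost(model, build_size, probes):
    """Simplified whole-query estimate: `probes` lookups into one dictionary."""
    return probes * predict(model, build_size)


def choose(models, build_size, probes):
    """Return the candidate with the lowest estimated query cost, plus all costs."""
    costs = {name: query_cost(m, build_size, probes) for name, m in models.items()}
    return min(costs, key=costs.get), costs


if __name__ == "__main__":
    sizes = [2 ** 8, 2 ** 10, 2 ** 12, 2 ** 14]
    models = {
        "hash": fit_linear(profile_lookup(build_hash, lookup_hash, sizes)),
        "sorted": fit_linear(profile_lookup(build_sorted, lookup_sorted, sizes)),
    }
    best, costs = choose(models, build_size=10_000, probes=1_000_000)
    print(best, costs)
```

The measured timings, and therefore the chosen implementation, depend on the machine the profiling runs on, which mirrors the abstract's point that the dictionary cost model is learned for a given architecture.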
Tue 28 Feb (displayed time zone: Eastern Time, US & Canada)
13:30 - 15:10 | Session 5 -- Domain-Specific Compilation and Debugging | Main Conference at Montreal 1-2-3 | Chair(s): Teresa Johnson (Google)
13:30 | 26m | Talk | Compiling Functions onto Digital Microfluidics | Main Conference
13:56 | 26m | Talk | Fine-Tuning Data Structures for Query Processing | Main Conference | Amir Shaikhha (University of Edinburgh), Marios Kelepeshis (University of Oxford), Mahdi Ghorbani (University of Edinburgh)
14:22 | 26m | Talk | D2X: An eXtensible conteXtual Debugger for Modern DSLs | Main Conference | Ajay Brahmakshatriya (Massachusetts Institute of Technology), Saman Amarasinghe (Massachusetts Institute of Technology)