CGO 2023
Sat 25 February - Wed 1 March 2023 Montreal, Canada

We introduce a framework for automatically choosing data structures for efficient query processing. Our contributions are twofold. First, we introduce a novel low-level intermediate language that can express the algorithms behind various query processing paradigms such as classical joins, groupjoin, and in-database machine learning engines. This language is designed around the notion of dictionaries and allows for a more fine-grained choice of their low-level implementation. Second, the cost model for alternative implementations is automatically inferred by combining machine learning and program reasoning. The dictionary cost model is learned using a regression model trained over the profiling data of dictionary operations on a given architecture. Program reasoning helps to infer the expected cost of the whole query by combining the learned dictionary cost estimates.
Our experimental results show the effectiveness of the trained cost model on microbenchmarks. Furthermore, we show that the code generated by our framework outperforms or is competitive with state-of-the-art analytical query and in-database machine learning engines.
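
The two-step cost inference described above can be illustrated with a minimal sketch. It is not the authors' implementation: the regression form (a straight line in the dictionary size), the profiling numbers, and every name below (`fit_linear`, `PROFILE`, `lookup_cost`, `query_cost`, `choose_impl`) are assumptions made up for illustration. A per-lookup cost is fitted for each candidate dictionary implementation, and the per-operation estimates are then summed over a simplified query plan to pick the cheaper implementation.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical profiling data, NOT measurements from the paper:
# implementation -> (dictionary sizes, cost per lookup in ns).
PROFILE = {
    "hash_table": ([100, 1_000, 10_000], [30.0, 30.0, 30.0]),
    "sorted_array": ([100, 1_000, 10_000], [11.0, 20.0, 110.0]),
}

# Step 1: learn one regression model per dictionary implementation.
MODELS = {name: fit_linear(sizes, costs) for name, (sizes, costs) in PROFILE.items()}


def lookup_cost(impl, size):
    """Predicted cost of one lookup in a dictionary of the given size."""
    intercept, slope = MODELS[impl]
    return intercept + slope * size


def query_cost(impl, plan):
    """Step 2: combine per-operation estimates over the whole query.

    `plan` is a simplified stand-in for program reasoning: a list of
    (number of lookups, dictionary size) pairs.
    """
    return sum(count * lookup_cost(impl, size) for count, size in plan)


def choose_impl(plan):
    """Pick the implementation with the lowest estimated query cost."""
    return min(MODELS, key=lambda impl: query_cost(impl, plan))
```

With these made-up numbers, `choose_impl([(1000, 500)])` favours the sorted array for a small dictionary, while `choose_impl([(1000, 10_000)])` favours the hash table for a large one. A real cost model would cover more operations (insertion, iteration) and richer features than the size alone.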

Tue 28 Feb

Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:10
Session 5 -- Domain-Specific Compilation and Debugging
Main Conference at Montreal 1-2-3
Chair(s): Teresa Johnson Google
13:30
26m
Talk
Compiling Functions onto Digital Microfluidics
Main Conference
Tyson Loveless Intel Corporation, Philip Brisk University of California
13:56
26m
Talk
Fine-Tuning Data Structures for Query Processing
Main Conference
Amir Shaikhha University of Edinburgh, Marios Kelepeshis University of Oxford, Mahdi Ghorbani University of Edinburgh
14:22
26m
Talk
D2X: An eXtensible conteXtual Debugger for Modern DSLs
Main Conference
Ajay Brahmakshatriya Massachusetts Institute of Technology, Saman Amarasinghe Massachusetts Institute of Technology