This paper presents Arvada, an algorithm for learning context-free grammars from a set of positive examples and a Boolean-valued oracle. Arvada learns a context-free grammar by building parse trees from the positive examples. Starting from initially flat trees, Arvada adds structure to these trees with a key operation: it \emph{bubbles} sequences of sibling nodes in the trees into a new node, adding a layer of indirection to the tree. Bubbling operations enable recursive generalization in the learned grammar. We evaluate Arvada against GLADE and find that it achieves average increases of 4.98$\times$ in recall and 3.13$\times$ in F1 score, while incurring only a 1.27$\times$ slowdown and requiring only 0.87$\times$ as many calls to the oracle. Arvada shows a particularly marked improvement over GLADE on grammars with highly recursive structure, such as those of programming languages.
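The bubbling operation described in the abstract can be illustrated with a minimal Python sketch. This is not Arvada's implementation: the names `Node`, `flat_tree`, `bubble`, `rules`, and `text_of`, and the labels `t0` and `t1`, are hypothetical and chosen only for illustration. The oracle checks that decide which bubbles to keep are omitted entirely.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    """A parse-tree node; a node with no children is treated as a terminal."""
    label: str
    children: List["Node"] = field(default_factory=list)


def flat_tree(example: str, root_label: str = "t0") -> Node:
    """Build the initial flat tree: one root with one leaf per character."""
    return Node(root_label, [Node(ch) for ch in example])


def bubble(tree: Node, start: int, end: int, new_label: str) -> Node:
    """Group the root's children [start, end) under a new node.

    Returns a new root; the original tree is left unchanged.
    """
    if not (0 <= start < end <= len(tree.children)):
        raise ValueError("invalid sibling range")
    grouped = Node(new_label, tree.children[start:end])
    return Node(tree.label, tree.children[:start] + [grouped] + tree.children[end:])


def rules(tree: Node) -> Set[Tuple[str, Tuple[str, ...]]]:
    """Read grammar rules off a tree: each internal node gives label -> child labels."""
    found = set()
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.children:
            found.add((node.label, tuple(c.label for c in node.children)))
            stack.extend(node.children)
    return found


def text_of(tree: Node) -> str:
    """Concatenate the leaves from left to right."""
    if not tree.children:
        return tree.label
    return "".join(text_of(c) for c in tree.children)


tree = flat_tree("(1+2)")
bubbled = bubble(tree, 1, 4, "t1")
```

In this sketch, bubbling the siblings `1`, `+`, `2` turns the single flat rule `t0 -> ( 1 + 2 )` into two rules, `t0 -> ( t1 )` and `t1 -> 1 + 2`, while the string the tree derives stays `(1+2)`. The extra nonterminal is the layer of indirection that a later generalization step could build on.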
Wed 17 Nov (displayed time zone: Hobart)
Session: 12:00 - 13:00
12:00 | 20 min | Talk | On Multi-Modal Learning of Editing Source Code | Research Papers
12:20 | 20 min | Talk | Learning Highly Recursive Input Grammars | Research Papers | Neil Kulkarni (University of California, Berkeley), Caroline Lemieux (Microsoft Research), Koushik Sen (University of California at Berkeley)
12:40 | 10 min | Talk | Learning GraphQL Query Cost | Industry Showcase | Georgios Mavroudeas (Rensselaer Polytechnic Institute), Guillaume Baudart (Inria; ENS; PSL University), Alan Cha (IBM Research, USA), Martin Hirzel (IBM Research), Jim A. Laredo (IBM Research), Malik Magdon-Ismail (Rensselaer Polytechnic Institute), Louis Mandel (IBM Research, USA), Erik Wittern (IBM Research)