ASE 2021
Sun 14 - Sat 20 November 2021 Australia
Wed 17 Nov 2021 12:20 - 12:40

This paper presents Arvada, an algorithm for learning context-free grammars from a set of positive examples and a Boolean-valued oracle. Arvada learns a context-free grammar by building parse trees from the positive examples. Starting from initially flat trees, Arvada adds structure to these trees with a key operation: it "bubbles" sequences of sibling nodes in the trees into a new node, adding a layer of indirection to the tree. Bubbling operations enable recursive generalization in the learned grammar. We evaluate Arvada against GLADE and find that it achieves average increases of 4.98× in recall and 3.13× in F1 score, while incurring only a 1.27× slowdown and requiring only 0.87× as many calls to the oracle. Arvada shows a particularly marked improvement over GLADE on grammars with highly recursive structure, such as those of programming languages.
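
To make the bubbling operation concrete, here is a minimal Python sketch of the tree transformation alone. It is an illustration under assumptions, not the authors' implementation: the names `Node`, `flat_tree`, `bubble`, and `yield_of` are hypothetical, the flat tree is assumed to hold one leaf per character, and the oracle queries that decide whether a bubble is kept are not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical tree node: a terminal holds text, a nonterminal holds children.
    label: str
    children: List["Node"] = field(default_factory=list)
    terminal: bool = False


def flat_tree(example: str, root: str = "t0") -> Node:
    """Build a flat tree: one root whose children are single-character leaves.

    One leaf per character is an assumption made for this sketch.
    """
    return Node(root, [Node(ch, terminal=True) for ch in example])


def bubble(parent: Node, start: int, end: int, new_label: str) -> Node:
    """Group parent.children[start:end] under a new node, in place.

    This adds one layer of indirection; the string spelled by the tree
    does not change.
    """
    if not (0 <= start < end <= len(parent.children)):
        raise ValueError("invalid sibling range")
    new_node = Node(new_label, parent.children[start:end])
    parent.children[start:end] = [new_node]
    return new_node


def yield_of(node: Node) -> str:
    """Concatenate the terminal leaves under a node, left to right."""
    if node.terminal:
        return node.label
    return "".join(yield_of(child) for child in node.children)


if __name__ == "__main__":
    tree = flat_tree("(1+2)")
    inner = bubble(tree, 1, 4, "t1")  # group the siblings "1", "+", "2"
    print([child.label for child in tree.children])
    print(yield_of(inner), yield_of(tree))
```

In this sketch, bubbling the three inner siblings of `(1+2)` leaves the root with the children `(`, `t1`, `)`. A learner could then test, through the oracle, whether the new node can stand in for another node, which is where recursive structure would come from; that step is deliberately left out here.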


