Grammar-Based Testing for Little Languages: An Experience Report with Student Compilers (SLE 2020)

Who

Phillip van Heerden, Moeketsi Raselimo, Konstantinos (Kostis) Sagonas, Bernd Fischer

Track

SLE 2020

Time Zone

The program is currently displayed in (GMT-06:00) Central Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 16 Nov 2020 13:20 - 13:40 at SPLASH-III - Chair(s): Vadim Zaytsev
Tue 17 Nov 2020 01:20 - 01:40 at SPLASH-III - Chair(s): Vadim Zaytsev

Abstract

We report on our experience in using various grammar-based test suite
generation methods to test 61 single-pass compilers that undergraduate
students submitted for the practical project of a computer
architecture course.

We show that
(1) all test suites constructed systematically following different
grammar coverage criteria fall far behind the instructor's test suite
in achieved code coverage, in the number of triggered semantic errors,
and in detected failures and crashes;
(2) a medium-sized positive random test suite triggers more crashes
than the instructor's test suite, but achieves lower code coverage and
triggers fewer non-crashing errors;
and
(3) a combination of the systematic and random test suites performs
as well as or better than the instructor's test suite in all aspects and
identifies errors or crashes in every single submission.

We then develop a light-weight extension of the basic grammar-based testing
framework to capture contextual constraints, by encoding scoping and
typing information as "semantic mark-up tokens" in the grammar rules.
These mark-up tokens are interpreted by a small generic core engine
when the tests are rendered, and tests with a
syntactic structure that cannot be completed into a valid program by
choosing appropriate identifiers are discarded.
We formalize individual error models by overwriting individual mark-up tokens,
and generate tests that are guaranteed to break specific contextual
properties of the language. We show that a fully automatically
generated random test suite with 15 error models achieves roughly the
same coverage as the instructor's test suite, and outperforms it in the
number of triggered semantic errors and detected failures and crashes.
Moreover, all failing tests indicate real errors, and we have
detected errors even in the instructor's reference implementation.
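
The mark-up idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy grammar, the token names `@decl` and `@use`, and every function name below are hypothetical and chosen only for illustration. The sketch derives random sentences from a grammar, interprets the mark-up tokens while rendering, discards derivations that cannot be completed with a declared identifier, and models one error ("use of an undeclared variable") by overriding a single token handler.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives.
# Symbols starting with "@" are semantic mark-up tokens (names invented here).
# The first alternative of every rule is non-recursive, so it is a safe fallback.
GRAMMAR = {
    "program": [["stmt"], ["stmt", "program"]],
    "stmt": [["var", "@decl", ";"], ["@use", "=", "expr", ";"]],
    "expr": [["0"], ["@use"], ["expr", "+", "expr"]],
}


class Discard(Exception):
    """Raised when a derivation cannot be completed into a valid program."""


def derive(symbol, rng, depth=0, max_depth=6):
    """Expand a symbol into a flat list of terminals and mark-up tokens."""
    if symbol not in GRAMMAR:
        return [symbol]
    alternatives = GRAMMAR[symbol]
    # Past the depth limit, always take the first alternative to force termination.
    alt = alternatives[0] if depth >= max_depth else rng.choice(alternatives)
    out = []
    for s in alt:
        out.extend(derive(s, rng, depth + 1, max_depth))
    return out


def decl(scope, rng):
    """Default handler for @decl: introduce a fresh identifier."""
    name = f"v{len(scope)}"
    scope.append(name)
    return name


def use(scope, rng):
    """Default handler for @use: pick an identifier that is already declared."""
    if not scope:
        raise Discard
    return rng.choice(scope)


def use_undeclared(scope, rng):
    """Error model: replace @use by an identifier that is never declared."""
    return f"undeclared{len(scope)}"


DEFAULT_HANDLERS = {"@decl": decl, "@use": use}


def render(tokens, rng, overrides=None):
    """Interpret mark-up tokens; return program text, or None if discarded."""
    handlers = {**DEFAULT_HANDLERS, **(overrides or {})}
    scope, words = [], []
    try:
        for t in tokens:
            words.append(handlers[t](scope, rng) if t in handlers else t)
    except Discard:
        return None
    return " ".join(words)


def generate(n, seed=0, overrides=None, must_contain=None):
    """Collect n rendered tests, skipping discarded or unsuitable derivations."""
    rng = random.Random(seed)
    tests = []
    while len(tests) < n:
        tokens = derive("program", rng)
        if must_contain and must_contain not in tokens:
            continue  # an error model needs at least one token to break
        text = render(tokens, rng, overrides)
        if text is not None:
            tests.append(text)
    return tests
```

In this sketch, `generate(5)` yields positive tests in which every identifier is declared before use, while `generate(5, overrides={"@use": use_undeclared}, must_contain="@use")` yields negative tests that each contain at least one undeclared identifier. Keeping the error model as a handler override leaves the grammar itself unchanged, which mirrors the "overwriting individual mark-up tokens" idea only loosely.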

Link to Publication

https://dl.acm.org/doi/pdf/10.1145/3426425.3426946

DOI

https://doi.org/10.1145/3426425.3426946

Phillip van Heerden

Stellenbosch University, South Africa

Moeketsi Raselimo

Stellenbosch University, South Africa

Konstantinos (Kostis) Sagonas

Uppsala University, Sweden

Bernd Fischer

Stellenbosch University, South Africa
