Reproducible experiments are an important pillar of well-founded research. Publicly available benchmarks that are representative of real-world applications are an important step towards that goal; they allow us to measure a tool's results in terms of precision, recall, and overall accuracy. Such a benchmark is more than a corpus of programs: it needs labelled data that can serve as ground truth when measuring precision and recall.
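To make the role of ground truth concrete, here is a minimal sketch (using hypothetical program locations and a hypothetical tool's findings, not data from any real benchmark) of how labelled benchmark data lets us compute precision, recall, and accuracy:

```python
# Minimal sketch: scoring a tool's findings against a benchmark's
# ground-truth labels. All locations and findings below are hypothetical.

# Ground truth: which program locations in the benchmark actually contain a bug.
ground_truth = {"loc1": True, "loc2": False, "loc3": True, "loc4": False}

# Findings reported by the tool under evaluation.
reported = {"loc1", "loc4"}

tp = sum(1 for loc, is_bug in ground_truth.items() if is_bug and loc in reported)
fp = sum(1 for loc, is_bug in ground_truth.items() if not is_bug and loc in reported)
fn = sum(1 for loc, is_bug in ground_truth.items() if is_bug and loc not in reported)
tn = sum(1 for loc, is_bug in ground_truth.items() if not is_bug and loc not in reported)

precision = tp / (tp + fp)           # fraction of reported findings that are real bugs
recall = tp / (tp + fn)              # fraction of real bugs that were reported
accuracy = (tp + tn) / len(ground_truth)

print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Without the labelled ground truth, only the set of reported findings is available, and none of these measures can be computed.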
With the rise of Artifact Evaluation Committees at most PL/SE conferences, reproducibility studies are making their way into the calls for papers of top venues such as ECOOP and ISSTA. In some domains there are established benchmarks used by the community; in others, the lack of a benchmark prevents researchers from measuring the true value of their newly developed techniques.
BenchWork aims to provide a platform for researchers and practitioners to share their experiences and thoughts, discuss key lessons from the PL and SE communities, improve the sets of benchmarks that are already available, and, in some cases, start or continue the discussion on developing new benchmarks and their role in research and industry.
Talks
Call for Talks
We welcome contributions in the form of talk abstracts within (but not limited to) the following topics:
- Experiences with benchmarking in the area of program analysis (e.g., finding bugs, measuring points-to sets)
- Experiences with benchmarking of virtual machines (e.g., measuring memory management overhead)
- Experiences with benchmarking in the area of software engineering (e.g., clone detection, testing techniques)
- Infrastructure for maintaining a benchmark over time, across different versions of the relevant programs
- Metrics that are valuable in the context of incomplete programs
- Support for dynamic analysis, where the benchmark programs need to be run
- Automating the creation of benchmarks
- Licensing issues
- What types of programs should be included in program-analysis benchmarks?
- What type of analysis do you perform?
- What build systems does your tool support?
- What program-analysis benchmarks do you typically use? What are their pros and cons?
- What are the useful metrics to consider when creating program-analysis benchmarks?
- How can we handle incomplete code in benchmarks?
- How can program-analysis benchmarks provide good support for dynamic analyses?
- How can we automate the creation of program-analysis benchmarks?
Wed 18 Jul (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
11:00 - 12:30
- 11:00 (10 min): Opening Remarks
- 11:10 (30 min): Real World Benchmarks for JavaScript (file attached)
- 11:40 (20 min): In Search of Accurate Benchmarking. Edd Barrett, Sarah Mount, Laurence Tratt (King's College London) (file attached)
- 12:00 (30 min): AndroZoo: Lessons Learnt After 2 Years of Running a Large Android App Collection. Kevin Allix (University of Luxembourg)

14:00 - 15:30
- 14:00 (30 min): Benchmarking WebKit. Saam Barati (Apple) (file attached)
- 14:30 (20 min): Analyzing Duplication in JavaScript. Petr Maj (Czech Technical University), Celeste Hollenbeck (Northeastern University, USA), Shabbir Hussain (Northeastern University), Jan Vitek (Northeastern University)
- 14:50 (20 min): Building a Node.js Benchmark: Initial Steps. Petr Maj (Czech Technical University), François Gauthier (Oracle Labs), Celeste Hollenbeck (Northeastern University, USA), Jan Vitek (Northeastern University), Cristina Cifuentes (Oracle Labs) (file attached)
- 15:10 (20 min): A Micro-Benchmark for Dynamic Program Behaviour. Li Sui, Jens Dietrich, Michael Emery, Amjed Tahir, Shawn Rasheed (Massey University, New Zealand)

16:00 - 17:40
- 16:00 (30 min): InspectorClone: Evaluating Precision of Clone Detection Tools
- 16:30 (20 min): Towards a Data-Curation Platform for Code-Centric Research. Ben Hermann (University of Paderborn), Lisa Nguyen Quang Do (Paderborn University), Eric Bodden (Heinz Nixdorf Institut, Paderborn University and Fraunhofer IEM) (file attached)
- 16:50 (20 min): The Architecture Independent Workload Characterization. Beau Johnston (Australian National University) (file attached)
- 17:10 (30 min): Performance Monitoring in Eclipse OpenJ9. Andrew Craik (IBM)
Student Travel Support
BenchWork has limited funding to support travel, accommodation, or registration for students who are pursuing studies in programming languages or software engineering and want to participate in the workshop. Funding is available only to students who have not received other travel support.
To apply, please email Karim Ali with your name and affiliation, your supervisor's name, the topic of your Master's or PhD studies, the type of funding requested (travel, accommodation, or registration), and the cost. The application deadline is July 1.