
Reproducible experiments are an important pillar of well-founded research. Publicly available benchmarks that are representative of real-world applications are an important step towards that goal: they allow us to measure the results of a tool in terms of its precision, recall and overall accuracy. Such a benchmark is different from a corpus of programs, because a benchmark needs labelled data that can serve as ground truth when measuring precision and recall.
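As a minimal sketch of why labelled data matters, the snippet below computes precision and recall for a tool's findings against a ground-truth set. The function name `precision_recall` and the sample file locations are hypothetical and purely illustrative; they do not come from any particular benchmark. Accuracy is left out because it also needs the true negatives, which depend on how a given benchmark enumerates non-findings.

```python
def precision_recall(reported, ground_truth):
    """Compare a tool's reported findings against labelled ground truth.

    Both arguments are iterables of hashable finding identifiers.
    Returns (precision, recall); an empty denominator yields 0.0.
    """
    reported, ground_truth = set(reported), set(ground_truth)
    true_positives = len(reported & ground_truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical labels: locations known to contain a real bug.
ground_truth = {"Foo.java:12", "Bar.java:40", "Baz.java:7"}
# Hypothetical tool output: two real bugs and two false alarms.
reported = {"Foo.java:12", "Bar.java:40", "Qux.java:3", "Quux.java:99"}

precision, recall = precision_recall(reported, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Without the labelled `ground_truth` set, only the raw number of reported findings would be known, which is the situation a plain corpus of programs leaves us in.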

With the rise of Artifact Evaluation Committees at most PL/SE conferences, reproducibility studies are making their way into the CFPs of top conferences such as ECOOP and ISSTA. In some domains, a community has established benchmarks; in others, the lack of a benchmark prevents researchers from measuring the true value of a newly developed technique.

BenchWork aims to provide a platform where researchers and practitioners can share their experiences and thoughts and discuss key lessons from the PL and SE communities. The goal is to improve the sets of benchmarks that are available or, in some cases, to start or continue the discussion on developing new benchmarks and their role in research and industry.

Supported By

Oracle Labs

Talks

  • A Micro-Benchmark for Dynamic Program Behaviour
  • Analyzing Duplication in JavaScript
  • AndroZoo: Lessons Learnt After 2 Years of Running a Large Android App Collection
  • Benchmarking WebKit (file attached)
  • Building a Node.js Benchmark: Initial Steps (file attached)
  • In Search of Accurate Benchmarking (file attached)
  • InspectorClone: Evaluating Precision of Clone Detection Tools
  • Opening Remarks
  • Performance Monitoring in Eclipse OpenJ9
  • Real World Benchmarks for JavaScript (file attached)
  • The Architecture Independent Workload Characterization (file attached)
  • Towards a Data-Curation Platform for Code-Centric Research (file attached)

Call for Talks

We welcome contributions in the form of talk abstracts within (but not limited to) the following topics:

  • Experiences with benchmarking in the areas of program-analysis (e.g., finding bugs, measuring points-to sets)
  • Experiences with benchmarking of virtual machines (e.g., measuring memory management overhead)
  • Experiences with benchmarking in the areas of software engineering (e.g., clone detection, testing techniques)
  • Infrastructure related to support of a benchmark over time, across different versions of the relevant programs
  • Metrics that are valuable in the context of incomplete programs
  • Support for dynamic analysis, where the benchmark programs need to be run
  • Automation of creation of benchmarks
  • Licensing issues
  • What types of programs should be included in program-analysis benchmarks?
  • What type of analysis do you perform?
  • What build systems does your tool support?
  • What program-analysis benchmarks do you typically use? What are their pros and cons?
  • What are the useful metrics to consider when creating program-analysis benchmarks?
  • How can we handle incomplete code in benchmarks?
  • How can program-analysis benchmarks provide good support for dynamic analyses?
  • How can we automate the creation of program-analysis benchmarks?

Wed 18 Jul

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30: Real-World Benchmarking (BenchWork at Hanoi)

  • 11:00 (10m): Opening Remarks. Karim Ali (University of Alberta), Cristina Cifuentes (Oracle Labs)
  • 11:10 (30m): Real World Benchmarks for JavaScript (file attached)
  • 11:40 (20m): In Search of Accurate Benchmarking. Edd Barrett (King's College London), Sarah Mount (King's College London), Laurence Tratt (King's College London) (file attached)
  • 12:00 (30m): AndroZoo: Lessons Learnt After 2 Years of Running a Large Android App Collection. Kevin Allix (University of Luxembourg)

14:00 - 15:30: JavaScript & Dynamic Behaviour (BenchWork at Hanoi)

  • 14:00 (30m): Benchmarking WebKit (file attached)
  • 14:30 (20m): Analyzing Duplication in JavaScript. Petr Maj (Czech Technical University), Celeste Hollenbeck (Northeastern University, USA), Shabbir Hussain (Northeastern University), Jan Vitek (Northeastern University)
  • 14:50 (20m): Building a Node.js Benchmark: Initial Steps. Petr Maj (Czech Technical University), François Gauthier (Oracle Labs), Celeste Hollenbeck (Northeastern University, USA), Jan Vitek (Northeastern University), Cristina Cifuentes (Oracle Labs) (file attached)
  • 15:10 (20m): A Micro-Benchmark for Dynamic Program Behaviour. Li Sui (Massey University, New Zealand), Jens Dietrich (Massey University), Michael Emery (Massey University), Amjed Tahir (Massey University), Shawn Rasheed (Massey University)

16:00 - 17:40: Software Engineering & Compilers (BenchWork at Hanoi)

  • 16:00 (30m): InspectorClone: Evaluating Precision of Clone Detection Tools
  • 16:30 (20m): Towards a Data-Curation Platform for Code-Centric Research. Ben Hermann (University of Paderborn), Lisa Nguyen Quang Do (Paderborn University), Eric Bodden (Heinz Nixdorf Institut, Paderborn University and Fraunhofer IEM) (file attached)
  • 16:50 (20m): The Architecture Independent Workload Characterization. Beau Johnston (Australian National University) (file attached)
  • 17:10 (30m): Performance Monitoring in Eclipse OpenJ9

BenchWork has some limited funding to support travel, accommodation, or registration for students who are undertaking studies in programming languages or software engineering and want to participate in the workshop. Funding will be available to students who have not received additional travel support.

To apply, please email Karim Ali your name and affiliation, your supervisor's name, the topic of study for your Master’s or PhD, the type of funding requested (travel, accommodation, registration) and the cost. The application deadline is July 1st.