Overview

**See deadlines on Important Dates.**

Authors of accepted CC 2023 papers are invited to formally submit their supporting materials to the Artifact Evaluation (AE) process. The Artifact Evaluation Committee attempts to reproduce the experiments (in broad strokes) and assesses whether the submitted artifact supports the claims made in the paper. Submission is voluntary and does not influence the final decision regarding paper/poster acceptance.

We invite every author of an accepted CC paper/poster to consider submitting an artifact. It is good for the community as a whole. At CC, we follow ACM’s artifact reviewing and badging policy. ACM describes a research artifact as follows:

“By “artifact” we mean a digital object that was either created by the authors to be used as part of the study or generated by the experiment itself. For example, artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results.”

Submission Site

The submission site is located at https://cc23ae.hotcrp.com/.

Supporters

The artifact evaluation process is single-blinded. Hence, we kindly request the authors to disable any form of analytics, tracking and logging on the sites and services used to share the artifact with the reviewers. Each submitted artifact is evaluated by at least one member of the artifact evaluation committee.

During the process, authors and evaluators are allowed to communicate anonymously with each other to overcome technical difficulties. Ideally, all submitted artifacts will successfully pass the artifact evaluation.

Evaluators are asked to test the functionality of, and the claims associated with, the artifacts of accepted papers. We will follow the ACM badge award criteria.

ACM recommends awarding three different types of badges to communicate how the artifact has been evaluated. A single paper can receive up to three badges — one badge of each type.

At CC the artifact evaluation committee will award two types of badges:

Red badge (Artifact Usable): Depending on the usability and robustness of the artifact, either the Light-red badge or the Dark-red badge will be awarded. The latter badge will be granted when the artifact far exceeds usability expectations.
Blue badge (Results Reproduced): Awarded if the main results of the paper can be reproduced with the author-provided artifact.

The Green Artifact Available Badge does not require a formal audit and is therefore awarded directly by the publisher, provided the authors supply a link to the deposited artifact.

Note that variation in empirical and numerical results is tolerated. In fact, it is often unavoidable in computer systems research; see “how to report and compare empirical results?” in the AE FAQ at https://www.ctuning.org.

Components of the Artifact

The submission version of your paper/poster.
A README file (PDF or plaintext format) that explains your artifact (details below).
The artifact itself, packaged as a single archive file. Artifacts smaller than 600 MB can be uploaded directly to the HotCRP submission site; for larger archives, please provide a URL pointing to the artifact. The URL must protect the anonymity of the reviewers. Please use a widely available compressed archive format such as ZIP (.zip), tar and gzip (.tgz), or tar and bzip2 (.tbz2), and ensure the file name has the suffix indicating its format. Those seeking the “Available” badge must additionally follow the instructions recommended by ACM on uploading the archive to a publicly available, immutable location.
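As an illustration of the packaging step, the following is a minimal Python sketch, not an official CC tool. It packs a directory into a single .tgz archive and reports whether the result is small enough for direct upload. The function name `package_artifact` is hypothetical, and reading “600MB” as 600 × 10^6 bytes is an assumption.

```python
import tarfile
from pathlib import Path

# Direct-upload limit from the guidelines, read here as 600 * 10**6 bytes
# (an assumption; the site may count megabytes differently).
UPLOAD_LIMIT_BYTES = 600 * 10**6


def package_artifact(src_dir: str, archive_path: str) -> bool:
    """Pack src_dir into a gzip-compressed tar archive at archive_path.

    Returns True if the archive is below the direct-upload limit and
    False if it should be shared through a URL instead.
    The archive should be written outside src_dir so it does not include itself.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(src_dir)
    if not archive_path.endswith(".tgz"):
        raise ValueError("archive name should end in .tgz to indicate its format")

    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps one top-level folder inside the archive.
        tar.add(src, arcname=src.name)

    return Path(archive_path).stat().st_size < UPLOAD_LIMIT_BYTES
```

For example, `package_artifact("my_artifact", "my_artifact.tgz")` would produce an archive that unpacks into a single `my_artifact/` folder; both names are placeholders.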

General Guidelines

First and foremost, please simplify the entire artifact evaluation process as much as possible. Installing the artifact and reproducing the results of the paper are two separate steps. For the installation step, a “push-button approach” is ideal: a virtual machine or Docker image, or scripts that automate the installation of all dependencies together with the compilation and build commands for an operating system (often Linux). Admin or superuser privileges should not be necessary. Please try to include all the datasets and benchmarks used. If manual scripting and installation are necessary, we will most likely not ask the evaluators to spend excessive time on this step. Consequently, it is in the authors’ best interest to provide easy instructions. Debugging the installation of the artifact is not the evaluators’ job.
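One way to make an installation script fail early and clearly, rather than midway through a build, is a preflight check for required tools. The sketch below is only an illustration: the helper name `missing_tools` and the tool names in the usage note are hypothetical, and a container image that already bundles everything would make such a check unnecessary.

```python
import shutil


def missing_tools(required):
    """Return the entries of `required` that are not found on PATH, in order."""
    return [tool for tool in required if shutil.which(tool) is None]


def preflight(required):
    """Raise a readable error listing every missing tool at once."""
    missing = missing_tools(required)
    if missing:
        raise RuntimeError("missing required tools: " + ", ".join(missing))
```

An install script could call `preflight(["cmake", "clang"])` as its first step, with the list replaced by whatever the artifact actually needs, so that an evaluator sees one message naming all missing prerequisites.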

If you wish to apply for the “Results Reproduced” badge (Blue Badge) and the experimental test-bed requires specialized hardware, please consider providing evaluators with access to a non-tracked test-bed. Unfortunately, we cannot promise an attempt to reproduce the results if the necessary hardware is inaccessible or unavailable to our evaluators.

Submitting your Artifact

Artifacts will be submitted in two phases. First, a Preliminary Artifact Description should be submitted (see deadline). Five days to a week later, the complete artifact must be submitted.

Preliminary Artifact Description (PAD):

Given the short time for evaluating artifacts, CC’23 asks interested authors to submit the Artifact Description a week before the full artifact. This step is equivalent to registering your artifact and helps us determine whether we have the right set of evaluators and resources. In extreme cases, we may also decline to evaluate an artifact if the required system is too complicated or inaccessible to the evaluating team. The PAD should briefly describe the hardware, software, prerequisites and any additional information that may be used to select the artifact evaluators and the test-bed. The artifact description can be revised and improved when the full artifact is submitted.

Minimum information to include in the PAD:

Brief description of the artifact: One or two paragraphs describing the artifact. Please don’t rephrase the paper’s abstract. The description should give a sense of the type of artifact and of the background evaluators need, for example, an artifact implemented as an LLVM pass that performs data race detection. List similar tools (this helps assign artifacts to evaluators) and describe the overall behavior of the artifact (inputs and outputs). Include any other information that helps evaluators quickly understand what the artifact does.
Badge you are requesting: Red (evaluated: functional/reusable) and/or Blue (results reproduced).
Hardware prerequisites: Laptop, regular desktop, workstation, compute server, GPU(s), FPGAs or something more specialized. Please provide the approximate number of compute cores and the amount of memory (DRAM) required (e.g., 32 GB).
Software prerequisites: Operating system (Linux, Windows, Mac) and version (e.g., Ubuntu 20.04 LTS); set of compiler and runtime versions (e.g., GCC 8.1, OpenMP 5.1, CUDA 9.0, etc.). Please don’t assume that the evaluators will have CMake, Python or any other tool installed. Your complete artifact should provide these.
Description of your expectations: Approximate time to install and run/use the artifact and (if requested) to reproduce the results.

NOTE: The AE committee will not invest a substantial amount of time or effort in debugging the installation and use process. The information provided will be for internal use only.

Submitting the Full Artifact:

As discussed above, only two types of badges will be awarded in this evaluation: one of the two reds (Artifact Evaluated) and/or the blue (Results Reproduced). In the submission form you will state the badges you are applying for: red, blue, or both. Note that the blue badge also requires installing and using the artifact, so the installation and execution of experiments must be highly streamlined. Please don’t apply for the blue badge if you are only interested in demonstrating that the artifact is functional (light-red badge) or reusable (dark-red badge). If you select only the red badge, the evaluators will make no attempt to reproduce the results.

For the red badges, your artifact must be: Documented, Consistent, Complete and Exercisable. For more details, please see the ACM Artifact Description.

If you are applying for the blue badge, include scripts to execute, gather, and plot the main results of your paper. The README must include a statement or paragraph describing the criteria for interpreting the results and deeming them similar enough. Authors may include, in the Artifact Description or in a separate PDF, an excerpt of the paper with the plot, graph, or table to be reproduced.
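A “similar enough” criterion can be made concrete by shipping a small checking script with the artifact. The following Python sketch assumes the criterion is a relative tolerance around each value reported in the paper. The 10% default, the function name `compare_results`, and the benchmark names are hypothetical; authors should state the criterion that actually fits their experiments.

```python
def compare_results(reference, measured, rel_tol=0.10):
    """Check each reference value against the evaluator's measurement.

    reference and measured map benchmark names to numbers (e.g. speedups).
    A benchmark passes if |measured - reference| <= rel_tol * |reference|.
    Benchmarks missing from `measured` are reported as failing.
    """
    verdict = {}
    for name, ref_value in reference.items():
        if name not in measured:
            verdict[name] = False
            continue
        deviation = abs(measured[name] - ref_value)
        verdict[name] = deviation <= rel_tol * abs(ref_value)
    return verdict


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    paper = {"bench_a": 2.0, "bench_b": 1.5}
    rerun = {"bench_a": 2.1, "bench_b": 1.0}
    for name, ok in compare_results(paper, rerun).items():
        print(name, "within tolerance" if ok else "outside tolerance")
```

A relative tolerance is only one option; for noisy measurements, comparing trends or rankings across configurations may be a more meaningful criterion than matching absolute numbers.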

Other suggestions:

Clearly identify the hardware and software prerequisites: number of cores, memory, operating system, compiler used (with version), Python version, etc. This also helps us select evaluators. We may decline to evaluate an artifact if the prerequisites are too complicated to satisfy.
Include scripts to perform all the required tasks.
Consider using a Docker or VM image to simplify all steps. We recommend providing the final artifact via Zenodo, figshare, Dryad or a similar archival site.
Describe the steps to follow in the README file.
If applying for the blue badge, succinctly describe the result being reproduced together with reasonable expectations on possible variations of the results. Such variations can arise when the evaluators use a slightly different test-bed from the one recommended by the authors.
At least one author should be designated as the point-of-contact (PoC) for possible clarifications. We expect any required clarification to be resolved within 24 hours of the request. All communication will be done through HotCRP.
