Call for Artifacts
Authors of accepted CGO 2023 papers are invited to formally submit their supporting materials to the Artifact Evaluation (AE) process. The Artifact Evaluation Committee attempts to reproduce the experiments (at least the main ones) and assesses whether the submitted artifacts support the claims made in the paper. Submission is voluntary and does not influence the final decision regarding paper acceptance.
We invite every author of an accepted CGO paper to consider submitting an artifact. At CGO we follow ACM’s artifact reviewing and badging policy. ACM describes a research artifact as follows:
By “artifact” we mean a digital object that was either created by the authors to be used as part of the study or generated by the experiment itself. For example, artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results.
We also ask authors to remain available throughout the artifact evaluation period for any technical clarifications. Good luck with your submissions!
Submission
Submission Site
Artifacts can be submitted to: https://cgo23ae.hotcrp.com/
Submission Requirements
- Authors must submit the paper that has been accepted for publication at CGO, extended with an artifact appendix providing a link to and a description of the artifact.
- The Artifact Appendix is limited to two pages.
- The Artifact Appendix must be placed before the References section.
- The artifact evaluation is single-blind. Please feel free to include author details.
We recommend using the AE appendix template from ctuning.org, where you can also find a detailed description of what information to provide.
For the artifact itself, we encourage the use of container or VM technologies such as Docker, Singularity, VirtualBox, or Vagrant to package the artifact as a single stand-alone container or VM that provides all required dependencies.
If you have an unusual experimental setup that requires specific hardware (e.g., custom hardware, oscilloscopes for measurements, …) or proprietary software, please contact the Artifact Evaluation Chairs before submission.
More tips for preparing a submission are available on the ctuning website.
Evaluation Process
Each submitted artifact is evaluated by at least two members of the artifact evaluation committee.
During the process, authors and evaluators may communicate anonymously with each other to overcome technical difficulties.
Ideally, all submitted artifacts will successfully pass artifact evaluation.
The evaluators are asked to evaluate the artifact based on the following criteria, which are defined by ACM.
Is the artifact functional?
- Package complete? Are all components relevant to the evaluation included in the package?
- Well documented? Is the documentation enough to understand, install, and evaluate the artifact?
- Exercisable? Does it include scripts and/or software to perform appropriate experiments and generate results?
- Consistent? Are the artifacts relevant to the associated paper, and do they contribute in some inherent way to the generation of its main results?
The artifacts associated with the paper will receive an “Artifacts Evaluated - Functional” badge only if they are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation.
Is the artifact customizable and reusable?
- Can this artifact and experimental workflow be easily reused and customized?
For example, can it be used on a different platform, with different benchmarks, data sets, compilers, tools, under different conditions and parameters, etc.?
The artifacts associated with the paper will receive an “Artifacts Evaluated - Reusable” badge only if they are of a quality that significantly exceeds minimal functionality. That is, they have all the qualities of the Artifacts Evaluated - Functional level, but, in addition, they are very carefully documented and well-structured to the extent that reuse and repurposing are facilitated. In particular, norms and standards of the research community for artifacts of this type are strictly adhered to.
Have the results been validated?
- Can all main results from the paper be validated using provided artifacts?
Evaluators are asked to report any unexpected artifact behavior, which depends on the type of artifact (e.g., unexpected output, scalability issues, crashes, or performance variation).
The artifacts associated with the paper will receive a “Results replicated” badge only if the main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using, in part, artifacts provided by the authors. Note that variation of empirical and numerical results is tolerated. In fact, it is often unavoidable in computer systems research; see “how to report and compare empirical results?” in the AE FAQ on ctuning.org.
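Because some variation is expected, it can help evaluators if an artifact reports repeated measurements together with their spread rather than a single number. The following is a minimal sketch of one way to do that, not a format required by CGO or ACM; the function name `summarize_runs` and the sample timings are hypothetical.

```python
import statistics

def summarize_runs(samples):
    """Summarize repeated measurements of one experiment.

    Returns the number of runs, the mean, the sample standard deviation,
    and the coefficient of variation (stdev / mean).
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements to estimate variation")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    # The coefficient of variation is undefined when the mean is zero.
    cv = stdev / mean if mean != 0 else float("nan")
    return {"n": len(samples), "mean": mean, "stdev": stdev, "cv": cv}

# Hypothetical execution times in seconds for one benchmark, run four times.
runs = [10.0, 10.5, 9.5, 10.0]
summary = summarize_runs(runs)
print(f"n={summary['n']} mean={summary['mean']:.2f}s "
      f"stdev={summary['stdev']:.2f}s cv={summary['cv']:.1%}")
```

Reporting the spread next to the mean lets an evaluator judge whether a difference on their machine falls within normal run-to-run noise.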
Based on the results, the following badges are awarded.
Badges
As ACM recommends, we award three different types of badges to communicate how the artifact has been evaluated. A single paper can receive up to three badges, one of each type.
Artifacts Available
The green Artifacts Available badge indicates that an artifact is publicly accessible in an archival repository. For this badge to be awarded, the artifact does not have to be independently evaluated. ACM requires that a qualified archival repository is used, such as Zenodo, figshare, or Dryad. Personal webpages, GitHub repositories, and the like are not sufficient, as their contents can be changed after the submission deadline.
The green Artifacts Available badge does not require a formal audit and is therefore awarded directly by the publisher, provided the authors supply a link to the deposited artifact.
Artifacts Evaluated
This badge is applied to papers whose associated artifacts have successfully completed an independent audit.
Artifacts need not be made publicly available to be considered for this badge. However, they do need to be made available to reviewers. Two levels are distinguished, only one of which should be applied in any instance:
Functional
The artifacts associated with the research are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation.
- Documented: At minimum, an inventory of artifacts is included, and sufficient description provided to enable the artifacts to be exercised.
- Consistent: The artifacts are relevant to the associated paper, and contribute in some inherent way to the generation of its main results.
- Complete: To the extent possible, all components relevant to the paper in question are included. (Proprietary artifacts need not be included. If they are required to exercise the package then this should be documented, along with instructions on how to obtain them. Proxies for proprietary data should be included so as to demonstrate the analysis.)
- Exercisable: Included scripts and/or software used to generate the results in the associated paper can be successfully executed, and included data can be accessed and appropriately manipulated.
Reusable
The artifacts associated with the paper are of a quality that significantly exceeds minimal functionality. That is, they have all the qualities of the Artifacts Evaluated – Functional level, but, in addition, they are very carefully documented and well-structured to the extent that reuse and repurposing are facilitated. In particular, norms and standards of the research community for artifacts of this type are strictly adhered to.
Results Validated & Reproduced
This badge is applied to papers in which the main results of the paper have been successfully obtained by a person or team other than the authors. Exact replication or reproduction of results is not required or even expected. Instead, the results must be in agreement to within a tolerance deemed acceptable for experiments of the given type. In particular, differences in the results should not change the main claims made in the paper. In addition, we encourage authors to consider submitting guidelines or documentation that help reviewers reproduce the results for new test cases, benchmarks, and applications that are not already covered in the paper.
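To make "agreement to within a tolerance" concrete, an artifact could ship a small script that compares reproduced numbers against those reported in the paper. The sketch below is only an illustration: the 5% relative tolerance, the benchmark names, and the function `compare_results` are assumptions, and the acceptable tolerance for a given experiment is a judgment for authors and evaluators, not a value fixed by CGO or ACM.

```python
import math

def compare_results(reported, reproduced, rel_tol=0.05):
    """Check each reported value against its reproduced counterpart.

    Returns a dict mapping each benchmark name to True if the reproduced
    value agrees within `rel_tol` (relative), and False otherwise.
    A benchmark missing from `reproduced` counts as not in agreement.
    """
    agreement = {}
    for name, expected in reported.items():
        if name not in reproduced:
            agreement[name] = False
            continue
        agreement[name] = math.isclose(expected, reproduced[name], rel_tol=rel_tol)
    return agreement

# Hypothetical speedups: values from the paper vs. values measured by an evaluator.
reported = {"bench_a": 1.50, "bench_b": 2.00}
reproduced = {"bench_a": 1.46, "bench_b": 2.30}

result = compare_results(reported, reproduced)
for name, ok in result.items():
    print(f"{name}: {'within tolerance' if ok else 'outside tolerance'}")
```

A result outside the tolerance is not automatically a failure; it is a prompt to check whether the difference changes the main claims of the paper.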