The PLDI conference has an artifact evaluation committee (which I serve on) as part of an effort to make research reproducible. Bidding for PLDI 2016 artifacts started today, which brought to my attention two anonymity problems, one for reviewers and one for authors.
I’ve been writing about reproducible research for some time now, as the topic interests me a great deal. Reproducibility is the very core of research, so putting “reproducible” in front of “research” should mean the same as just saying “research”. Unfortunately, the situation is different: doing reproducible research is optional these days, so I am limited to participating in the rare optional activities.
For the CAV 2015 conference I served on the Artifact Evaluation Committee, which evaluated the digital artifacts of accepted papers whose authors chose to submit them to this optional artifact evaluation step. Based on my experience evaluating a few artifacts, reading the reviews of most of the submitted artifacts, and my interest in reproducible research, I offer some thoughts on how to do artifact evaluation in the future and how to make research artifacts useful. I try to present them in order of importance, most important first.
My motivation for this post is to point out several absurdities in the computer science community, suggest ways of dealing with them, foster a discussion around the absurdities and related issues, and make the research done in the field useful to everyone, including the very researchers who conduct it.
A few years ago, computer science conferences started holding artifact evaluations. This is an optional process that follows paper acceptance: authors of accepted papers can submit accompanying software, test data, or any other digital artifacts for evaluation. The artifacts are usually submitted as virtual machines with everything configured and set up, i.e. as virtual appliances. An artifact evaluation committee then evaluates the artifacts, as described by this year’s CAV conference:
The Artifact Evaluation Committee (AEC) will read the paper and explore the artifact to give the authors third-party feedback about how well the artifact supports the paper and how easy it is for future researchers to use the artifact.
The overall idea is to encourage building on top of newly published research, which is usually hard due to a lack of artifacts, documentation, etc. Efforts such as artifact evaluation make it easier to base new work on newly published work, and they improve the current state of research reproducibility and repeatability.
Some artifacts have substantial resource requirements, such as a virtual machine with many processor cores or more memory than is usually available on today’s laptops and desktop machines. Such artifacts need to be evaluated in a remote environment that meets those requirements. In this post I will introduce such an environment and explain how to set it up to run any artifact virtual machine with free software only, regardless of the virtual machine’s intended hypervisor.