Research Paper Artifact Evaluation

A few years ago, computer science conferences started running artifact evaluations. This is an optional process that takes place after paper acceptance, in which authors of accepted papers can submit accompanying software, test data, or any other digital artifacts. The artifacts are usually submitted as virtual machines with everything configured and set up, i.e. as virtual appliances. An artifact evaluation committee then evaluates the artifacts, as described by this year’s CAV conference:

The Artifact Evaluation Committee (AEC) will read the paper and explore the artifact to give the authors third-party feedback about how well the artifact supports the paper and how easy it is for future researchers to use the artifact.

The overall idea is to encourage building on top of newly published research, which is usually hard due to a lack of artifacts, documentation, etc. Efforts like artifact evaluation make it easier to base new work on published work and improve the current state of research reproducibility and repeatability.

Some artifacts have substantial resource requirements, such as many processor cores in the virtual machine or more memory than is usually available on today’s laptops and desktop machines, so there is a need to evaluate them in a remote environment that meets those requirements. In this post I will introduce such an environment and explain how to set it up to run any artifact virtual machine with free software only, regardless of the virtual machine’s intended hypervisor.

The environment I use for artifact evaluation is Emulab. It is a testbed infrastructure developed and provided by the Flux Research Group at my university. Keep in mind that any researcher can ask for resources on the testbed. I’ve been using Emulab for a couple of years and have grown fond of it, though there is a steep learning curve. You don’t necessarily need Emulab to set up an artifact evaluation environment similar to the one described here, but it is handy to have a 32-core machine with 128 GB of RAM at your disposal.

A few conferences I’ve looked at that have artifact evaluation recommend VirtualBox as the hypervisor. Since different people use different operating systems, distributing artifacts as VirtualBox-based appliances is a good starting point. However, VirtualBox is an open-core piece of software and I don’t like open core, so I tend to stay away from it. What I use instead is a combination of the Kernel-based Virtual Machine (KVM for short), QEMU, libvirt, and Virtual Machine Manager. This lets me execute any artifact for CAV remotely, regardless of the limited resources on my laptop, whether the laptop is on or off, and the artifact’s intended VM hypervisor.

To make it easier for myself and anyone else interested in evaluating artifacts in this way, I’ve set up a repository with scripts that set up the needed environment. It comes with documentation too. Feel free to skip the rest of the post and dive into the repository and take it from there. On the other hand, you might want to read what follows just to get a glimpse of the setup.

At the heart of the artifact evaluation environment is libvirt. It is a virtualization API and management tool that abstracts away the specifics of the underlying hypervisors. That way I don’t have to memorize countless commands and command-line options that differ from hypervisor to hypervisor, or get familiar with the accompanying GUIs. I use Virtual Machine Manager, or virt-manager for short, as a graphical interface to a remote Emulab machine running artifacts with libvirt. It is true that I use only one hypervisor (KVM), which means I could’ve learned how to set up KVM machines from the command line, but I don’t like that business.

You might be wondering how I convert a VirtualBox, VMware, or any other virtual machine to a KVM-based one. The virt-manager project ships a tool for the job, virt-convert, which works on top of libvirt; it is as simple as running:

sudo virt-convert appliance.ova --noautoconsole

This converts a machine from VirtualBox, VMware, or any other hypervisor, packaged in the Open Virtualization Format, into something libvirt understands and immediately starts it with KVM. See the git repository for more details.
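
To check that the conversion worked without opening a GUI, virsh can list the domains known to libvirt. This is only a sketch: it assumes the converted guest was registered with the system-wide libvirt daemon (qemu:///system), which is my assumption rather than something the repository prescribes, and that your user is allowed to talk to that daemon.

```shell
#!/bin/sh
# Assumption: the converted guest lives under the system-wide libvirt daemon.
# LIBVIRT_DEFAULT_URI is the environment variable virsh consults when no
# connection URI is given on the command line.
export LIBVIRT_DEFAULT_URI="qemu:///system"

if command -v virsh >/dev/null 2>&1; then
    # List all domains, running or shut off; the converted appliance
    # should be among them.
    virsh list --all || echo "could not reach libvirt at $LIBVIRT_DEFAULT_URI" >&2
else
    echo "virsh not found; install the libvirt client tools first" >&2
fi
```

If the appliance does not show up, rerunning the listing with sudo is a quick way to rule out a permissions problem.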

I haven’t seen it take off in the computer science research community yet, but in case your artifact is a Vagrant box, you can also make it work with libvirt so that you keep using just one interface, even if it’s a VirtualBox-based box. Again, take a look at the repository for more info.
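
One possible route, sketched here under assumptions, is the vagrant-libvirt provider plugin together with the vagrant-mutate plugin, which converts an already added VirtualBox box to the libvirt format. The box name artifact-box and the run helper are hypothetical, the repository may do this differently, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical box name; use the name shown by `vagrant box list`.
BOX="${BOX:-artifact-box}"
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Helper (not part of Vagrant): echo the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Provider plugin that lets Vagrant drive libvirt/KVM.
run vagrant plugin install vagrant-libvirt
# Plugin that converts boxes from one provider format to another.
run vagrant plugin install vagrant-mutate
# Convert the VirtualBox box into a libvirt box.
run vagrant mutate "$BOX" libvirt
# Create a Vagrantfile for the box and boot it with the libvirt provider.
run vagrant init "$BOX"
run vagrant up --provider=libvirt
```

Running it once in dry-run mode shows the exact commands before anything is installed or converted.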

Once you have all of that, including the artifact virtual appliances, set up in the remote environment, you can connect to it from your local machine if you are not comfortable doing all of this from the command line. I assume you run Debian; if you don’t, check out the new Debian GNU/Linux Jessie release that came out yesterday! Just install libvirt and Virtual Machine Manager. Here is what Virtual Machine Manager looks like on my laptop:

As you can see from the picture, I use Virtual Machine Manager to monitor virtual machines running not just on Emulab, but also on my laptop and on another remote machine.
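
Adding a remote machine to Virtual Machine Manager comes down to pointing it at a libvirt URI that tunnels over SSH. The sketch below assumes SSH access to the remote machine and a system-wide libvirt daemon on it; the user name, host name, and the three helper functions are hypothetical, and the Debian package name is an assumption that may vary between releases.

```shell
#!/bin/sh
# Hypothetical account and host; replace with your own remote machine.
REMOTE_USER="alice"
REMOTE_HOST="node.example.net"

# Build a libvirt URI that reaches the remote system daemon over SSH.
remote_uri() {
    printf 'qemu+ssh://%s@%s/system' "$1" "$2"
}

# Install Virtual Machine Manager locally (Debian). Not called automatically.
install_tools() {
    sudo apt-get install virt-manager
}

# Open Virtual Machine Manager against the remote host. Not called automatically.
open_manager() {
    virt-manager --connect "$(remote_uri "$REMOTE_USER" "$REMOTE_HOST")"
}

echo "Remote libvirt URI: $(remote_uri "$REMOTE_USER" "$REMOTE_HOST")"
```

Call install_tools once and then open_manager; the same URI can also be entered by hand in the manager's dialog for adding a connection.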

You can connect to any of the virtual machines through the manager:

One problem I have with this setup is the responsiveness of virtual machine GUIs. Currently things feel rather slow, in spite of efforts by the Spice project.