Unit Testing, cgroups, and Confirming Bugs

Posted on June 25, 2016 by Marko Dimjašević

It’s been another week of my Google Summer of Code project on integrating KLEE into Debile. Last week I started with unit testing a library for Firehose in KLEE, but I got stuck due to a bug in the llvm-3.4-tools package for Debian (LLVM 3.4.2). I was able to resume this week. Other than that, I learned about cgroups and I have been looking at memory errors I stumbled upon in Debian packages last week.

There was a bug in the Debian package for LLVM 3.4 tools. The bug somehow didn’t get fixed, even though Martin Nowack, who’s also been working on KLEE, submitted a patch when the current LLVM 3.4 release was 3.4.1. Martin shared with me his patch, thanks to which I was able to proceed with unit testing my Firehose library. I wrote over 100 unit tests for the library. These have been useful to have. I have more than you can see in the linked tree, but I haven’t committed them yet as I’m getting a frustrating segmentation fault, and I can’t figure out what’s happening. Lately C++ has been giving me grievances.

I’ve been working on integrating the Firehose library into the rest of the KLEE code base. In doing so, I stumbled upon a few unanticipated ways of using the Firehose library so I extended the library accordingly. I believe I’m close to being done with this one, but it needs some more debugging and testing. Therefore, I haven’t committed my changes to the usual repository, but to a temporary repository; you can take a look if you’re interested. Anyhow, I can get a partial Firehose output file out of KLEE that looks like this:

<issue>
  <message>Error: memory error: out of bound pointer
  The error occurs when the program is executed with the following arguments: --sym-arg3
  </message>
  <trace>
    <state>
      <location>
        <file given-path="/home/marko/program-1.0/program.c"/>
        <function name="main"/>
        <point column="0" line="504"/>
      </location>
      <notes>Call to function: main(argc=3, argv=27379120)</notes>
    </state>
    <state>
      <location>
        <file given-path="/home/marko/program-1.0/program.c"/>
        <function name="set_path"/>
        <point column="0" line="189"/>
      </location>
      <notes>Call to function: set_path(type=0, name=16675968)</notes>
    </state>
  </trace>
</issue>

This XML snippet is based on a real bug, but I edited this XML snippet by changing the program’s real name and function calls because I’m in the middle of a bug report for a corresponding Debian package for which I’m not sure how serious it is. Nevertheless, you can see this is in the Firehose format. What is missing are results and analysis elements of the format, but I’m getting there. Also, I have to figure out how to give actual argument values on line 3 instead of --sym-arg3 (this is a syntax that tells KLEE to analyze the program with a symbolic argument of size 3 bytes).

Last week I mentioned how I will probably need to use control groups, a Linux kernel feature, to tame KLEE’s memory usage. I haven’t tested it yet, but some commits to the KLEE code base from about a month ago might solve the problem in KLEE itself. If not, I’ll use cgroups, which I took a look at and now I understand how the control groups thing works. I managed to set it up on my laptop to cap memory usage for web browsers, so it will be easy to do the same for KLEE.

Another thing I wrote about in the last report is about 35 memory bugs I found in Debian packages with KLEE. I still haven’t confirmed them as they’re in sizable code bases. Cristian Cadar said he might take a look so I shared the errors with him. On the other hand, as written above, we do have at least one definite bug in a Debian source package. I believe it might be a serious issue, so before disclosing it publicly, I decided to send an  email to the Debian security team. Still no reply from them, even though it’s been a few days.

As usual, I read a research paper related to this project. There was a project called Cloud9 from EPFL, Switzerland and there is a paper about it: Parallel Symbolic Execution for Automated Real-world Software Testing. Cloud9 builds on KLEE. One key difference is that Cloud9 is a distributed system, running on multiple processors and machines, giving roughly a linear speed-up in the number of available machines/processors. That is very interesting as symbolic execution can take a while. Another cool thing about Cloud9 is that it supports more of the POSIX standard than KLEE does. To the best of my knowledge, these additional POSIX features supported in Cloud9 haven’t been ported back to KLEE. Just to confirm it, I asked on the KLEE developers’ list, but there is no reply so far. According to the commit history in its Git repository, Cloud9 seems to be a discontinued project.