[LLVMdev] summer of code idea — checking bounds overflow bugs

John Criswell criswell at uiuc.edu
Tue Mar 30 08:42:00 PDT 2010


易秋萍 wrote:
>
> Hi,
>
> Some days ago I am interested in detecting undefined behaviors
>
> in C programs based on Clang. After several days’ investigation, I think
>
> checking bounds overflow bugs is more interesting, because bounds
>
> overflow is one of the most frequently encountered errors in C programs.
>
> For example, performing pointer arithmetic without checking bounds
>
> can cause bounds overflow. To increase the accuracy of finding bugs,
>
> I want to write several passes, based on slicing, inline and summary
> function
>
> / (partial) transition function, to implement intre-procedural analysis.
>
> Does some person have interest in the project? I need a mentor,
>
> and wait for your reply.
>

The SAFECode project (http://safecode.cs.illinois.edu) is very
interested in having a static array bounds checking analysis pass.
However, we're not interested in a static analysis tool, and we're not
interested in something that works on Clang's IR. What we want is a one
or more LLVM IR analysis passes that tells us which getelementptr (GEP)
instructions in a program are statically known not to overflow (this
allows us to eliminate run-time checks for such instructions).

Such a pass exists in the SAFECode project's source tree, but it has not
been maintained well over the years and has fallen into disuse (it also
had some engineering limitations, such as using exec() to repeatedly
execute the Omega compiler for constraint solving). Either getting that
code to work again or replacing it with something better would be
beneficial.

If you'd be willing to work on something that works with SAFECode, I'd
be willing to mentor your project. I do, however, have one condition:
you need to find a specific static array bounds checking algorithm and
understand how it works. I don't see a specific algorithm or paper
reference above.

A good starting point for algorithms might be Dinakar's paper (see
Section 5): http://llvm.org/pubs/2005-02-TECS-SAFECode.html. There has
also been talk of implementing the ABCD algorithm in LLVM
(http://portal.acm.org/citation.cfm?id=349342); you may want to read
about that algorithm as well.

As an aside, I've written a pass that finds the inter-procedural static
backwards slice of an LLVM value (disclaimer: it uses DSA to find the
targets of indirect function calls). I'm pretty sure we could release it
to the public if you find that you need it for your project.

-- John T.

> Best Reagards!
>
> Qiuping Yi
>




More information about the llvm-dev mailing list