[LLVMdev] summer of code idea — checking bounds overflow bugs

Tue Mar 30 18:19:19 PDT 2010

Sounds an good idea, is that means lowerinng down the SAFECode project
from the higher level(clang)to lower level for an more general work on
bound check? I aslo want to know is it possoble to detecting memory
leak at the very low(llvm ir) level to detecting memory leaks? Or at
llvm ir  level to providing an stackfull hooks? It's very useful to
have such an feature. The stack hooks can help us to print extra stack
info in the exec period without modify the original code, to help us
to find bugs easier:)

I think for memory leaks, we just need to modify the internal method
llvm.malloc, add hooks to this method

For stack hooks, we can modify the calling conventions:)

2010/3/30, John Criswell <criswell at uiuc.edu>:
> 易秋萍 wrote:
>>
>> Hi,
>>
>> Some days ago I am interested in detecting undefined behaviors
>>
>> in C programs based on Clang. After several days’ investigation, I think
>>
>> checking bounds overflow bugs is more interesting, because bounds
>>
>> overflow is one of the most frequently encountered errors in C programs.
>>
>> For example, performing pointer arithmetic without checking bounds
>>
>> can cause bounds overflow. To increase the accuracy of finding bugs,
>>
>> I want to write several passes, based on slicing, inline and summary
>> function
>>
>> / (partial) transition function, to implement intre-procedural analysis.
>>
>> Does some person have interest in the project? I need a mentor,
>>
>> and wait for your reply.
>>
>
> The SAFECode project (http://safecode.cs.illinois.edu) is very
> interested in having a static array bounds checking analysis pass.
> However, we're not interested in a static analysis tool, and we're not
> interested in something that works on Clang's IR. What we want is a one
> or more LLVM IR analysis passes that tells us which getelementptr (GEP)
> instructions in a program are statically known not to overflow (this
> allows us to eliminate run-time checks for such instructions).
>
> Such a pass exists in the SAFECode project's source tree, but it has not
> been maintained well over the years and has fallen into disuse (it also
> had some engineering limitations, such as using exec() to repeatedly
> execute the Omega compiler for constraint solving). Either getting that
> code to work again or replacing it with something better would be
> beneficial.
>
> If you'd be willing to work on something that works with SAFECode, I'd
> be willing to mentor your project. I do, however, have one condition:
> you need to find a specific static array bounds checking algorithm and
> understand how it works. I don't see a specific algorithm or paper
> reference above.
>
> A good starting point for algorithms might be Dinakar's paper (see
> Section 5): http://llvm.org/pubs/2005-02-TECS-SAFECode.html. There has
> also been talk of implementing the ABCD algorithm in LLVM
> (http://portal.acm.org/citation.cfm?id=349342); you may want to read
> about that algorithm as well.
>
> As an aside, I've written a pass that finds the inter-procedural static
> backwards slice of an LLVM value (disclaimer: it uses DSA to find the
> targets of indirect function calls). I'm pretty sure we could release it
> to the public if you find that you need it for your project.
>
> -- John T.
>
>> Best Reagards!
>>
>> Qiuping Yi
>>
>
>

-- 
从我的移动设备发送

         此致
礼
罗勇刚
Yours
    sincerely,
Yonggang Luo