[LLVMdev] summer of code idea — checking bounds overflow bugs

Wed Mar 31 10:32:39 PDT 2010

John Regehr wrote:
>> Some checks must live in Clang because too much information has been lost
>> by the time LLVM sees the code.  There are many examples but here is the
>> canonical one.  A program has undefined behavior if "between two sequence
>> points, an object is modified more than once, or is modified and the prior
>> value is read other than to determine the value to be stored."
>>
>> To implement this check in LLVM, we would have to answer the question:
>> Where, in the LLVM code, are the sequence points?
>>     
>
> By the way I can hear readers of this list saying to themselves "this does 
> not seem like a useful check to implement."  Perhaps this is correct, but 
> let's consider the tradeoffs:
>
> - This is a relatively simple, localized check that should not be too hard 
> to implement.
>
> - Almost all of the added checks would be destroyed by LLVM after simple 
> queries to the alias analyzer, so applications running with this check 
> turned on will not slow down much.
>   

I'm not sure the second point is true.  For example, consider the code:

void foo (int * a, int * b) {
    *a = (*b)++;
}

void bar (int a) {
    foo (&a, &a);
}

I think this is undefined behavior in foo() when it is called from bar():
a and b then point to the same object, so that object is modified twice
between two sequence points.  It will take inter-procedural alias analysis
to determine whether the check can be dropped.
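
To make the concern concrete, here is a rough sketch of what an
instrumented foo() might look like.  It assumes the increment is meant to
apply to the pointed-to int, i.e. (*b)++.  The names same_object() and
foo_checked() are hypothetical and are not anything Clang or LLVM emits
today.  The point is only that the guard compares two incoming pointers,
so nothing inside foo() alone can prove it redundant.

```c
#include <stdbool.h>

/* Hypothetical runtime check: true when both pointers refer to the
   same int object. */
bool same_object(const int *p, const int *q) {
    return p == q;
}

/* Sketch of an instrumented foo().  Returns false without touching
   memory when the original statement would modify one object twice
   between sequence points; otherwise performs the assignment. */
bool foo_checked(int *a, int *b) {
    if (same_object(a, b)) {
        return false;   /* would be undefined behavior; report it */
    }
    *a = (*b)++;        /* safe here: *a and *b are distinct objects */
    return true;
}
```

Called as foo_checked(&x, &y) with distinct x and y, the guard passes and
the assignment runs.  Called as foo_checked(&x, &x), which mirrors what
bar() does, the guard fires.  Removing the guard statically requires
knowing every caller's arguments.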

Is this correct, or am I missing something?

> - Common optimizing compilers change the meaning of a computation that 
> makes this mistake.
>
> My guess is that this check would find problems in real apps...
>   

That would be an interesting experiment.

-- John T.