[cfe-dev] [RFC] Extending and improving Clang's undefined behavior checking

Wed Aug 22 09:24:04 PDT 2012

On 8/21/12 6:33 PM, Chris Lattner wrote:
> <catching up on old email>
> On Aug 10, 2012, at 7:48 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang's undefined behavior checking:
> This is truly awesome, I would really love to see this.  Your design proposal is somewhat startlingly similar to an approach I pitched to John and Vikram sometime last year. :-)  I really like separating out the orthogonal pieces and pulling together the various disparate pieces into something coherent.
>
>> 1) Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don't catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.
> Yes.  Nuno's bounds checking work can also be pulled into this eventually, as could stack canaries and "fortify source".

I don't think ad-hoc techniques like stack canaries are suitable for
this particular application.  Stack canaries are better suited for
run-time protection against stack buffer-overflow attacks (if they're
suited for anything at all).  A canary is only checked when the function
returns, so it doesn't really tell you where in the code the stack is
being smashed.

For memory-related undefined behaviors, I think it would make sense to
have various "levels" of checks, in which each level adds more overhead
but checks more things and checks them more accurately.  ASan would be a
good first or second level; it finds invalid loads and stores and can
catch some out-of-bounds array accesses and dangling pointers.  Another
level could be ASan with SAFECode's array checks and points-to set
checking.  A final level could be something like SoftBound + CETS, which
provides real dangling pointer detection in addition to the previously
mentioned checks.
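
To make the difference between the levels concrete, here is a minimal
sketch of the kind of per-object bounds check that the more expensive
levels rely on.  It is not SAFECode's or SoftBound's actual
implementation: CheckedPtr, makeChecked, advance, inBounds and load are
hypothetical names, and real tools attach this information through
compiler instrumentation rather than a wrapper type.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical illustration only: a "pointer" that remembers which object
// it was derived from, so every access can be checked against that object.
struct CheckedPtr {
    const int *base;        // start of the underlying array
    std::size_t size;       // number of elements in the array
    std::ptrdiff_t offset;  // current position; allowed to go out of range
};

// Create a checked pointer to the first element of an n-element array.
inline CheckedPtr makeChecked(const int *array, std::size_t n) {
    return CheckedPtr{array, n, 0};
}

// Pointer arithmetic only moves the offset; the bounds travel with it.
inline CheckedPtr advance(CheckedPtr p, std::ptrdiff_t delta) {
    p.offset += delta;
    return p;
}

// True if dereferencing p would stay inside its own object.
inline bool inBounds(const CheckedPtr &p) {
    return p.offset >= 0 && static_cast<std::size_t>(p.offset) < p.size;
}

// Checked load: report the error instead of reading outside the object.
inline int load(const CheckedPtr &p) {
    if (!inBounds(p))
        throw std::out_of_range("out-of-bounds load");
    return p.base[p.offset];
}
```

Because the decision is based on the object the pointer came from, a
check of this style also flags an access that jumps far past the end of
one array and lands inside some other valid object, which a check that
only asks "is this address valid memory?" can miss.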

>
>> 2) Command-line interface. We currently have the following options to enable various flavors of undefined
>> I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument.
> +1

It may make sense to have an equivalent of the -O option, in which each
level of undefined behavior checking adds more checks.  For example,
-UD0 would mean no checks, -UD1 some really fast checks, -UD2 more
stringent checks, and so on.
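
As a rough sketch of what "each level adds more checks" could mean
inside the driver, the mapping might look like the following.  The
-UD<n> spelling is only the suggestion above, not an existing Clang
option, and the check names and their assignment to levels are
assumptions made up for illustration.

```cpp
#include <cstdint>

// Hypothetical check categories; the names are illustrative only.
enum Check : std::uint32_t {
    IntegerOverflow = 1u << 0,
    NullDeref       = 1u << 1,
    ArrayBounds     = 1u << 2,
    DanglingPointer = 1u << 3,
};

// Map a hypothetical -UD<level> value to the set of enabled checks.
// Levels are cumulative: each one keeps everything the previous level had.
inline std::uint32_t checksForLevel(int level) {
    std::uint32_t mask = 0;
    if (level >= 1) mask |= IntegerOverflow | NullDeref;  // cheap checks
    if (level >= 2) mask |= ArrayBounds;                  // more overhead
    if (level >= 3) mask |= DanglingPointer;              // most overhead
    return mask;
}

// Query whether a particular check is part of a mask.
inline bool isEnabled(std::uint32_t mask, Check c) {
    return (mask & c) != 0;
}
```

Keeping the levels cumulative mirrors how people already think about
-O levels: moving up a level never silently drops a check.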

>
>> I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.
> +1
>
>> Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don't have a concrete proposal for that.
> That would make sense to me.
>
>> 3) Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier. The other checks just emit @llvm.trap(), which is far from ideal.
> Yes, this should be orthogonal from the checks and interface, and should be consistent across all the checks.

One feature that you may want is an option to log errors to a file. 
We've found that useful in SAFECode for when we want to benchmark some 
program that has memory safety errors in it.  Valgrind has such an 
option as well, so it seems people find it useful.
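
A minimal sketch of what such a logging mode could look like in a
run-time library follows.  ErrorLog and its methods are hypothetical
names, not SAFECode's or Clang's actual interface; the point is only
that a report is appended to a file (or written to stderr if no file is
given) and the program keeps running.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical reporter: appends to a log file if a path is given,
// otherwise falls back to stderr.  It never aborts the program.
class ErrorLog {
public:
    explicit ErrorLog(const std::string &path) {
        if (!path.empty())
            file_.open(path, std::ios::app);
    }

    // Record one error and return, so execution can continue.
    void report(const std::string &kind, const std::string &location) {
        std::ostream &out =
            file_.is_open() ? static_cast<std::ostream &>(file_) : std::cerr;
        out << kind << " at " << location << '\n';
        out.flush();  // keep the log usable even if the program crashes later
        ++count_;
    }

    // Number of errors reported so far.
    unsigned count() const { return count_; }

private:
    std::ofstream file_;
    unsigned count_ = 0;
};
```

With something like this, a benchmark run can finish normally and the
errors can be inspected in the log afterwards.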

-- John T.