[cfe-dev] [PATCH] Integer Sanitizer Initial Patches

Wed Nov 7 22:47:47 PST 2012

On Wed, Nov 7, 2012 at 5:26 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Wed, Nov 7, 2012 at 5:15 PM, Will Dietz <willdtz at gmail.com> wrote:
>> Hi all,
>>
>> Attached are patches that add a new 'sanitizer' to clang for detecting
>> and reporting integer overfllows.  Unlike the checks added by
>> -fcatch-undefined-behavior, these also include non-undefined-behavior
>> checks.
>
> It's not obvious to me that it's useful to add runtime instrumentation
> for unsigned overflow in particular; my instinct is that you'll get a
> ton of false positives for existing codebases.  Also, there isn't any
> obvious way to fix code which legitimately takes advantage of unsigned
> overflow.  Do you have any data here?
>

This is a good point, and I'm glad you brought it up.  In terms of the
reported overflows being guaranteed bugs, you're certainly right there
will be quite a few false positives.

However I don't think this should be treated as a bug-reporting tool,
but rather a tool to ensure your code is doing what you expect.  This
is important because the rules governing integer behavior are
non-trivial (especially to non-expert programmers, but even to them!),
and often the exact behavior is anything but clear from a glance at a
line of code.  Add in variations across the various C/C++ standards,
unexpected behavior when porting to a new platform (or just
32bit->64bit), and the general lack of attention given to integer
overflow or the exact dynamic range of data and it can be awfully
useful to have a report of what overflows do occur in your program.
Sure, some (many?) of them might be intentional, but that's exactly
why a tool like this is so very useful: it lets you say things like
"my program only overflows an integer where I explicitly expect it
to".  Without such a tool, how do you really know?

Expanding on that, I think a reasonable solution (one that was used by
a previous version of IOC) would be to introduce a function attribute
to allow programmers to flag safe uses of unsigned overflow(hashing,
crypto, RNG, etc).  I believe address sanitizer employs a similar
technique, although perhaps for different reasons.  If it would help,
I can prioritize reviving that code, just let me know :).

As for the data you asked for, it is something we're looking into but
I don't think we're ready to report on it yet.  That said, given the
prevalence of undefined behavior integer bugs[1] in mature commonly
used applications, it seems reasonable to expect that programmers will
make the same kinds of bugs with unsigned integer types.  Additionally
standards like CERT's Secure C coding standard suggest avoiding
unsigned overflow, which does speak for the existence of a community
that would find such a report useful.  There certainly have been CVE's
reported due to unsigned integer overflow :).

In short, I think having tools like provide a useful (and presently,
missing!) component to a developer's arsenal when it comes to
reasoning about their code and ensuring it's doing what is expected.

Most of these arguments also motivate the reporting of lossy casts,
particularly implicit ones.

Hopefully this helps give this some context and addresses some of your
concerns, and I'd be happy to discuss it further.

As a final note I'd like to add that the code impact on Clang is
relatively minor given the existing -sanitize=undefined, which might
be useful for weighing the utility of the checker, although maybe not.

[1] We found undefined integer behavior in the vast majority of
open-source applications we looked at, see our paper: "Understanding
Integer Overflow in C/C++",
http://www.cs.utah.edu/~regehr/papers/overflow12.pdf

>>
>> The attached clang patch adds:
>>
>> -fsanitize=unsigned-integer-overflow
>> and
>> -fsanitize=integer
>
> Suspicious-behavior checkers are fundamentally not the same thing as
> the existing sanitizers because the code might be correct as-is.  Does
> it make sense to put them under -fsanitize?
>

Hmm, this should probably best be answered by someone else.  While
putting it in as a "sanitizer" makes sense to me, one of the largest
reasons I used the sanitizer flags and infrastructure was to leverage
the recent (and not unrelated) Ubsan checking for signed overflow/bad
shifts, which are included in the default 'integer' sanitizer profile.
 However I'm not overly attached to the particular flag interface, so
if there are objections or suggestions for something else I'd be
perfectly willing to work towards that instead.

> -Eli

Thank you for your time,

~Will