[cfe-dev] [PATCH] Integer Sanitizer Initial Patches

Tue Nov 13 07:54:40 PST 2012

I have no opinion about the details of the patch, but I do have a few thoughts about the Integer Sanitizer overall.

First, please consider taking a look at TS 17961, C Secure Coding Rules (the latest version is here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1624.pdf ).  This document establishes requirements for both analyzers and compilers regarding which violations need to be diagnosed. It's not just our (i.e., CERT's) opinion, but rather represents the current consensus of WG14.  A new version will be posted soon as a result of the Portland meeting.  The portion of the document that is relevant to this patch is the minimum requirements for diagnosing signed overflow and unsigned wrapping; these requirements typically involve tainted inputs. You should probably treat these rules as a minimum target.

CERT's position is spelled out in the rules:
* INT30-C. Ensure that unsigned integer operations do not wrap
* INT32-C. Ensure that operations on signed integers do not result in overflow
which are stricter than TS 17961. These rules would require additional diagnostics to be issued.  You can find these rules in the Secure C coding standard book. Alternatively, feeding the rule ID (e.g., INT30-C) into your favorite search engine will get you to our wiki.
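
For readers who haven't seen the rules, here is a minimal sketch of the kind of precondition test they call for, shown for addition only. The helper names (checked_add_uint, checked_add_int) are made up for this illustration and don't come from the patch or from any library; each helper rejects the operation before it can wrap or overflow.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper in the spirit of INT30-C: stores a + b in *sum and
   returns true only when the unsigned addition would not wrap. */
bool checked_add_uint(unsigned int a, unsigned int b, unsigned int *sum)
{
    if (UINT_MAX - a < b) {
        return false;            /* a + b would exceed UINT_MAX and wrap */
    }
    *sum = a + b;
    return true;
}

/* Hypothetical helper in the spirit of INT32-C: stores a + b in *sum and
   returns true only when the signed addition would not overflow. The test
   runs before the addition, so no undefined behavior is ever executed. */
bool checked_add_int(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||    /* result would exceed INT_MAX */
        (b < 0 && a < INT_MIN - b)) {    /* result would fall below INT_MIN */
        return false;
    }
    *sum = a + b;
    return true;
}
```

A caller would test the return value and handle the failure case explicitly, rather than using a wrapped or undefined result.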

We note that there are good arguments for both positions.  On the one hand, the subset of overflows diagnosed by TS 17961 is likely to contain a higher percentage of security vulnerabilities. On the other hand, *unintended* (and unanticipated) overflow or wrapping produces incorrect results. The additional diagnostics required by the CERT rules would discover some security vulnerabilities that the TS 17961 requirements would miss, a larger number of bugs that lead to incorrect results (but that may not lead to security issues), as well as a number of "false positives" where wrapping is either intentional or harmless.

My *personal* opinion is that getting integer wrapping semantics by default is a fundamental mistake in C-like languages in general. It's clear that programmers occasionally need modular (wrapping) semantics for math; uses like hash codes, crypto, and address arithmetic spring immediately to mind.  Note, however, that all of these use cases involve *intentional* use of modular arithmetic.  
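
As one illustration of *intentional* modular arithmetic, here is a sketch of the 32-bit FNV-1a hash, picked only as a familiar example of a hash code; nothing in the patch refers to it, and the function name fnv1a_32 is mine. The multiplication is expected to wrap modulo 2^32 on every step, and the algorithm is defined in terms of that wrapping.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a NUL-terminated string. Unsigned arithmetic in C is
   defined to wrap, and this code relies on that on purpose. */
uint32_t fnv1a_32(const char *s)
{
    uint32_t hash = 2166136261u;          /* FNV offset basis */
    while (*s != '\0') {
        hash ^= (unsigned char)*s++;      /* mix in the next byte */
        hash *= 16777619u;                /* FNV prime; wraps mod 2^32 */
    }
    return hash;
}
```

A checker that flags every unsigned wrap would report the multiplication here, which is exactly the "intentional or harmless" kind of false positive mentioned above.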

Now, ask yourself the following question: When was the last time you wrote a program that would get the right answer in the event of *unintended* wrapping? I suggest that the answer is likely to be "never!"  

Modular arithmetic must be available on request, but the *default* should never wrap.  Instead, wrapping or overflow should be considered a bug. There are plenty of better choices than wrapping: you could saturate to maxint or minint; you could take a fault or exception; you could use some integer equivalent of NaN. Any of these would be a better choice (in terms of semantics).
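
To make two of those alternatives concrete, here is a rough sketch of signed addition that saturates and signed addition that faults. Both function names (sat_add_int, trap_add_int) are hypothetical, and calling abort() is just one assumed way to "take a fault"; a real implementation might raise a signal or call a runtime handler instead. The NaN-like option would need an out-of-band flag or a reserved value, so it isn't shown.

```c
#include <limits.h>
#include <stdlib.h>

/* Saturating add: clamps to INT_MAX or INT_MIN instead of overflowing. */
int sat_add_int(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) {
        return INT_MAX;          /* would overflow upward */
    }
    if (b < 0 && a < INT_MIN - b) {
        return INT_MIN;          /* would overflow downward */
    }
    return a + b;
}

/* Faulting add: terminates the program instead of overflowing. */
int trap_add_int(int a, int b)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        abort();                 /* assumed fault mechanism for this sketch */
    }
    return a + b;
}
```

In both cases the range test happens before the addition, so the overflowing operation is never actually performed.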

I realize that this isn't how C-like languages are defined, and that the language standards certainly aren't going to change. So integer sanitizers like this one are likely to be the best available alternative.

Dean F. Sutherland
dsutherland at cert.org

P.S.  Back in the 90s, various commercial compiler vendors demonstrated that the total run-time overhead of checking for integer overflow, array bounds violations, and nil-pointer dereferences *combined* can be driven below 10%.  This requires optimization work specifically targeted at soundly removing unneeded checks (which consumes effort that otherwise might be spent on other optimizations), so it certainly isn't free. But it need not be prohibitively expensive for most code.