[cfe-dev] request for comments on patch: detecting integer undefined behaviors

Eli Friedman eli.friedman at gmail.com
Wed Aug 25 17:46:11 PDT 2010


On Tue, Aug 24, 2010 at 7:22 AM, John Regehr <regehr at cs.utah.edu> wrote:
> Hi Clang folks,
>
> Attached is my student Peng Li's patch to Clang for detecting integer
> undefined behaviors.  It will complain, for example, about C or C++ code
> that evaluates any of:
>
>  -INT_MIN
>  INT_MAX+1
>  2*INT_MAX
>  x>>-1
>  x/0
>
> These behaviors are undefined in all modern C/C++ variants.
> Additionally, C99 and C++0x make lots of harmless-looking signed
> left-shifts into undefined behavior. For example, 1<<31 is undefined
> when sizeof(int)==4 because the result cannot be represented as a signed
> 32-bit quantity. Peng's patch has separate flags for the basic checks
> and for the more aggressive C99/C++0x checks.
>
> It also takes a flag for whether to use explicit checks or LLVM's
> x.with.overflow intrinsics.  The intrinsics (off by default) are faster
> but not as well tested.
>
> We'd like to get this integrated into Clang; please let us know if that
> might happen and if so, what kind of changes we'll need to make.

Generally, I think this is a good idea to integrate into clang;

Some review comments:

1. Do we really need more than one -fcatch-undefined-behavior flag?
It seems like we can key the style of checks off of the current
language selection, and I can't really see any reason for wanting some
checks, but not others.
2. The driver changes for adding -ltrapub to the link line look
strange, but I don't know the code well enough to properly review
them.
3. Trying to do overflow checks with -fwrapv seems strange: the whole
point of -fwrapv is that signed overflow becomes well-defined.
4. Please avoid strncpy if it isn't really necessary; use StringRef
and/or std::string instead.
5. The code for EmitDiv and EmitRem can be refactored together.
6. Please get rid of UndefinedBehaviorAddCheck,
UndefinedBehaviorSubCheck, and UndefinedBehaviorMulCheck; if there are
issues, we'd prefer to fix them in the LLVM backend (I'll try to fix
PR7838 sometime in the near future).
7. "llvm::ConstantInt::get(CGF.Int32Ty, INT32_MAX, true);" is ugly,
there are APInt methods to do the same thing for any bitwidth.
8. Is there a good reason for some operations to trap on finding
undefined behavior, but not others?
9. Bitcasting __ub_trap_handler depending on the check seems bad; it
should have a single, fixed type.
10. The addition of SourceLocation::getLocationValue is unacceptable;
there are already APIs to access all the information, and the API
isn't very good.  In particular, using a char* as an out-parameter is
very unlike other LLVM code.

-Eli




More information about the cfe-dev mailing list