r254574 - PR17381: Treat undefined behavior during expression evaluation as an unmodeled

Mon Dec 7 18:51:35 PST 2015

On Mon, Dec 7, 2015 at 5:26 PM, Reid Kleckner via cfe-commits <
cfe-commits at lists.llvm.org> wrote:

> It wasn't Chrome, it was some internal dependency on torch-cephes. I
> submitted a patch for it upstream:
>
> https://github.com/deepmind/torch-cephes/commit/9c4a97c90dc200ecbecb883e7230fe3c847954df
> It's not a pretty, though.
>
> I know LLVM IR rules are not C++ rules, but LLVM generally believes in
> NaN. It will speculate things like fdiv above branches, leading to traps
> when people configure the FPU appropriately and then conditionally divide
> by zero. It would be more consistent for Clang to follow the LLVM direction
> here. The fact that C++ says that FP div-by-zero is UB might just be a bug
> in the standard, or a grey area that couldn't be filled in.
>

Right, I'm not suggesting we treat floating-point division by zero as "true
UB". But the fact we have a sanitizer for it means that our behavior is
not, and cannot be, "just give a NaN or Inf". It's a choice of either "give
me a NaN or Inf" or "give me a sanitizer diagnostic, followed by a NaN or
Inf". And that means it's not constant in the general case, because
evaluating it has runtime side-effects.

Prior to this patch, Clang would determine that such divisions had no
side-effects, and speculate them *and the corresponding sanitizer checks*,
resulting in sanitizer false positives on correct code.

> On Mon, Dec 7, 2015 at 2:37 PM, David Majnemer <david.majnemer at gmail.com>
> wrote:
>
>>
>>
>> On Mon, Dec 7, 2015 at 5:25 PM, Hans Wennborg <hans at chromium.org> wrote:
>>
>>> On Mon, Dec 7, 2015 at 2:14 PM, David Majnemer <david.majnemer at gmail.com>
>>> wrote:
>>> >
>>> >
>>> > On Mon, Dec 7, 2015 at 4:32 PM, Richard Smith via cfe-commits
>>> > <cfe-commits at lists.llvm.org> wrote:
>>> >>
>>> >> On Mon, Dec 7, 2015 at 7:25 AM, Joerg Sonnenberger via cfe-commits
>>> >> <cfe-commits at lists.llvm.org> wrote:
>>> >>>
>>> >>> On Thu, Dec 03, 2015 at 01:36:22AM -0000, Richard Smith via
>>> cfe-commits
>>> >>> wrote:
>>> >>> > Author: rsmith
>>> >>> > Date: Wed Dec  2 19:36:22 2015
>>> >>> > New Revision: 254574
>>> >>> >
>>> >>> > URL: http://llvm.org/viewvc/llvm-project?rev=254574&view=rev
>>> >>> > Log:
>>> >>> > PR17381: Treat undefined behavior during expression evaluation as
>>> an
>>> >>> > unmodeled
>>> >>> > side-effect, so that we don't allow speculative evaluation of such
>>> >>> > expressions
>>> >>> > during code generation.
>>> >>> >
>>> >>> > This caused a diagnostic quality regression, so fix constant
>>> expression
>>> >>> > diagnostics to prefer either the first "can't be constant folded"
>>> >>> > diagnostic or
>>> >>> > the first "not a constant expression" diagnostic depending on the
>>> kind
>>> >>> > of
>>> >>> > evaluation we're doing. This was always the intent, but didn't
>>> quite
>>> >>> > work
>>> >>> > correctly before.
>>> >>> >
>>> >>> > This results in certain initializers that used to be constant
>>> >>> > initializers to
>>> >>> > no longer be; in particular, things like:
>>> >>> >
>>> >>> >   float f = 1e100;
>>> >>> >
>>> >>> > are no longer accepted in C. This seems appropriate, as such
>>> constructs
>>> >>> > would
>>> >>> > lead to code being executed if sanitizers are enabled.
>>> >>>
>>> >>> This leads to some pretty annoying regressions as it now seems to be
>>> >>> impossible to use NaN or infinites as constant initializers.
>>> Expressions
>>> >>> like 0.0 / 0.0, 1.0 / 0.0 and -1.0 / 0.0 are perfectly well defined
>>> >>> under normal IEEE rules, so they shouldn't be rejected.
>>> >>
>>> >>
>>> >> Well, we have a problem. The evaluation semantics of these expressions
>>> >> requires code to execute in some build modes (in particular, with
>>> sanitizers
>>> >> enabled), and thus has a side-effect.
>>> >>
>>> >> I'm inclined to relax the restriction added in this change for the
>>> >> specific case of global variables in C, since (as you say) there is a
>>> fair
>>> >> amount of code using divide-by-zero as a "portable" way of generating
>>> an inf
>>> >> or nan.
>>> >>
>>> >>> Worse, it seems
>>> >>> even using __builtin_nan() for example doesn't work.
>>> >>
>>> >>
>>> >> __builtin_nan() works fine for me, can you provide a testcase?
>>> >>
>>> >>> I'm not even sure about the example given in the commit message, how
>>> >>> exactly is that undefined behavior?
>>> >>
>>> >>
>>> >> C11 6.3.1.5/1: "If the value being converted is outside the range of
>>> >> values that can be represented, the behavior is undefined."
>>> >
>>> >
>>> > I don't think we want to make the UB here true UB.  It would mean that
>>> code
>>> > which expected to get NaN might get undef, even outside of constant
>>> > expression evaluation.  The implementation defined behavior of
>>> providing NaN
>>> > seems more friendly...  IIRC, this broke Chrome recently because folks
>>> were
>>> > doing this in C++.  Hans, do you remember the details?
>>>
>>> Hmm, it doesn't ring a bell, but my memory sometimes fails me. Didn't
>>> you and Reid look at something like this the other day? (But maybe
>>> that was in internal code?)
>>>
>>
>> Ah, I think you are right... Reid, do you happen to recall?
>>
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20151207/1473f1a4/attachment-0001.html>