[llvm-dev] The undef story

Thu Jun 29 10:38:28 PDT 2017

Hi Peter,

On Thu, Jun 29, 2017 at 8:41 AM, Peter Lawrence
<peterl95124 at sbcglobal.net> wrote:
> Here’s another way to look at it, no one has ever filed a bug that reads
> “I used undefined behavior in my program, but the optimizer isn’t taking advantage of it”
> But if they do I think the response should be
> “you should not expect that, standard says nothing positive about what undefined behavior does”

Of course no one would file such a bug (since if your program has UB,
the first thing you do is fix your program).  However, there are
plenty of bugs where people complain about: "LLVM does not optimize my
(UB-free) program under the assumption that it does not have UB"
(which is what poison allows):

https://bugs.llvm.org/show_bug.cgi?id=28429
https://groups.google.com/forum/#!topic/llvm-dev/JGsDrfvS5wc
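
To make the kind of complaint concrete, here is a minimal sketch of a
UB-free function whose loop is easier to optimize if the compiler may
assume signed arithmetic never overflows.  The function name and the
loop are hypothetical and are not taken from the bugs linked above;
whether a given LLVM version actually performs a given transformation
here is not something this sketch claims.

```c
#include <stddef.h>

/* Hypothetical example: sums a[0], a[2], a[4], ... while i < n.
 *
 * With a stride of 2, "i += 2" could only wrap if n were close to
 * INT_MAX.  Signed overflow is UB in C, so a compiler may assume it
 * never happens, which lets it conclude that the loop terminates,
 * compute a trip count, and (on a 64-bit target) keep the index in a
 * single wide register instead of sign-extending i on every
 * iteration. */
long sum_even(const int *a, int n) {
    long total = 0;
    for (int i = 0; i < n; i += 2)
        total += a[i];
    return total;
}
```

A caller that passes a valid array and length never triggers UB, yet
still benefits from the compiler being allowed to reason as if
overflow cannot occur.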

> Once we have a self-consistent model for undef, we should be able to fix
> that. The user was confused, however, why seemingly innocuous changes to the
> code changed the performance characteristics of their application. The
> proposed semantics by John, et al. should fix this uniformly.
>
> In any case, to your point about:
>
>   if (a == a)
>     S;
>
>
> I have the same thought. If a == undef here, the code should be dead. Dead
> code must be aggressively dropped to enable inlining and further
> optimization. This is an important way we eliminate abstraction penalties.
> Dead code also has costs in terms of register allocation, speculative
> execution, inlining, etc.
>
>
> And yet  IIRC Sanjoy in his last email was arguing for consistent behavior
> in cases like
> if (x != 0) {
> /* we can optimize in the then-clause assuming x != 0 */
> }
> And in the case above when it is a function that gets inlined
>
> Here’s what Sanjoy said about the function-inline case
>
>> This too is fixed in the semantics mentioned in the paper.  This also
>> isn't new to us, it is covered in section 3.1 "Duplicate SSA Uses".
>
> So this issue seems to be up in the air

This issue is *not* up in the air -- the paper addresses this problem
in the new semantics in the way Hal described: since "if (poison ==
poison)" is explicitly UB in the new semantics, we will be able to
aggressively drop the comparison and everything that it dominates.
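
As a rough illustration of that rule, here is a toy model in C of
"comparison propagates poison, and branching on poison is UB".  This
is only a sketch under my own assumptions: the type Val and the
helpers concrete, poison, icmp_eq and br are hypothetical names
invented for this example, not LLVM APIs, and the model ignores
everything else in the proposed semantics.

```c
#include <stdbool.h>

/* Toy model: a value is either a concrete integer or poison. */
typedef struct {
    bool is_poison;
    int  value;      /* meaningful only when is_poison is false */
} Val;

Val concrete(int v) { Val r = { false, v }; return r; }
Val poison(void)    { Val r = { true, 0 };  return r; }

/* A comparison with a poison operand yields poison. */
Val icmp_eq(Val a, Val b) {
    if (a.is_poison || b.is_poison)
        return poison();
    return concrete(a.value == b.value);
}

typedef enum { BR_TAKEN, BR_NOT_TAKEN, BR_UB } BranchResult;

/* Branching on a poison condition is modeled as UB, so an optimizer
 * is free to treat that branch, and everything it dominates, as
 * unreachable. */
BranchResult br(Val cond) {
    if (cond.is_poison)
        return BR_UB;
    return cond.value ? BR_TAKEN : BR_NOT_TAKEN;
}
```

In this model br(icmp_eq(poison(), poison())) evaluates to BR_UB,
while br(icmp_eq(concrete(3), concrete(3))) evaluates to BR_TAKEN.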

> I've also seen cases where templated types are used with fixed-sized arrays
> where the compiler leveraged knowledge of UB on uninitialized values and
> out-of-bounds accesses to eliminate unnecessary parts of the code. In short,
> "optimizing on undefined behavior" can end up being an important tool.
>
>
> As you can tell from my first comments, I am not yet convinced, and would
> still like to see real evidence

I'm not sure why what Hal mentioned does not count as real evidence.
The things he mentioned are cases where "exploiting" undefined
behavior results in smaller code size and better performance.

-- Sanjoy