[llvm-dev] RFC: Killing undef and spreading poison
Peter Lawrence via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 8 10:52:52 PDT 2017
> On Jun 8, 2017, at 10:33 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Hi Peter,
>
> On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence
> <peterl95124 at sbcglobal.net> wrote:
>>
>>> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>>>
>>> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions.
>>
>> Nuno,
>> I’m not saying the operations can’t be moved.
>> I’m saying that once you move one, the ‘nsw’ attribute no longer applies
>> unless you can mathematically prove that it still does;
>> otherwise an “add nsw” has to be converted to a plain “add”.
>>
>> It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement
>> that subsequent transformations yield an end-to-end miscompilation in Sanjoy’s example.
>
> That would be correct (and we do this for some constructs: for
> instance when we have !dereferenceable attached to a load instruction
> we will strip the !dereferenceable when hoisting it out of control
> flow).
In other words, you are agreeing with me.
And once we’ve agreed on that, why do you insist on illegally hoisting the ‘nsw’
out of the if-statement, since adding “poison” is clearly not necessary (once
the ‘nsw’ is stripped off the multiply, the ‘sext’ can no longer be commuted)?
In other words, “poison” isn’t cake; it is a bandaid over an illegal transformation.
It has no benefit. Don’t make the illegal transformation and ‘poison’ isn’t necessary.
Peter Lawrence.
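
To make the 'sext' remark concrete, here is a small Python sketch (not LLVM code) that models i32 arithmetic with a plain wrapping multiply. The helper names wrap32, mul_wrap32 and sext_i32_to_i64 are made up for this illustration. It suggests why, without a no-overflow guarantee, sign-extending the narrow product is not the same as multiplying the sign-extended operands, so the commuting step depends on 'nsw':

```python
def wrap32(x):
    # Reduce a Python int to a two's-complement signed 32-bit value.
    return (x + 2**31) % 2**32 - 2**31

def mul_wrap32(a, b):
    # Plain i32 "mul" (no nsw): overflow wraps silently.
    return wrap32(a * b)

def sext_i32_to_i64(x):
    # Python ints carry their sign, so sign extension of an in-range
    # i32 value leaves the numeric value unchanged.
    assert -2**31 <= x < 2**31
    return x

def narrow_then_extend(a, b):
    # Models: sext(mul i32 a, b)
    return sext_i32_to_i64(mul_wrap32(a, b))

def extend_then_multiply(a, b):
    # Models: mul i64 (sext a), (sext b); two i32 inputs cannot overflow i64.
    return sext_i32_to_i64(a) * sext_i32_to_i64(b)
```

The two functions agree whenever the 32-bit product does not overflow (for example 3 and 4) and differ once it does (for example 65536 and 65536, where the narrow product wraps to 0).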
> However, with poison we want to have our cake and eat it too
> (perhaps eating is not the best analogy with poison :) ) -- we want to
> (whenever correct) exploit the fact that a certain operation does not
> overflow when possible even when hoisting it above control flow. For
> instance, if we have:
>
> if (cond) {
>   t = a +nsw b;
>   print(t);
> }
>
> Now if we hoist t out of the control block:
>
> t = a +nsw b;
> if (cond) {
>   print(t);
> }
>
> in the transformed program, t itself may sign overflow. In LLVM IR
> (or at least in the semantics we'd like), this has no correctness
> implications -- t becomes "poison" (which is basically deferred
> undefined behavior), and the program is undefined only if the
> generated poison value is used in a "side effecting" manner. Assuming
> that print is a "side effect", this means at print, we can assume t
> isn't poison (and thus a + b did not sign overflow). This is a weaker
> model than C/C++; and the difficult bits are getting the poison
> propagation rules correct, and to have a sound definition of a "side
> effect" (i.e. the points at which poison == deferred UB actually
> becomes UB).
>
>> I think the LLVM community in general has misunderstood and misused ‘nsw’; don’t you agree now?
>
> FYI, I think it is poor form to insinuate such things when you clearly
> haven't made an effort to dig back and understand all of the prior
> discussions we've had in this area (hint: we've discussed and
> explicitly decided to not implement the semantics you're suggesting).
> Of course, fresh ideas are always welcome but I suggest you start by
> first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf
> and some of the mailing list discussions we've had in the past on this
> topic.
>
> Thanks!
> -- Sanjoy
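
The hoisting example in the quoted message can be sketched as a toy Python model. This is only an illustration of the semantics described there, not LLVM's implementation: POISON, UndefinedBehavior, add_nsw and print_value are hypothetical names, and treating print as the only side effect is an assumption.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

class Poison:
    """Marker for a poison value (deferred undefined behavior)."""
    def __repr__(self):
        return "poison"

POISON = Poison()

class UndefinedBehavior(Exception):
    """Raised when poison reaches a side-effecting operation."""

def add_nsw(a, b):
    # Poison propagates; signed overflow also produces poison.
    if a is POISON or b is POISON:
        return POISON
    s = a + b
    return s if INT32_MIN <= s <= INT32_MAX else POISON

def print_value(t, out):
    # Modeled as the point where deferred UB becomes actual UB.
    if t is POISON:
        raise UndefinedBehavior("poison reached a side effect")
    out.append(t)

def original(cond, a, b):
    # The add stays inside the control block.
    out = []
    if cond:
        t = add_nsw(a, b)
        print_value(t, out)
    return out

def hoisted(cond, a, b):
    # The add is hoisted; t may be poison but is harmless unless used.
    out = []
    t = add_nsw(a, b)
    if cond:
        print_value(t, out)
    return out
```

Under this model, original and hoisted behave identically: with cond false the possibly-poison t is never observed, and with cond true both either record the sum or hit undefined behavior.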