[llvm-dev] RFC: Killing undef and spreading poison

Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 8 10:33:59 PDT 2017

Hi Peter,

On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence
<peterl95124 at sbcglobal.net> wrote:
>> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions.
> Nuno,
>           I’m not saying the operations can’t be moved,
> I’m saying that once you move one the ‘nsw’ attribute no longer applies,
> unless you can mathematically prove that it still does,
> otherwise an “add nsw” has to be converted to plain “add”.
> It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement
> that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example.

That would be correct (and we do this for some constructs: for
instance when we have !dereferenceable attached to a load instruction
we will strip the !dereferenceable when hoisting it out of control
flow).  However, with poison we want to have our cake and eat it too
(perhaps eating is not the best analogy with poison :) ) -- we want to
exploit, whenever it is correct to do so, the fact that a certain
operation does not overflow, even after hoisting it above control
flow.  For instance, if we have:

if (cond) {
  t = a +nsw b;
  print(t);
}

Now once we hoist t out of the control block:

t = a +nsw b;
if (cond) {
  print(t);
}

in the transformed program, t itself may sign overflow.  In LLVM IR
(or at least in the semantics we'd like), this has no correctness
implications -- t becomes "poison" (which is basically deferred
undefined behavior), and the program is undefined only if the
generated poison value is used in a "side effecting" manner.  Assuming
that print is a "side effect", this means at print, we can assume t
isn't poison (and thus a + b did not sign overflow).  This is a weaker
model than C/C++; the difficult bits are getting the poison
propagation rules correct and having a sound definition of a "side
effect" (i.e. the points at which poison == deferred UB actually
becomes UB).
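
To make the hoisting argument concrete, here is a toy Python model of the
semantics sketched above.  It is only an illustration, not how LLVM
implements anything: the names (POISON, add_nsw, side_effect_print,
original, hoisted, UndefinedBehavior) are hypothetical, a 32-bit integer
width is assumed, and "UB" is modelled as a Python exception.

```python
# Toy model: "add nsw" yields poison on signed overflow, and poison only
# turns into UB when it reaches a side effect (here: print).
# All names are illustrative assumptions; a 32-bit width is assumed.

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class Poison:
    """Marker for a poison value (deferred undefined behavior)."""

    def __repr__(self):
        return "poison"


POISON = Poison()


class UndefinedBehavior(Exception):
    """Stands in for actual UB in this toy model."""


def add_nsw(a, b):
    # Poison propagates through the add; overflow produces poison,
    # not immediate UB.
    if a is POISON or b is POISON:
        return POISON
    result = a + b
    if result < INT32_MIN or result > INT32_MAX:
        return POISON
    return result


def side_effect_print(value, out):
    # The point where deferred UB becomes UB.
    if value is POISON:
        raise UndefinedBehavior("poison reached a side effect")
    out.append(value)


def original(cond, a, b, out):
    # The add sits inside the control block.
    if cond:
        t = add_nsw(a, b)
        side_effect_print(t, out)


def hoisted(cond, a, b, out):
    # The add is hoisted above the branch and keeps its nsw flag.
    t = add_nsw(a, b)
    if cond:
        side_effect_print(t, out)
```

In this model, hoisted(False, INT32_MAX, 1, out) computes a poison t but
never uses it, so it behaves exactly like original(False, ...).  When
cond is true and the add overflows, both versions hit UB at print, so
the hoist introduces no new undefined executions.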

> I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ?

FYI, I think it is poor form to insinuate such things when you clearly
haven't made an effort to dig back and understand all of the prior
discussions we've had in this area (hint: we've discussed and
explicitly decided to not implement the semantics you're suggesting).
Of course, fresh ideas are always welcome but I suggest you start by
first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf
and some of the mailing list discussions we've had in the past on this
topic.

-- Sanjoy
