[llvm-dev] The nsw story revisited

Peter Lawrence via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 28 09:39:13 PDT 2017


Preface: This paper shows that "poison" was never actually necessary
in the first place. The existence of "poison" is based on incorrect
assumptions that are being explored for the first time.



I have been re-reading Dan Gohman's original post "the nsw story" [1]
and have come to the conclusion that Dan got it wrong in some respects.

He came up with "no signed wrap" ("nsw") which in his words means
"The purpose of the nsw flag is to indicate instructions which are
known to have no overflow".  The key operative word here is "known".

Taken literally, this means an operation with an additional "llvm.assume".
For example, (a +nsw b), when a and b are i32, is really

  (llvm.assume(INT_MIN <= (i64)a + (i64)b  &&  (i64)a + (i64)b <= INT_MAX),
   (a + b))

It is this "assume" that justifies the transformation that Dan does
to optimize sign-extension of i32 induction variables out of loops
on LP64 targets.  So far so good.
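
As a rough illustration of the "operation plus assume" reading above, here
is a minimal C sketch. The names sum_fits_i32 and add_nsw_model are
hypothetical, and assert() is only a stand-in for llvm.assume (it checks
at run time, whereas an assume merely informs the optimizer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a + b, computed exactly in 64 bits,
   still fits in the i32 range. */
static bool sum_fits_i32(int32_t a, int32_t b) {
    int64_t wide = (int64_t)a + (int64_t)b;
    return wide >= INT32_MIN && wide <= INT32_MAX;
}

/* Model of (a +nsw b): first the assumption, then a plain add.
   assert() stands in for llvm.assume; it is not the real intrinsic. */
static int32_t add_nsw_model(int32_t a, int32_t b) {
    assert(sum_fits_i32(a, b));
    /* Under the assumption the 64-bit sum is in range, so this
       conversion does not change the value. */
    return (int32_t)((int64_t)a + (int64_t)b);
}
```

In this model the question "what does the add return on overflow" never
comes up, because the add is only reached when the assumption holds.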

Note that there is no "undef" in the IR either before or after the
transform; the optimization does not simply fall out of a clever
definition of IR "undef".

Note that even the concept of "undefined" never enters into the 
justification, only that "nsw" ==> "assume" ==> the loop iteration
bounds don't wrap ==> i64 arithmetic will generate the exact same
iterations as i32 arithmetic ==> the induction variable can be
promoted ==> there is no longer any sign-extend inside the loop.
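
The chain of implications above can be sketched in C as a loop written
twice, once with an i32 induction variable and once with the promoted i64
one. The function names and the strided-sum loop body are assumptions made
for illustration, not Dan's original example, and the sketch assumes
stride > 0.

```c
#include <stdint.h>

/* i32 induction variable: on an LP64 target every use of i as an
   index needs a sign-extend to 64 bits inside the loop. */
static int64_t sum_strided_i32(const int32_t *p, int32_t n, int32_t stride) {
    int64_t total = 0;
    for (int32_t i = 0; i < n; i += stride)
        total += p[i];
    return total;
}

/* Promoted i64 induction variable: no sign-extend inside the loop.
   This visits the same indices as long as "i += stride" never wraps
   in i32, which is what the nsw assumption provides. */
static int64_t sum_strided_i64(const int32_t *p, int32_t n, int32_t stride) {
    int64_t total = 0;
    for (int64_t i = 0; i < (int64_t)n; i += (int64_t)stride)
        total += p[i];
    return total;
}
```

Whenever the i32 increment does not wrap, both loops visit the same
indices in the same order, so replacing one with the other is justified
by the assumption alone.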

Note that clang can generate "+nsw" for signed "+" regardless
of whether the precise C standard wording is "undefined behavior"
or more simply "unspecified value".



Where Dan goes wrong is in thinking that "+nsw" is an operation
rather than an operation with an assume, and therefore he feels
the need to define what the result of this operation is when it
overflows, which then seems to require a new "poison" instruction
because "undef" isn't good enough.

But there is no need to ask what the result of overflow is
because "+nsw" is like a "+" inside of an if-statement whose
condition precludes overflow, and if it can't overflow then
asking about it is a non sequitur.

And speculatively hoisting the "+nsw" doesn't cause problems
because hoisting a "+nsw" is like taking a "+" outside of the
if-statement that guarantees no overflow, it is then simply
a plain old un-attributed "+" operation which has no undefined
behavior.
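
The two paragraphs above can be sketched in C as a guarded add and its
hoisted counterpart. All names here are hypothetical. The hoisted add is
modelled as a two's-complement wrapping add done on unsigned values, since
a plain signed "+" that overflows is itself undefined in C, unlike the
un-attributed IR "+" being described.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical guard: true when a + b does not overflow i32. */
static int no_overflow_i32(int32_t a, int32_t b) {
    int64_t wide = (int64_t)a + (int64_t)b;
    return wide >= INT32_MIN && wide <= INT32_MAX;
}

/* Plain, un-attributed "+": wraps modulo 2^32 and is defined for
   every input. memcpy reinterprets the bits as int32_t. */
static int32_t wrapping_add_i32(int32_t a, int32_t b) {
    uint32_t bits = (uint32_t)a + (uint32_t)b;
    int32_t out;
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* The "+nsw" view: the add sits inside the guard, so it cannot overflow. */
static int32_t guarded_add(int32_t a, int32_t b, int32_t fallback) {
    if (no_overflow_i32(a, b))
        return a + b;
    return fallback;
}

/* Hoisted view: the add runs unconditionally as a plain wrapping "+",
   and its result is only used when the guard holds. */
static int32_t hoisted_add(int32_t a, int32_t b, int32_t fallback) {
    int32_t sum = wrapping_add_i32(a, b);
    return no_overflow_i32(a, b) ? sum : fallback;
}
```

In this sketch guarded_add and hoisted_add agree on every input, because
the wrapped value is computed but never observed when the guard fails.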

Dan's follow-up email "nsw is still inconsistent" [2] shows by
example why it is illegal to hoist the "nsw" attribute along 
with the "+" operation.

It therefore makes no sense to discuss the result of "+nsw" as
ever being either "undef" or "poison", and so the need for "poison"
is gone.



Here's what Dan thought at the time about the creation of "poison":

        "I wrote up a description of this concept, and it's been in
        LangRef ever since. It sticks out though, because it got pretty
        big and complex, especially in view of its relative obscurity.
        Realistically speaking, it's probably not fully watertight yet."

I agree with Dan here that "it's probably not fully watertight yet", and
apparently other folks agree, because yet another instruction,
"freeze", is being proposed to fix the problems with "poison".  My guess
is that "freeze" is "probably not fully watertight yet" either. But
since "poison" isn't needed, it is time to delete it from the LangRef,
and we can therefore stop considering "freeze".


Peter Lawrence.


References

[1] Dan Gohman, "the nsw story", llvm-dev, Tue Nov 29 15:21:58 PST 2011
[2] Dan Gohman, "nsw is still inconsistent", llvm-dev, Mon Dec 12 12:58:31 PST 2011
