[llvm-dev] the nsw story, revisited

Wed Jun 14 15:13:46 PDT 2017

I guess an important thing to realize is that poison flows throughout the 
data-flow graph; it's not something that you can control locally.  Simply 
removing nsw from an operation doesn't mean that its result will be 
non-poisonous.
Right now there's no way to stop propagation of poison; freeze adds that 
functionality.

Imagine you want to hoist an 'add nsw', drop nsw, and then do loop 
unswitching (your suggestion):

f(a, b) {
  while (..) {
    if (a +nsw 1 > b) { S } else { T }
  }
}

=>

f(a, b) {
  // drop nsw
  if (a + 1 > b) {
     while (...) S;
  else
     while (...) T;
  }
}

Is this correct? No!  Dropping nsw doesn't give any guarantee.  To see why, 
imagine that function f is now inlined in g:
g() {
  f(poison, poison);
}

after inlining:
g() {
  if (poison) ...
}

We've just introduced a branch on poison when the loop condition is false. 
Since we've decided to make branch-on-poison UB, the transformation we did 
above is incorrect.  Dropping nsw doesn't help. You need a way to stop 
poison from being propagated.  LLVM currently doesn't have this mechanism.

Nuno

P.S.: Please fix your address book entry for my email address; a colleague 
of mine is receiving your emails not me.

-----Original Message----- 
From: Peter Lawrence via llvm-dev
Sent: Wednesday, June 14, 2017 9:23 PM
To: llvm-dev
Subject: Re: [llvm-dev] the nsw story, revisited

John,

Sanjoy,
Nuno,
David,
         Thanks for the tip, below are the relevant posts from the archives.

I am suggesting something similar to Dan's third option below (Tue Nov 29 
2011
"the nsw story”, Dan Gohman), when hoisting an instruction with ‘nsw’ it 
looses
the ‘nsw’ attribute, but without saying “add-nsw is a fully side-effecting
instruction”.

This option was back then seen by Dan as too much effort, but the current
proposal requires at least the same amount of effort. To be specific the
current proposal requires adding “freeze” instead of removing “nsw”. It is
pretty much the same amount of editing work on the compiler sources either
way as far as add/sub/mul/shl are concerned.

The difference is that removing “nsw” from an operation is simpler and 
cleaner
than adding “freeze” to it, and in software engineering we are always 
striving
for the simplest and cleanest solution.

IIUC option three does not inhibit Dan's original goal of promoting 32-bit
induction variables to 64-bit on an LP64 target. And I don’t agree with 
Dan's
description of this being “suboptimal” for any other reason, the lost
information does not seem to be useable AFAICT, but if you disagree then 
simply
insert an “llvm.assume" in the spot where the operation is being hoisted 
from.

So I propose that we revisit this option.  It was never fully explored in 
Dan’s
original email, nor in any subsequent emails, and now especially it seems to
deserve it.

Thoughts ?
Comments ?
Questions ?

Peter Lawrence.

2011: 
=========================================================================

------------------------------------------------------------------------
Mon Dec 12 2011  "nsw is still logically inconsistent" (Dan Gohman)

# Dan provides this example
# i32 a,b; i64 c,d,e,f;
#
# if (cond) {                // assume cond prevents signed wrap
#   c = sext64( a +nsw b )   // upper 33-bits all 1's or all 0's
#   d = (c >> 31) + 1        // 0 or 1 result
#   e = d <u 2               // 1 result
#   f = 1 / e                // no possible divide-by-zero
# }
# the problem is when the compiler tries to speculatively hoist everything
# out of the if-cond. now when (a +nsw b) does overflow wrap, the
# promotion of a and b to i64 means sext64(a) + sext64(b) can evaluate
# to something whose upper 33-bits aren't all 1's or all 0's and we can
# end up with divide-by-zero.
#
# (remember that nsw was introduced to allow promotion to i64 to
# allow eliminating sext64 on LP64 targets, they haven't been
# eliminated in this example because the actual goal is just to
# hoist the sext64 out of a loop which isn't shown for simplicity)

# ===> the thread is left unresolved, whatever llvm is actually
#      doing to resolve the problem isn't documented here <===

------------------------------------------------------------------------
Thu Dec  1 2011  "the nsw story" --- continued

# Dan shows this argument against option 1 (“undef" as alt to "poison")
# but its still not clear how "poison" solves the problem

int a = INT_MAX, b = 1;
long c = (long)(a + b);

What is the value of c, on an LP64 target?

By standard C rules, the add overflow invokes immediate undefined behavior 
of course.

If a and b are promoted to 64-bit, c is 0x0000000080000000.

In a world where signed add overflow returns undef, c could be any of
  0x0000000000000000
  0x0000000000000001
  0x0000000000000002
  ...
  0x000000007fffffff
or
  0xffffffff80000000
  0xffffffff80000001
  0xffffffff80000002
  ...
  0xffffffffffffffff
however it can't ever be 0x0000000080000000, because there's no 32-bit value
the undef could take which sign-extends into that 64-bit value.

Therefore, if signed add overflow returns undef, promoting such 32-bit 
variables to
64-bit variables is not a behavior-preserving transformation.

------------------------------------------------------------------------
Tue Nov 29 2011  "the nsw story" (Dan Gohman)

# starts by describing sign-extension on LP64 targets as the
# motivation for "nsw", in that in many situations a 32-bit add-nsw
# can be promoted to a 64-bit add, eliminating the sext operation,
# or at least hoisting it out of a loop where it can be expensive

# could also be titled "the poison story",
# because it ends with this list of alternatives to "poison"
# none of which seem to be acceptable to Dan, but I'm still
# not sure how "poison" solves the LP64 sext problem

# here are Dan's alternatives to "poison"

A natural reaction to this problem is to think that LLVM IR is so nice
and pretty that naturally there must be a nice and pretty solution. Here
are some alternatives that have been considered:

- Go back to using undef for overflow. There were no known real-world
   bugs with this. It's just inconsistent.

- Define add nsw as a fully side-effecting operation, and accept the
   limits on code motion that this implies. However, as LLVM starts doing
   profile-guided optimizations, and starts thinking about more diverse
   architectures, code speculation will likely become more important.

- Define add nsw as a fully side-effecting operation, and teach
   optimization passes to strip nsw when moving code past control
   boundaries. This is seen as suboptimal because it prevents subsequent
   passes from making use of the nsw information. And, it's extra work
   for optimization passes.

- Instead of trying to define dependence in LangRef, just say that if
   changing the value returned from an overflowing add nsw would
   affect the observable behavior of the program, then the behavior of
   the program is undefined. This would reduce the amount of text in
   LangRef, but it would be a weaker definition, and it would require
   sign-extension optimizers and others to do significantly more work
   to establish that their transformations are safe.

- Give up on nsw and have compilers emit warnings when they are unable
   to perform some optimization due to their inability to exclude the
   possibility of overflow. Obviously the warnings would not be on by
   default, or even -Wall or probably even -Wextra, so -Werror users need
   not revolt. Many people are often faced with code that they cannot
   modify for any number of reasons, and would not be able to make use
   of such warnings. It's an interesting tradeoff, but it's unpopular.

Unfortunately, none of these are completely nice and pretty. Because of
this, and because most people don't care, nsw with all its conceptual
complexity has survived.

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev