[LLVMdev] The nsw story

Thu Dec 1 09:32:55 PST 2011

Dan Gohman <gohman at apple.com> writes:

> The presence of this poison value means that undefined behavior has been
> invoked on a potentially speculative code path. The consequences of the
> undefined behavior are deferred until that code path is actually used
> on some non-speculative path.

This is exactly what was done on Itanium and its follow-ons to handle
speculative loads.  It's a perfectly reasonable approach IMHO in its
basic formulation.
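
To make the basic formulation concrete, here is a minimal C sketch of
a deferred-fault value. It is only an illustration: the spec_i32 type
and the spec_add_nsw and commit helpers are hypothetical names made up
for this example, not anything in LLVM or in the Itanium ISA. The
speculative add never faults; it only sets and propagates a poison
bit, and the failure surfaces when the value is committed on a
non-speculative path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a 32-bit value plus a deferred-fault ("poison") bit. */
typedef struct {
    int32_t value;
    bool poison;
} spec_i32;

/* Speculative add with no-signed-wrap semantics. It never traps; on
   signed overflow, or if either input is already poisoned, it just
   marks the result as poisoned. */
spec_i32 spec_add_nsw(spec_i32 a, spec_i32 b) {
    int64_t wide = (int64_t)a.value + (int64_t)b.value;
    spec_i32 r;
    r.poison = a.poison || b.poison || wide > INT32_MAX || wide < INT32_MIN;
    r.value = r.poison ? 0 : (int32_t)wide;
    return r;
}

/* Non-speculative use. Returns false where the real program would have
   undefined behavior; otherwise stores the value and returns true. */
bool commit(spec_i32 v, int32_t *out) {
    if (v.poison)
        return false;
    *out = v.value;
    return true;
}
```

In this model every committed use is treated the same way, which is
the property that keeps the basic formulation simple to reason about.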

The complication you introduced is to go beyond the basic formulation
and consider certain kinds of uses "special" in that they don't trigger
undefined behavior.  This leads to the analysis problems you described.

>  - Go back to using undef for overflow. There were no known real-world
>    bugs with this. It's just inconsistent.

I have always considered "undef" to mean "any possible value" so I don't
see a real problem with promoted integers containing a value that can't
fit in the original type size.  The value in the original type size is
undetermined by the definition of "undef" so who cares what we choose as
the "real" value?
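
As a rough illustration of what I mean by a promoted value that can't
fit, here is a small C sketch. It assumes an 8-bit add that has been
promoted to 32 bits; promoted_add_i8 and fits_i8 are hypothetical
helper names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* An 8-bit add performed in a 32-bit register after promotion. */
int32_t promoted_add_i8(int8_t a, int8_t b) {
    return (int32_t)a + (int32_t)b;
}

/* Does a promoted value still fit in the original 8-bit type? */
bool fits_i8(int32_t v) {
    return v >= INT8_MIN && v <= INT8_MAX;
}
```

With inputs 100 and 100 the promoted result is 200, which is outside
the 8-bit range. If the overflowed 8-bit result is "undef", then any
value is acceptable there, so the wide register holding 200 does not
contradict anything.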

That said, there are a couple of times I ran into situations where LLVM
assumed "undef" somehow restricted the possible set of values it could
represent.  As I recall I ran into this in instcombine and dagcombine.
I will have to go back and search for the details.  In any event, I
think these things are bugs in LLVM.

So "undef" seems like a reasonable approach to me.

>  - Define add nsw as a fully side-effecting operation, and accept the
>    limits on code motion that this implies. However, as LLVM starts doing
>    profile-guided optimizations, and starts thinking about more diverse
>    architectures, code speculation will likely become more important.

It will indeed.  I don't like this approach.

>  - Define add nsw as a fully side-effecting operation, and teach
>    optimization passes to strip nsw when moving code past control
>    boundaries. This is seen as suboptimal because it prevents subsequent
>    passes from making use of the nsw information. And, it's extra work
>    for optimization passes.

Agreed.

>  - Instead of trying to define dependence in LangRef, just say that if
>    changing the value returned from an overflowing add nsw would
>    affect the observable behavior of the program, then the behavior of
>    the program is undefined.

Isn't that exactly what the C standard says?  Now, LLVM handles more
than C so we need to consider that.  Personally, I agree that this is a
less than satisfying solution.

Are you mainly worried about the promoted value being cast back to
the original type?

>  - Give up on nsw and have compilers emit warnings when they are unable
>    to perform some optimization due to their inability to exclude the
>    possibility of overflow.

This would be very bad.  We groan every time we see "unsigned" used as a
loop index because it seriously degrades dependence analysis and
vectorization for exactly this reason.
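
A small C sketch of why the unsigned case hurts, assuming a 32-bit
unsigned int and 64-bit addressing (wrapped_index and affine_index are
hypothetical names for this example). Unsigned addition is defined to
wrap, so the index has to be computed in 32 bits before it is widened,
and the result is not a simple affine function of i.

```c
#include <limits.h>

/* What the language requires for an unsigned index: i + k wraps
   modulo UINT_MAX + 1 first, and only then is widened. */
unsigned long long wrapped_index(unsigned i, unsigned k) {
    return (unsigned long long)(i + k);
}

/* What dependence analysis would like to reason about: widen first,
   then add, so the index is simply i + k with no wrap. */
unsigned long long affine_index(unsigned i, unsigned k) {
    return (unsigned long long)i + (unsigned long long)k;
}
```

The two functions agree unless i + k wraps. With a signed index the
compiler may assume the wrap never happens and use the second form;
with an unsigned index it has to prove that, which it often cannot.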

                               -Dave