[LLVMdev] nsw is still logically inconsistent

Wed Dec 14 09:11:31 PST 2011

Dan Gohman <gohman at apple.com> writes:

> Next, we perform a promotion transformation, converting the add nsw
> from i32 to i64:
>
>   %s0 = sext i32 %a to i64
>   %s1 = sext i32 %b to i64
>   %t0 = add nsw i64 %s0, %s1
>   %t2 = ashr i64 %t0, 31
>   %t3 = add i64 %t2, 1
>   %t5 = icmp ult %t3, 2
>   %t6 = udiv i1 1, %t5
>   br i1 %overflow_check, label %no_overflow, label %end
>
> no_overflow:
>
> Was this valid?
>
> Any time the new i64 add would produce a different value than the
> original sext would have, it would be a case where the 32-bit add
> had an overflow. The nsw says that the program would have undefined
> behavior in that case if the result is used, so this should be ok.
>
>
> Unfortunately, the final code is actually broken. If adding %a
> and %b as i32 values would overflow, adding them as i64 values
> would produce a value for which bit 31 is not equal to bit 32, so
> the subsequent ashr would produce a value which is not -1 or 0, the
> subsequent add would produce a value which is not 0 or 1, and the
> icmp ult would return 0, and the udiv would have undefined
> behavior. And it's unconditional.

I'm not following.  If the promotion to i64 produces a different value,
then the nsw semantics were violated, leading to undefined behavior, as
you note.  At that point all bets are off.  Divide by zero certainly
is a perfectly valid expression of undefined behavior.  If we had a
delayed check we would have to put it somewhere before the udiv.  We
would probably need some kind of fixup path _a_la_ IA64's check
instruction.
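
To make the arithmetic in the quoted sequence concrete, here is a rough
Python model of it.  This is only a sketch: the helper names
(to_signed, overflow_check_bit, i32_add_overflows) are made up for
illustration, and Python integers stand in for the i64 values.  It
shows that %t5 comes out 0 exactly when the original i32 add would
have overflowed, which is the case where the udiv divides by zero.

```python
MASK64 = (1 << 64) - 1

def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def overflow_check_bit(a, b):
    """Model %t5 from the quoted sequence for i32 inputs a and b."""
    s0 = to_signed(a, 32)      # sext i32 %a to i64
    s1 = to_signed(b, 32)      # sext i32 %b to i64
    t0 = s0 + s1               # add nsw i64: cannot overflow 64 bits here
    t2 = t0 >> 31              # ashr i64 %t0, 31 (Python >> is arithmetic)
    t3 = (t2 + 1) & MASK64     # add i64 %t2, 1, viewed as unsigned
    return 1 if t3 < 2 else 0  # icmp ult %t3, 2

def i32_add_overflows(a, b):
    """True when adding a and b as signed 32-bit values would overflow."""
    s = to_signed(a, 32) + to_signed(b, 32)
    return not -(1 << 31) <= s <= (1 << 31) - 1
```

For example, overflow_check_bit(2**31 - 1, 1) yields 0, so the udiv
would be "udiv i1 1, 0", while overflow_check_bit(1, 2) yields 1.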

The whole point of this is to be able to hoist nsw operations.  But any
form of static speculation will require fixup code in the cases where
the speculation is wrong.  There really is a cost to hoisting code with
potential side effects across branches.  I don't see any way to get
around that.
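
As a rough illustration of what I mean by a delayed check with a fixup
path, here is a small Python sketch.  It is not how LLVM or IA64
actually implements anything; all of the names (add_nsw_checked,
add_nsw_speculative, hoisted) are hypothetical, and an OverflowError
stands in for the undefined behavior.  The speculative add never
faults and just records a deferred flag; the check sits before the
first real use and falls back to the non-speculative operation.

```python
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1

def wrap32(x):
    """Wrap an integer to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def add_nsw_checked(a, b):
    """Non-speculative add: overflow is reported at once (stands in for UB)."""
    s = a + b
    if not INT32_MIN <= s <= INT32_MAX:
        raise OverflowError("signed overflow in add nsw")
    return s

def add_nsw_speculative(a, b):
    """Speculative add: never faults, returns (wrapped value, deferred flag)."""
    s = a + b
    return wrap32(s), not INT32_MIN <= s <= INT32_MAX

def hoisted(a, b, take_branch):
    """Add hoisted above a branch, with the check delayed until the use."""
    value, deferred = add_nsw_speculative(a, b)  # hoisted above the branch
    if not take_branch:
        return None                              # result never used: no fault
    if deferred:                                 # delayed check, before the use
        value = add_nsw_checked(a, b)            # fixup path: redo it for real
    return value
```

With this shape, hoisted(INT32_MAX, 1, False) returns without
complaint because the result is never used, while
hoisted(INT32_MAX, 1, True) hits the fixup path and raises.  The extra
check and fixup code are the cost of the hoisting.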

                               -Dave