[llvm-dev] Improving SCEV's behavior around IR level no-wrap flags
Sanjoy Das via llvm-dev
llvm-dev at lists.llvm.org
Fri Sep 23 19:36:20 PDT 2016
Hi Christof,
Christof Douma wrote:
> As far as I understand this, the reasoning is that in SCEV the flags
> must be interpreted as UB/not-allowed to happen because of SCEV
> expressions can represent instructions in different contexts (e.g. in
> different execution paths). In the IR these flags mean poison. This is
> a semantical mismatch that gives a big proof burden on SCEV.
>
> To give an example (test/Analysis/ScalarEvolution/nsw.ll - addnsw):
>
> 01: entry:
> 02: %tmp = add i32 %a, %b
> 03: %cmp = icmp sgt i32 %tmp, 0
> 04: br i1 %cmp, label %greater, label %exit
> 05: greater:
> 06: %tmp2 = add nsw i32 %a, %b
> 07: br label %exit
> 08: exit:
> 09: %result = phi i32 [ %a, %entry ], [ %tmp2, %greater ]
> 10: ret i32 %result
>
> In the following example line 2 and 6 are mapped to a single SCEV
> expressions: "(%a + %b)”. The SCEV version of ‘NSW' does not hold
> because at line 2 the instruction might actually wrap around.
>
> Up till now I had not realised that sharing of SCEV expressions was
> the only reason for SCEV to have this different semantics.
Yeah, that's how poison is intended to work. The branch adds nothing,
you have the same problem with
> 01: %tmp = add i32 %a, %b
> 02: %cmp = icmp sgt i32 %tmp, 0
> 03: %tmp2 = add nsw i32 %a, %b
However, lack of context sensitivity does mean that in
%t0 = add i32 %a, 1
if (%a s< 100)
%t1 = add i32 %a, 1
SCEV cannot mark the expression for %t1 as nsw, even though that
addition won't sign-overflow; since, as you said, both %t0 and %t1
will be mapped to the same SCEV expression, in the current scheme as
well as in the new scheme I'm proposing.
In the new scheme, there is a solution for cases like the above: teach
instcombine or LVI etc. to transform that IR to
%t0 = add i32 %a, 1
if (%a s< 100)
%t1 = add nsw i32 %a, 1
and then let the normal logic (that will be newly added in SCEV) kick
in.
> I can’t help to ask. Why not define a wrapping nsw instruction as
> UB, instead of “delayed UB” aka poison? I believe we have the notion
> of poison solely to ease the movement of instructions. In my example
> above line 6 can be moved to line 2 due to poison. If the wrapping was
> UB such a move is still possible but would require dropping the nsw
> flag, nothing more. If this information loss is considered a problem,
> the use of llvm.assume can counter that.
>
> Has somebody ever looked at not having poison at all? What are the
> downsides of that?
That's just the way things are at the moment.
We _could_ change that (and make overflowing nsw/nuw operations
undefined behavior) but since that is a very deep change to LLVM IR
(basically arithmetic will be sometimes non-speculatable), it has a very
high bar for acceptance.
-- Sanjoy
More information about the llvm-dev
mailing list