[PATCH] ScalarEvolution incorrectly assumes that the start of certain add recurrences don't overflow

Sanjoy Das sanjoy at playingwithpointers.com
Sat Feb 7 01:20:23 PST 2015


> 1. For now, let's stick with your strongest interpretation, and assume the local check is sufficient.
> 2. Add a loud WARNING comment explaining that the check is only correct if we assume the overflow cases of nsw are unreachable.

I'd consider this approach somewhat risky -- the rest of LLVM is
allowed to do transforms that break this assumption (i.e. they assume
arithmetic is not side-effecting).  Consider a slightly modified
fragment of the test case with this patch (differences commented):

define i64 @foo(i32 %start, i32 %low.limit, i32 %high.limit, i32* %f) {
 entry:
  %postinc.start = add i32 %start, 1
  br label %loop

 loop:
  %idx = phi i32 [ %start, %entry ], [ %idx.inc, %continue ]
  %postinc = phi i32 [ %postinc.start, %entry ], [ %postinc.inc, %continue ]
  %postinc.inc = add nsw i32 %postinc, 1
  %postinc.sext = sext i32 %postinc to i64
  %break.early = icmp slt i32 %postinc, %low.limit
  store volatile i32 %idx, i32* %f ;; something to hold %idx in place
  br i1 %break.early, label %continue, label %early.exit

 continue:
  %idx.inc = add nsw i32 %idx, 1
  br i1 %break.early, label %loop, label %exit ;; repeated use of %break.early

 exit:
  ret i64 0

 early.exit:
  ret i64 %postinc.sext
}

For the exact same situation / reason as in the test case,
SCEV(%postinc)->getStart() is not nsw.

Now if I run this through 'opt -simplifycfg -loop-rotate' I get

define i64 @foo(i32 %start, i32 %low.limit, i32 %high.limit, i32* %f) {
entry:
  %postinc.start = add i32 %start, 1
  br label %loop

loop:                                             ; preds = %loop, %entry
  %idx = phi i32 [ %start, %entry ], [ %idx.inc, %loop ]
  %postinc = phi i32 [ %postinc.start, %entry ], [ %postinc.inc, %loop ]
  %postinc.inc = add nsw i32 %postinc, 1
  %postinc.sext = sext i32 %postinc to i64
  %break.early = icmp slt i32 %postinc, %low.limit
  store volatile i32 %idx, i32* %f
  %idx.inc = add nsw i32 %idx, 1
  br i1 %break.early, label %loop, label %early.exit

early.exit:                                       ; preds = %loop
  %postinc.sext.lcssa = phi i64 [ %postinc.sext, %loop ]
  ret i64 %postinc.sext.lcssa
}

`L->getExitingBlock() == L->getLoopLatch()` is true for the loop and
we'll trigger the exact same bug even if we add that additional
constraint.  This happens with the optimization removed (i.e. with
this present version of the change applied) so there is no circular
reasoning here.

(Btw, I've assumed Andy really meant `L->getExitingBlock() ==
L->getLoopLatch()` and not `L->getExitBlock() == L->getLoopLatch()`
since AFAICT the latter is trivially false -- the exit block is always
outside the loop while the latch is always inside).

Actually, pretending that a phi consuming poison is unreachable is
also risky since I'd expect loop-rerolling to break that assumption.
While this means there is no good answer at this time, pretending that
a phi consuming poison is UB has the advantage that scev already
assumes this; and we won't be introducing more tricky corner cases.

-- Sanjoy



More information about the llvm-commits mailing list