[PATCH] D60935: [IndVarSimplify] Fixup nowrap flags during LFTR when moving to post-inc (PR31181)

Mon May 6 08:38:47 PDT 2019

nikic marked an inline comment as done.
nikic added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/IndVarSimplify.cpp:2382
+    // whatever it computed here.
+    if (auto *BO = dyn_cast<BinaryOperator>(CmpIndVar)) {
+      const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(CmpIndVar));
----------------
sanjoy wrote:
> I don't think the no-wrap flags on the SCEV expression can be propagated to the increment operation like this. I wrote up some background here: https://www.playingwithpointers.com/blog/scev-integer-overflow.html but in short, say your pre-increment SCEV expression is `{S,+,X}` then the post-inc SCEV expression, `SE->getSCEV(CmpIndVar)` down below, is `{S+X,+,X}`.  Whether this is nsw/nuw has nothing to do with whether `S+X` can overflow -- all it says is that on all but the last iteration of the loop `Add(CmpIndVar,X)` will not overflow where `CmpIndVar` starts from `S+X` and is incremented by `X` on every iteration.
> 
> As a concrete example, say the pre-increment expression is `{-1,+,1}` and let's say the loop body executes 10 times.  Then the post-inc IV is `{0,+,1}`, which is both nuw and nsw (the value of the IV is < 10 and adding 1 to it does not overflow).  But you can't mark the increment operation as nuw because on the very first iteration it computes `Add(-1,1)` which unsigned overflows.
Thanks, your blog post was very useful! You were also right that a nuw flag would be incorrectly added for your `{-1,+,1}` example.

Before moving on to more drastic measures, I'd like to suggest the approach in the updated patch. If I understood correctly, the problem is that the pre-increment addrec gives no no-overflow guarantee for the increment on the last iteration (by the semantics of the flags), while the post-increment addrec gives none for the increment on the first iteration (because that add is folded into the start value). However, if we combine the overflow flags from both, we cover the full range, including the first and last iterations. Is that correct?
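
To sanity-check that reasoning on the `{-1,+,1}` example, here is a small brute-force model in plain C++. It is only a sketch, not the patch itself: `Flags`, `recurrenceFlags` and `combine` are hypothetical names, the IV is assumed to be 32 bits wide, and "combine" is assumed to mean keeping a flag only if both addrecs carry it. An addrec whose loop body runs N times performs N-1 adds, while the increment instruction executes N times.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the nuw/nsw flag pair; not an LLVM type.
struct Flags {
  bool nuw;
  bool nsw;
};

// True if the 32-bit add a + b wraps when both operands are read as unsigned.
bool addWrapsUnsigned(uint32_t a, uint32_t b) {
  uint32_t r = a + b; // unsigned arithmetic wraps modulo 2^32
  return r < a;
}

// True if the 32-bit add a + b wraps when both operands are read as signed:
// the operands share a sign bit and the result has the opposite one.
bool addWrapsSigned(uint32_t a, uint32_t b) {
  uint32_t r = a + b;
  return (((a ^ r) & (b ^ r)) & 0x80000000u) != 0;
}

// Brute-force flags for `adds` consecutive adds of `step`, starting at `start`.
Flags recurrenceFlags(uint32_t start, uint32_t step, unsigned adds) {
  Flags f{true, true};
  uint32_t v = start;
  for (unsigned i = 0; i < adds; ++i) {
    if (addWrapsUnsigned(v, step)) f.nuw = false;
    if (addWrapsSigned(v, step)) f.nsw = false;
    v += step;
  }
  return f;
}

// Assumption: "combining" means a flag survives only if both addrecs have it.
Flags combine(Flags a, Flags b) {
  return Flags{a.nuw && b.nuw, a.nsw && b.nsw};
}
```

With S = -1, X = 1 and a body that runs 10 times, `recurrenceFlags(0xFFFFFFFFu, 1, 9)` models the pre-increment addrec, `recurrenceFlags(0u, 1, 9)` the post-increment addrec, and `recurrenceFlags(0xFFFFFFFFu, 1, 10)` the adds the increment instruction actually performs. In this model the combined flags match the instruction's: nsw is kept and nuw is dropped, because `Add(-1,1)` wraps unsigned on the first iteration.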

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60935/new/

https://reviews.llvm.org/D60935