[PATCH] D60935: [IndVarSimplify] Fixup nowrap flags during LFTR when moving to post-inc (PR31181)

Nikita Popov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 29 00:34:27 PDT 2019

nikic marked an inline comment as done.
nikic added a comment.

In D60935#1520163 <https://reviews.llvm.org/D60935#1520163>, @reames wrote:

> Ok, after reading through the code and trying out a couple of ideas, I'm fairly sure I understand what's going on here.  There's actually (at least) three distinct bugs in the same logic.  The core thing to keep in mind is that we truncate the IV to the backedge taken type, then perform the math on that type.  The trivial alternative would be to extend both to the same type, but that results in markedly worse codegen.  (Still exploring whether that's fundamental.)
> The first bug is that if we have a backedge taken count of uint_max<T>, then adding 1 results in 0.  This results in the truncated limit instruction simply being the trunc(IV.Start) which while correct for the exit value, is *also* the starting value.  I haven't been able to write a test case to demonstrate this yet.
> The second bug is that once we (correctly) compute the exit value, we rewrite the exit test as IV + 1 Cmp Limit.  The problem is that poison is defined such that *branching on* a poison result is what triggers UB.  As such, we could have had an nsw IV whose post-increment was poison w/o being UB (since we tested against the pre-increment which wasn't).  As such, unless we can prove that (IV+1) isn't poison, we have three choices:
> 1. Don't convert the post increment, but do the pre-increment version only.
> 2. Strip nsw/nuw from the increment.  (This is what the current patch does.)
> 3. Bail out of the transform in this case entirely.
>   The third is the dead IV case mentioned in patch comments.  Good catch, I'd missed that one until rereading the patch again.  Nikita, can you list a specific test case which exercises this?  I'd have thought that hasConcreteDef case would have caught this?

These are the switch_to_different_iv_pos_inc and switch_to_different_iv_pre_inc test cases. I'm assuming you mean AlmostDeadIV here rather than hasConcreteDef. AlmostDeadIV doesn't trigger here because the IV is only dynamically dead, it has static uses.

hasConcreteDef ostensibly is supposed to prevent the problem that this patch deals with in the first place, which is the IV becoming undef. However it does this in a rather naive manner and in particular does not consider poison flags. (And just checking for poison flags would pessimize this a lot -- just because they are present does not mean that it will actually become poison.)

> At the moment, I believe the patch posted is correct, but is unnecessarily conservative.  Instead of only stripping nsw/nuw where necessary, it *always* strips.  A simple, but noteworthy improvement would be to use SCEV's getSignedRange/getUnsignedRange to ensure the increment doesn't overflow and then only stripping on cases where it does.

No, this patch only strips where necessary. It uses the nowrap flag computed by SCEV on the post-inc addrec, which are determined through a number of ways, including both range analysis, as well as nowrap flags on the original IR in conjunction with poison based reasoning. From my experiments this analysis seems to be pretty good, which is also why there is no fallout on existing tests.

See test_no_drop_nuw and test_no_drop_nsw for cases where flags are not dropped based on range analysis in particular, despite a change to post-inc IV.

> I'd also prefer to see us simply refuse to switch to post-increment rather than stripping the nsw/nuw.  That seems like a generally better tradeoff for a middle-end transform.  LSR (or some other late transform) could do the stripping form if needed.  (I doubt it would be.)

Sounds reasonable, I'll give this a try later.

Comment at: llvm/lib/Transforms/Scalar/IndVarSimplify.cpp:2388
+  Value *IncVar = IndVar->getIncomingValueForBlock(L->getLoopLatch());
+  if (auto *BO = dyn_cast<BinaryOperator>(IncVar)) {
+    const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(IncVar));
reames wrote:
> This test can never fail given the definition of LoopCounter used.
This can also be a GEP.



More information about the llvm-commits mailing list