[PATCH] D112170: [indvars] Visit all IV users in simplifyAndExtend

Wed Oct 20 12:32:52 PDT 2021

reames created this revision.
reames added reviewers: nikic, lifted, mkazantsev, efriedma.
Herald added subscribers: javed.absar, zzheng, bollu, hiraditya, mcrosier.
reames requested review of this revision.
Herald added a project: LLVM.

The basic idea of this patch is simple - we remove the restriction on only visiting users of addrecs from simplifyAndExtend, and instead allow users of any scevable instruction derived from an IV.  Doing this exposes a couple of subtleties (explained below).
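To make the change in traversal concrete, here is a toy model of the worklist walk.  It is a sketch under stated assumptions, not the code in SimplifyIndVar.cpp: ToyInst, visitIVUsers and the IsAddRec/IsSCEVable flags are hypothetical stand-ins.  The model assumes each visited instruction has its own users pushed only if it passes a filter, which is "is an addrec" before the patch and "is scevable" after.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an instruction; NOT an LLVM class.
struct ToyInst {
  std::string Name;
  bool IsAddRec;          // assumed: SCEV of this value is an add recurrence
  bool IsSCEVable;        // assumed: SCEV can analyze this value at all
  std::vector<int> Users; // indices of the instructions that use this one
};

// Walk the use graph starting at the IV phi. Every instruction popped from
// the worklist counts as visited, but its users are pushed only if the
// instruction passes the filter:
//   OnlyAddRecs == true  -> old restriction (follow addrecs only)
//   OnlyAddRecs == false -> new behavior (follow any scevable instruction)
std::set<std::string> visitIVUsers(const std::vector<ToyInst> &Insts,
                                   int IVPhi, bool OnlyAddRecs) {
  std::set<std::string> Visited;
  std::set<int> Seen{IVPhi};
  std::vector<int> Worklist{IVPhi};
  while (!Worklist.empty()) {
    int Cur = Worklist.back();
    Worklist.pop_back();
    const ToyInst &I = Insts[Cur];
    Visited.insert(I.Name);
    bool Follow = OnlyAddRecs ? I.IsAddRec : I.IsSCEVable;
    if (!Follow)
      continue; // do not look through this instruction to its users
    for (int U : I.Users)
      if (Seen.insert(U).second)
        Worklist.push_back(U);
  }
  return Visited;
}

// Hypothetical loop body: iv -> masked -> cmp, where "masked" is scevable
// but not an addrec, so the old walk stops there and never reaches "cmp".
std::vector<ToyInst> makeExample() {
  return {
      {"iv", true, true, {1}},
      {"masked", false, true, {2}},
      {"cmp", false, true, {}},
  };
}
```

In this model, visitIVUsers(makeExample(), 0, true) reaches "iv" and "masked" only, while passing false also reaches "cmp", which is the kind of user the patch newly exposes to simplification.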

As seen in the test diffs, this does help eliminate some code which wasn't previously handled, but the primary value is that it makes some of the code which runs in Scalar/IndVarSimplify.cpp after simplifyAndExtend effectively dead.  (To be removed in separate patches with their own review.)

The first subtlety here is that additional simplification can actually lead to inferior results, due to missing invalidation.  The case that happens is that we fold an icmp in simplifyAndExtend, but leave the cached SCEV trip count for that loop as could-not-compute.  The following code, specifically predicateLoopExits, then can't run.  Without this change, we'd instead hit optimizeLoopExits, which would invalidate, and thus loop predication would run.

To address this, we must invalidate when we simplify a loop exit condition.  We could just always invalidate, but that could require O(LoopSize) rebuilds of SCEV for the loop.  From prior work on strengthenNoWrapFlags, we know that is sometimes prohibitive.  This patch only invalidates on loop exit condition changes, so the worst case should be O(LoopExits) rebuilds.  In theory, you could have an adversarial loop where the number of exits ~= loop size, but in practice, the number of loop exits should be much smaller.
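The cost argument can be illustrated with a toy accounting model.  This is only a sketch: ToySimplification, countRebuilds and makeLoop are hypothetical names, and the model assumes one SCEV rebuild per invalidation, which is a simplification of what actually happens.

```cpp
#include <cstddef>
#include <vector>

// One simplification performed while visiting IV users (toy model).
struct ToySimplification {
  bool ChangesExitCondition; // true if a loop exit condition was folded
};

// Count how many times SCEV info for the loop would be rebuilt, assuming
// one rebuild per invalidation.
//   InvalidateAlways == true  -> invalidate after every simplification
//   InvalidateAlways == false -> invalidate only on exit condition changes
std::size_t countRebuilds(const std::vector<ToySimplification> &Simplifications,
                          bool InvalidateAlways) {
  std::size_t Rebuilds = 0;
  for (const ToySimplification &S : Simplifications)
    if (InvalidateAlways || S.ChangesExitCondition)
      ++Rebuilds;
  return Rebuilds;
}

// Build a hypothetical loop with LoopSize simplifications, the first
// NumExits of which touch an exit condition.
std::vector<ToySimplification> makeLoop(std::size_t LoopSize,
                                        std::size_t NumExits) {
  std::vector<ToySimplification> Result(LoopSize, ToySimplification{false});
  for (std::size_t I = 0; I < NumExits && I < LoopSize; ++I)
    Result[I].ChangesExitCondition = true;
  return Result;
}
```

For makeLoop(1000, 2), the always-invalidate policy counts 1000 rebuilds and the exit-only policy counts 2.  The two only coincide in the adversarial case where nearly every instruction feeds an exit.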

@nikic Could you confirm this doesn't have an excessive compile time?

The second subtlety shows up in the loop-unroll test change.  In this case, unrolling produces *worse* codegen with an up-to-date SCEV trip count.  We have a dead exit, which was proven untaken and folded.  With stale info, unroll recognizes that the cached trip count (10) is dead, and folds the branch.  With up-to-date info (could-not-compute, i.e. never taken), unroll does not look at the actual CFG and leaves the branch unfolded.  This is in my view an uninteresting quirk of the unroll implementation, and not something worth fixing.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D112170

Files:
  llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
  llvm/test/Transforms/IndVarSimplify/X86/pr35406.ll
  llvm/test/Transforms/IndVarSimplify/eliminate-exit-no-dl.ll
  llvm/test/Transforms/IndVarSimplify/infer-poison-flags.ll
  llvm/test/Transforms/IndVarSimplify/strengthen-overflow.ll
  llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
  llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
  llvm/test/Transforms/LoopUnroll/scevunroll.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D112170.381053.patch
Type: text/x-patch
Size: 15988 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211020/9ffedc0e/attachment.bin>