[PATCH] D152366: [LoopVectorize] Allow inner loop runtime checks to be hoisted above an outer loop

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 21 05:35:26 PDT 2023


david-arm added inline comments.


================
Comment at: llvm/lib/Transforms/Utils/LoopUtils.cpp:1669
+        OuterExitCount->getType()->isIntegerTy()) {
+      const SCEV *NewHigh = cast<SCEVAddRecExpr>(High)->evaluateAtIteration(
+          OuterExitCount, *Exp.getSE());
----------------
fhahn wrote:
> Do we need to check if the AddRecs here don't wrap? If it would wrap, could NewHigh < NewLow?
That's an interesting point!

So I've looked into this and in order to even get to this point we must have already called evaluateAtIteration once already to get the original High value. This happens in `RuntimePointerChecking::insert`, which does something similar:

    const SCEV *Ex = PSE.getBackedgeTakenCount();

    ScStart = AR->getStart();
    ScEnd = AR->evaluateAtIteration(Ex, *SE);

I can't see any existing code that worries about ScEnd being < ScStart here and all I'm doing is performing the same evaluation for the outer loop backedge count. It's quite difficult to follow the whole paper trail of complex code in LoopAccessAnalysis, but it seems that AccessAnalysis::createCheckForAccess only worries about overflow if we failed a dependency check and retry the runtime memory checks.

Since this is under a flag and not enabled by default, would you be happy for now with me adding a TODO here to investigate whether wrapping really is a problem or not?


================
Comment at: llvm/lib/Transforms/Utils/LoopUtils.cpp:1650
+      isa<SCEVAddRecExpr>(High) && isa<SCEVAddRecExpr>(Low) &&
+      cast<SCEVAddRecExpr>(Low)->getLoop() == TheLoop->getParentLoop()) {
+    const Loop *OuterLoop = TheLoop->getParentLoop();
----------------
fhahn wrote:
> paulwalker-arm wrote:
> > Given you evaluate `High` using the outer loop's exit count shouldn't you also check the following?
> > ```
> > cast<SCEVAddRecExpr>(High)->getLoop() == TheLoop->getParentLoop()
> > ```
> Could you add a test case for this case?
I can have a look, but the scenario you're thinking about would require 3 levels of nesting. I didn't originally add a test for that because I was worried about introducing so many increasingly complicated negative test cases.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152366/new/

https://reviews.llvm.org/D152366



More information about the llvm-commits mailing list