[PATCH] D154264: [LV] Skip VFs < iterations remaining for epilogue vectorization.

Tue Jul 4 09:07:16 PDT 2023

fhahn marked 2 inline comments as done.
fhahn added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5707
       EstimatedRuntimeVF *= *VScale;
   }

----------------
Ayal wrote:
> While we're here, should `EstimatedRuntimeVF` (or rather, `EstimatedMainLoopVF`) be used for both scalable and non-scalable `MainLoopVF` below?
> Are all 4 combinations of main loop and epilog loop VF's being scalable or non-scalable, possible?
> Can the filtering of VF's larger than `EstimatedMainLoopVF` be simplified into a single `ElementCount::isKnownGE(NextVF.Width, EstimatedMainLoopVF)` check for all combinations?
> Is the SCEV `RemainingIterations` check relevant only for the {non-scalable & non-scalable} combination, or worth leaving a TODO?
> While we're here, should EstimatedRuntimeVF (or rather, EstimatedMainLoopVF) be used for both scalable and non-scalable MainLoopVF below?
> Are all 4 combinations of main loop and epilog loop VF's being scalable or non-scalable, possible?
> Can the filtering of VF's larger than EstimatedMainLoopVF be simplified into a single ElementCount::isKnownGE(NextVF.Width, EstimatedMainLoopVF) check for all combinations?

In theory I think yes, all 4 combinations are possible

I tried just checking  `EstimatedRuntimeVF` and not `MainVF` but there are some SVE tests that are failing.

> Is the SCEV RemainingIterations check relevant only for the {non-scalable & non-scalable} combination, or worth leaving a TODO?

I added a TODO. I think for scalable vectors we would also need to estimate the runtime VFs and check those.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5724

+    // If NextVF is less than the number of remaining iterations, the epilogue
+    // loop would be dead. Skip such factors.
----------------
Ayal wrote:
> nit: reverse the condition to use ICMP_UGE as done above?
updated, thanks!

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5679
       return {ForcedEC, 0, 0};
     else {
       LLVM_DEBUG(dbgs() << "LEV: Epilogue vectorization forced factor is not "
----------------
Ayal wrote:
> nit (unrelated): avoid else after return.
Will adjust separately.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5709

-  for (auto &NextVF : ProfitableVFs)
+  ScalarEvolution &SE = *PSE.getSE();
+  const SCEV *TC =
----------------
Ayal wrote:
> nit: can set `Type *TCType = Legal->getWidestInductionType();` and use it below.
Done, thanks!

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5722
+      continue;
+
     if (((!NextVF.Width.isScalable() && MainLoopVF.isScalable() &&
----------------
Ayal wrote:
> Worth early-continuing first if (a) !hasPlanWithVF, then if (b) NextVF >= EstimatedRuntimeVF, and last if (c) NextVF >= RemainingIterations?
> 
> Note that for IC==1 and non scalable VF's, check (c) subsumes check (b).
> 
> Can the checks below be simplified, given that EstimatedRuntimeVF = MainLoopVF if !MainLoopVF.isScalable()?
> Instead of checking if Result.Width.isScalar() better check if Result is still uninitialized, i.e., == VectorizationFactor::Disabled()? Or teach isMoreProfitable() to prefer any computed cost over an uncomputed one.
Applied the first refactoring in b4efc0f070ba, update this patch, thanks! Will do the refactorings mentioned in the last paragraph separately.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154264/new/

https://reviews.llvm.org/D154264