[PATCH] D142875: [LV] Predicated epilog vectorization

Mon Mar 13 03:25:17 PDT 2023

dmgreen added a comment.

Sorry for the delay - I've had less time than I would like to get back to this. I have updated and rebased the patch. There is still one large MVE issue I need to work through, and the combo of epilog vectorization + DataAndControlFlow is currently not working correctly. I will split that off into another patch though as it is a bit of a more involved change. Plus there is another patch for letting this be controlled by the target or an option.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5653

-  if (!isEpilogueVectorizationProfitable(MainLoopVF)) {
+  if (!isScalarEpilogueAllowed()) {
+    // If we can fold the tail by masking to produce a predicated epilog,
----------------
david-arm wrote:
> Hi @dmgreen, to be honest this logic feels really counter-intuitive to me. At least on the surface, it seems to be saying if we explicitly do not want an epilogue, then ignore it and generate an epilogue. Do you know under which circumstances we would even reach this code?
Yeah because of how we got here, !isScalarEpilogueAllowed is equivalent to mayFoldTailByMasking(). I have updated the logic in this function.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5694
   for (auto &NextVF : ProfitableVFs)
-    if (((!NextVF.Scheme.Width.isScalable() && MainLoopVF.isScalable() &&
+    if (((!NextVF.Scheme.Width.isScalable() && MainLoopVF.Width.isScalable() &&
           ElementCount::isKnownLT(NextVF.Scheme.Width, EstimatedRuntimeVF)) ||
----------------
david-arm wrote:
> I assume we also reach this point with tail-folded schemes and choose the lowest cost out of all the tail-folded and normal vectorized plans?
We currently make the assumption that if we can then we either want a predicated remainder or scalar. i.e an unpredicated vector epliog + scalar will not be better. As the trip count of the vector remainder is expected to be low I think this should be a OK decision to take.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10665
       VectorizationFactor EpilogueVF =
-          CM.selectEpilogueVectorizationFactor(VF.Scheme.Width, LVP);
-      if (EpilogueVF.Scheme.Width.isVector()) {
-
+          CM.selectEpilogueVectorizationFactor(VF.Scheme, LVP);
+      if (!VF.Scheme.FoldTailByMasking && EpilogueVF.Scheme.Width.isVector()) {
----------------
david-arm wrote:
> I think it's better if we can avoid calling this if we don't need to. We may go to a lot of effort to choose a suitable epilogue VF, looking through all the plans, only to just ignore the result completely if we're already tail-folding.
Yep sounds good. This is now handled at the start of selectEpilogueVectorizationFactor.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142875/new/

https://reviews.llvm.org/D142875