[PATCH] D79783: [LoopVectorize] Fallback to a scalar epilogue when TP fails

Tue May 19 03:45:12 PDT 2020

Ayal added a comment.

In D79783#2043479 <https://reviews.llvm.org/D79783#2043479>, @Pierre-vh wrote:

> - Changing implementation of the patch following discussion
> - Removed the `ReportFailure` argument of `prepareToFoldTailByMasking`. I don't think it's useful anymore, but feedback is welcome. (The only thing that annoys me is that we now print "loop not vectorized" even when we'll fallback to a scalar epilogue)

Right, as commented earlier, an "analysis" message should be reported when fold-tail doesn't work, instead of a "failure" one.

> - Added a test that makes use of the attribute that enables tail-folding
> - Simplified tests

Would be good to also preserve the existing "force-vector-tail-folding" behavior of "fold or do not vectorize", in addition to introducing the new "prefer-vector-tail-folding" behavior.

================
Comment at: llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h:379
+                            SmallPtrSetImpl<const Instruction *> &MaskedOp,
+                            SmallPtrSetImpl<Instruction *> &ConditionalAssumes,
+                            bool PreserveGuards = false) const;
----------------
(Would have been good to set defaults, but using nullptr's seems cumbersome.)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:1264
+  SmallPtrSet<const Instruction *, 8> MaskedOp;
+  SmallPtrSet<Instruction *, 8> ConditionalAssumes;
+
----------------
Rename e.g. with Tmp or FoldTail prefix to distinguish from the non fold-tail sets?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5027
+                         "hint/switch and using a scalar epilogue instead.\n");
+    ScalarEpilogueStatus = CM_ScalarEpilogueAllowed;
+    return MaxVF;
----------------
Why change ScalarEpilogueStatus?

================
Comment at: llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll:5
+
+; This test should produce the same result when TP is forced/disabled, because it
+; can't be tail-predicated (due to an outside user of %incdec.ptr in %end),
----------------
"TP is forced" >> "tail folding is preferred"
"tail-predicated" >> "tail folding".

If "tail-predication" is preferred, it should replace "tail folding" throughout. But a tail can be predicated w/o folding it, so FoldTail or in full FoldTailByMasking seems more accurate?

================
Comment at: llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll:10
+; The first test (@basic_loop) simply relies on the command-line switches.
+; The second test (@metadata) forces tail-folding using metadata.
+; Both tests should always generate a scalar epilogue.
----------------
"forces" >> "specifies fold-tail preference via metadata"

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79783/new/

https://reviews.llvm.org/D79783