[PATCH] D79783: [LoopVectorize] Fallback to a scalar epilogue when TP fails
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 19 03:45:12 PDT 2020
Ayal added a comment.
In D79783#2043479 <https://reviews.llvm.org/D79783#2043479>, @Pierre-vh wrote:
> - Changing implementation of the patch following discussion
> - Removed the `ReportFailure` argument of `prepareToFoldTailByMasking`. I don't think it's useful anymore, but feedback is welcome. (The only thing that annoys me is that we now print "loop not vectorized" even when we'll fallback to a scalar epilogue)
Right, as commented earlier, an "analysis" message should be reported when fold-tail doesn't work, instead of a "failure" one.
> - Added a test that makes use of the attribute that enables tail-folding
> - Simplified tests
Would be good to also preserve the existing "force-vector-tail-folding" behavior of "fold or do not vectorize", in addition to introducing the new "prefer-vector-tail-folding" behavior.
================
Comment at: llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h:379
+ SmallPtrSetImpl<const Instruction *> &MaskedOp,
+ SmallPtrSetImpl<Instruction *> &ConditionalAssumes,
+ bool PreserveGuards = false) const;
----------------
(Would have been good to set defaults, but using nullptr's seems cumbersome.)
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp:1264
+ SmallPtrSet<const Instruction *, 8> MaskedOp;
+ SmallPtrSet<Instruction *, 8> ConditionalAssumes;
+
----------------
Rename e.g. with Tmp or FoldTail prefix to distinguish from the non fold-tail sets?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5027
+ "hint/switch and using a scalar epilogue instead.\n");
+ ScalarEpilogueStatus = CM_ScalarEpilogueAllowed;
+ return MaxVF;
----------------
Why change ScalarEpilogueStatus?
================
Comment at: llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll:5
+
+; This test should produce the same result when TP is forced/disabled, because it
+; can't be tail-predicated (due to an outside user of %incdec.ptr in %end),
----------------
"TP is forced" >> "tail folding is preferred"
"tail-predicated" >> "tail folding".
If "tail-predication" is preferred, it should replace "tail folding" throughout. But a tail can be predicated w/o folding it, so FoldTail or in full FoldTailByMasking seems more accurate?
================
Comment at: llvm/test/Transforms/LoopVectorize/use-scalar-epilogue-if-tp-fails.ll:10
+; The first test (@basic_loop) simply relies on the command-line switches.
+; The second test (@metadata) forces tail-folding using metadata.
+; Both tests should always generate a scalar epilogue.
----------------
"forces" >> "specifies fold-tail preference via metadata"
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79783/new/
https://reviews.llvm.org/D79783
More information about the llvm-commits
mailing list