[PATCH] D142015: [LV] Plan with and without FoldTailByMasking

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 4 08:47:15 PDT 2023


david-arm added a comment.

Hi @dmgreen, I'm starting to review this patch again, but it's quite large so it might take a while. Just a quick skim through so far I thought of a few ideas about some NFC patches that could be useful in their own right and might help to reduce the diff, if you agree?



================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5289
 
-std::optional<unsigned> LoopVectorizationCostModel::getVScaleForTuning() const {
+/// Convenience function that returns the value of vscale_range iff
+/// vscale_range.min == vscale_range.max or otherwise returns the value
----------------
Some changes like this don't really need to be in this patch, right?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5293
+static std::optional<unsigned>
+getVScaleForTuning(const Loop *L, const TargetTransformInfo &TTI) {
+  Function *TheFunction = L->getHeader()->getParent();
----------------
Can't we just pass in `TheFunction` instead?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5349
+  if (A.FoldTailByMasking && !B.FoldTailByMasking)
+    return (CostA * EstimatedWidthB) <= (CostB * EstimatedWidthA);
+
----------------
Again, I'm a bit concerned about the effect this will have when the active lane mask call is not costed into the loop. I'd prefer for now to be conservative and opt for the non-predicated scheme until we've accounted for the IV costs.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5749
 unsigned
-LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
-                                                  InstructionCost LoopCost) {
+LoopVectorizationCostModel::selectInterleaveCount(VectorizationFactor VF) {
   // -- The interleave heuristics --
----------------
Again, it looks like the change in prototype doesn't need to be part of this patch and might be a useful tidy-up NFC patch. 


================
Comment at: llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll:973
 ; CHECK-LABEL: LV: Checking a loop in 'merge_with_dead_gep_between_regions'
-; CHECK:      VPlan 'Initial VPlan for VF={2},UF>=1' {
+; CHECK:      VPlan 'Initial VPlan for Tail Folded VF={2},UF>=1' {
 ; CHECK-NEXT: Live-in vp<[[VEC_TC:%.+]]> = vector-trip-count
----------------
These debug output changes look useful by themselves outside of this patch - not sure if it's possible to pass in the `FoldTailByMasking` flag in a separate patch?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142015/new/

https://reviews.llvm.org/D142015



More information about the llvm-commits mailing list