[PATCH] D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions

Mon Jan 25 02:02:51 PST 2021

fhahn added inline comments.

================
Comment at: llvm/include/llvm/Analysis/TargetTransformInfo.h:1309

+  bool isLegalToVectorizeReduction(RecurKind RecKind, bool Scalable) const;
+
----------------
This should probably have a comment,.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:3924
     EVT VecVT = InVec.getValueType();
+    if (VecVT.isScalableVector())
+      break;
----------------
Can you add a test for this? Also, this seems completely unrelated, can you split it off?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1277
   /// are the selected vectorization factor and the cost of the selected VF.
-  unsigned selectInterleaveCount(ElementCount VF, unsigned LoopCost);
+  unsigned selectInterleaveCount(ElementCount VF, InstructionCost LoopCost);

----------------
those changes could also be submitted separately?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1519

+  bool isLegalWideningOperation(ElementCount VF) {
+    for (auto &Reduction : Legal->getReductionVars()) {
----------------
This also needs a comment. And the name could probably be improved. Maybe `canVectorizeReductions`?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7665

+  if (!CM.isLegalWideningOperation(UserVF))
+    return {{UserVF, InstructionCost::getInvalid()}};
----------------
This should only be checked  in the code handling `UserVF` below? Also, This seems like a property that generally limits to vectorization factor to fixed-width vectorization factors and would be good to check beforehand.

 Would it be possible to just limit vectorization factors to fixed width factors in `computeFeasibleMaxVF`? This way, we won't need extra checks once automatically picked VFs are supported. You'd also won't need any extra code in the caller of `::plan`.

This is similar to how we deal with other 'legality' properties that depend on the vectorization factor, like dependencies that may limit the vectorization factor.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9455
+  if (!VF.Cost.isValid()) {
+    LLVM_DEBUG(dbgs() << "LV: Not vectorizing: The cost-model indicates that "
+                         "vectorization is not possible.\n");
----------------
This message seems a bit odd. I think the cost model should just be responsible for assigning a cost, not deciding whether it is possible to vectorize or not; that's the job of the legality checks. 

Please see my comment above, the could probably done in `computeFeasibleMaxVF`, which technically is part of the cost model, but is the first step and applies other legality constraints as well which limit the vectorization factor.

================
Comment at: llvm/test/Transforms/LoopVectorize/scalable_reductions.ll:9
+; int sum = 0;
+; #pragma clang loop vectorize_width(8, scalable) interleave_count(2)
+;  for (int i = 0; i < n; ++i)
----------------
Personally I don't think the C source code adds much value. The IR is very compact and it should be obvious from the IR & test name what is going on. Also, the IR that clang generates can change, clang options may change, pragmas may change and so on.

================
Comment at: llvm/test/Transforms/LoopVectorize/scalable_reductions.ll:20
+  %cmp6 = icmp sgt i32 %n, 0
+  br i1 %cmp6, label %for.body.preheader, label %for.end
+
----------------
this should not be needed for the test.

================
Comment at: llvm/test/Transforms/LoopVectorize/scalable_reductions.ll:23
+for.body.preheader:                               ; preds = %entry
+  %wide.trip.count = zext i32 %n to i64
+  br label %for.body
----------------
this should not be needed for the test, you can just pass `%n` as `i64`.

================
Comment at: llvm/test/Transforms/LoopVectorize/scalable_reductions.ll:27
+for.body:                                         ; preds = %for.body.preheader, %for.body
+  %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
+  %sum.07 = phi i32 [ 2, %for.body.preheader ], [ %add, %for.body ]
----------------
nit: can strip `indvars` from the name to mark things more compact.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95245/new/

https://reviews.llvm.org/D95245