[PATCH] D88819: [LV] Support for Remainder loop vectorization

Tue Oct 6 08:07:29 PDT 2020

bmahjour requested changes to this revision.
bmahjour added inline comments.
This revision now requires changes to proceed.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3383
+
+         [ ] <-- original loop iteratiib check.
+      /   |
----------------
It's extremely hard to "draw" this diagram in text. It's even harder to read it. I think we should create a documentation section under https://llvm.org/docs/Vectorizers.html#loop-vectorizer and upload an image. The link can then be put into the comment for people to view and understand what is being generated.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5636
+LoopVectorizationCostModel::selectEpilogVectorizationFactor(unsigned MaxVF) {
+  // Find next VF.
+  unsigned VF = 1;
----------------
The vector epilogue loop's VF need not be smaller than the VF of the original loop for it to be profitable. For example with large interleave counts there may still be significant number of iterations to be executed and the throughput would be affected if a VF is chosen that is smaller than the widest profitable VF.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5639
+  unsigned int Cost = 0;
+  bool HasMultipleIndvars = (Legal->getInductionVars().size() != 1) ||
+                            (Legal->getReductionVars().size() > 0);
----------------
why this limitation?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8469
+  // Yet to handle loops with outgoing values.
+  if (!LoopExitBlock->phis().empty())
+    return false;
----------------
Can your code handle first-order or reduction recurrences? Please see `InnerLoopVectorizer::fixCrossIterationPHIs()` and provide a test if they are supported. Otherwise I'm not sure this check is sufficient to catch those cases, specially given that the code guarded by `LB.canCreateVectorEpilog()` does not preserve LCSSA. 

================
Comment at: test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll:238
 ; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[B1:%.*]] = bitcast i32* [[B:%.*]] to i8*
+; CHECK-NEXT:    [[A4:%.*]] = bitcast i32* [[A:%.*]] to i8*
----------------
why these casts are hoisted?

================
Comment at: test/Transforms/LoopVectorize/epilog-loop-vectorize.ll:99
+; CHECK-NEXT:    br i1 [[MEMCHECK_CONFLICT27]], label [[SCALAR_PH10]], label [[VECTOR_PH16]]
+; CHECK:       vector.ph16:
+; CHECK-NEXT:    [[N_MOD_VF28:%.*]] = urem i64 [[WIDE_TRIP_COUNT]], 8
----------------
It would be good to generate more meaningful names for the labels forming the skeleton of the vector epilogue loop. For example `vector.ph` vs `vector.epilogue.ph`, `vector.body` vs `vec.epilogue.body`, etc.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88819/new/

https://reviews.llvm.org/D88819