[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.

Mon Jan 30 09:09:26 PST 2017

mssimpso added a comment.

Hi Elena,

Thanks for your patience. I haven't yet looked in detail at the widening decision selection, but here are some comments around the uniforms/scalars.

Matt.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5616-5619
+  // If the memory instruction is in an interleaved group, it will be
+  // vectorized and its pointer will remain uniform.
+  if (isAccessInterleaved(I) || isLegalGatherOrScatter(I))
+    return false;
----------------
This is not true now, right? An interleaved access may be scalarized based on the cost model.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5625-5626
+void LoopVectorizationCostModel::collectLoopUniforms(unsigned VF) {
+  if (!Uniforms[VF].empty())
+    return;
   // We now know that the loop is vectorizable!
----------------
Can we change this to something like:

```
if (Uniforms.count(VF))
  return;

auto &UniformsVF = Uniforms[VF];

```
We want to distinguish the case that (1) Uniforms have not been computed for VF from (2) Uniforms have been computed for VF but there aren't any, so we don't need to compute them again. We can end up calling this twice for the same VF if we have a user-selected VF and then compute the expected cost for interleaving. This is similar to the way we do this check in collectInstsToScalarize.

This will also apply to collectLoopScalars.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5687
+      if (!UsersAreMemAccesses || Legal->memoryInstructionMustBeScalarized(&I) ||
+          (VF >= 2 && getWideningDecision(&I, VF) == CM_DECISION_SCALARIZE))
         PossibleNonUniformPtrs.insert(Ptr);
----------------
Why not roll this check into memoryInstructionMustBeScalarized? Either way, I think this check and the check below in isVectorizedMemAccessUse that calls memoryInstructionMustBeScalarized, should be the same.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7007
   // the scalar version.
-  if (Legal->isUniformAfterVectorization(I))
+  if (VF > 1 && isUniformAfterVectorization(I, VF))
     VF = 1;
----------------
The VF > 1 check is not needed because you check that condition in isUniformAfterVectorization.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7022
+void LoopVectorizationCostModel::setCostBasedWideningDecision(unsigned VF) {
+  if (VF <= 1)
+    return;
----------------
This should be VF == 1, since we can't have a zero VF.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7466-7479
   // Insert values known to be scalar into VecValuesToIgnore. This is a
   // conservative estimation of the values that will later be scalarized.
   //
   // FIXME: Even though an instruction is not scalar-after-vectoriztion, it may
   //        still be scalarized. For example, we may find an instruction to be
   //        more profitable for a given vectorization factor if it were to be
   //        scalarized. But at this point, we haven't yet computed the
----------------
Can we just delete this in favor of a helper function that checks VecValuesToIgnore and IsScalarAfterVectorization for a given VF? Something like:

```
bool LoopVectorizationCostModel::shouldIgnoreVecValueInCostModel(Instruction *I, unsigned VF) {
  return VecValuesToIgnore.count(I) || isScalarAfterVectorization(I, VF);
}
```

This way we won't have to be imprecise.

Repository:
  rL LLVM

https://reviews.llvm.org/D27919