[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.

Tue Jan 31 15:07:29 PST 2017

mssimpso added a comment.

Hi Elena,

Here are some more inline comments. Also, please clang-format the patch if you haven't already done so to make the review easier. Thanks!

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5616-5619
+  // If the memory instruction is in an interleaved group, it will be
+  // vectorized and its pointer will remain uniform.
+  if (isAccessInterleaved(I) || isLegalGatherOrScatter(I))
+    return false;
----------------
delena wrote:
> mssimpso wrote:
> > This is not true now, right? An interleaved access may be scalarized based on the cost model.
> It is still true. Instruction **must ** be scalarized if there is no any other option. This is "legality" check, cost model comes later.
I think the cost vs legality distinction is not important here. My original intent with this function was just to consolidate all the scalarization conditions for a given access. That way it could be called when collecting uniforms, computing costs, and vectorizing and they all would agree on what would happen. Perhaps the choice of name was misleading?

In any case, unless I'm missing something, I don't think you actually use this function anymore. You've replaced all uses with 

```
getWideningDecision(I, VF) == CM_DECISION_SCALARIZE 
```

This makes sense because I think you've moved all the non-cost related scalarization decisions into the inverse: memoryInstructionCanBeWidened. Can this function be deleted now?

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1930
+  enum InstWidening {
+    CM_DECISION_NONE,
+    CM_DECISION_WIDEN,
----------------
How are we using CM_DECISION_NONE? Aren't we forced to make some sort of decision? I would think the default would be CM_DECISION_WIDEN unless the cost model points to one of the others.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:2054-2055
+  /// vectorized loop, each corresponding to an iteration of the original
+  /// scalar loop. When the \p VF is > 2 we take cost model decisions into
+  /// consideration. 
+  void collectLoopUniforms(unsigned VF);
----------------
The VF > 2 comment can be removed now.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:2062-2063
+  /// represented by VF values in the vectorized loop, each corresponding to an
+  /// iteration of the original scalar loop. When the \p VF is > 2 
+  /// we take cost model decisions into consideration.
+  void collectLoopScalars(unsigned VF);
----------------
The VF > 2 comment can be removed now.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5503-5504

+  if (VF < 2 || Scalars.count(VF))
+    return;
   // If an instruction is uniform after vectorization, it will remain scalar.
----------------
Since you've added a new check in collectUnifomsAndScalars to ensure the analysis is performed only once, the check here and in collectLoopUniforms is redundant. Should these be asserts now?

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5556-5563
+bool LoopVectorizationCostModel::hasConsecutiveLikePtrOperand(Instruction *I,
+                                                              unsigned VF) {
+  if (VF > 1 && getWideningDecision(I, VF) == CM_DECISION_INTERLEAVE)
     return true;
   if (auto *Ptr = getPointerOperand(I))
-    return isConsecutivePtr(Ptr);
+    return Legal->isConsecutivePtr(Ptr);
   return false;
----------------
I don't think a see a real use for this function anymore. Please see my related comment about memoryAccessMustBeScalarized. This function was only ever used in collectLoopUniforms to help determine how a memory access would be vectorized. I think you can probably greatly simplify the logic in collectLoopUniforms and remove it. Now, we know what the vectorizer will do based the the cost model decision. 

For a GEP to remain uniform, I think we just need to know that all its users are CM_DECISION_INTERLEAVE or CM_DECISION_WIDEN. Is this right? If so, please work it into the GEP part of collectLoopUniforms.

Repository:
  rL LLVM

https://reviews.llvm.org/D27919