[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.

Matthew Simpson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 2 14:00:06 PST 2017


mssimpso added a comment.

Hi Elena,

I like the direction this patch is going. Thanks for all your work. Here are some more comments inline.



================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7053
+
+        if (InterleaveGrpCost / Group->getNumMembers() <= 1) {
+          // The interleaving cost is good enough.
----------------
Can we avoid the divisions here and below and use multiplication instead? We won't have any round-off issues that way. What about something like:

```
unsigned Accesses = 1;
if (Legal->isAccessInterleaved(&I)) {
  Accesses = Group->getNumMembers();
  ...
}
...
ScalarizationCost *= Accesses;
...
```


================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7072-7080
+      InstWidening Decision;
+      if (InterleaveInstCost <= GatherScatterCost &&
+          InterleaveInstCost <= ScalarizationCost)
+        Decision = CM_DECISION_INTERLEAVE;
+      else if (GatherScatterCost < InterleaveInstCost &&
+               GatherScatterCost < ScalarizationCost)
+        Decision = CM_DECISION_GATHER_SCATTER;
----------------
I think we're fairly consistent in other places where we compare costs, that we prefer the scalar version if there's no benefit for vectorization. So if the scalarization cost is <= the other costs, I think we should go with that.


================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7255-7284
       auto Group = Legal->getInterleavedAccessGroup(I);
       assert(Group && "Fail to get an interleaved access group.");
 
       // Only calculate the cost once at the insert position.
       if (Group->getInsertPos() != I)
         return 0;
 
----------------
Can't this all be replaced by a call to getInterleaveGroupCost()?


================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7291-7314
       unsigned Cost = 0;
       Type *PtrTy = ToVectorTy(Ptr->getType(), VF);
 
       // Figure out whether the access is strided and get the stride value
       // if it's known in compile time
       const SCEV *PtrSCEV = getAddressAccessSCEV(Ptr, Legal, SE, TheLoop); 
 
----------------
Can't this all be replaced by a call to getMemInstScalarizationCost()?


================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7323-7327
       assert(ConsecutiveStride == 0 &&
              "Gather/Scatter are not used for consecutive stride");
       return Cost +
              TTI.getGatherScatterOpCost(I->getOpcode(), VectorTy, Ptr,
                                         Legal->isMaskRequired(I), Alignment);
----------------
Similar comment to the above. I don't think you have a helper for computing gather/scatter cost, but I think it would be nice. It would be easier to keep getInstructionCost in sync with setCostBasedWideningDecision.


Repository:
  rL LLVM

https://reviews.llvm.org/D27919





More information about the llvm-commits mailing list