[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.
Matthew Simpson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 2 14:00:06 PST 2017
mssimpso added a comment.
Hi Elena,
I like the direction this patch is going. Thanks for all your work. Here are some more comments inline.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7053
+
+ if (InterleaveGrpCost / Group->getNumMembers() <= 1) {
+ // The interleaving cost is good enough.
----------------
Can we avoid the divisions here and below and use multiplication instead? We won't have any round-off issues that way. What about something like:
```
unsigned Accesses = 1;
if (Legal->isAccessInterleaved(&I)) {
Accesses = Group->getNumMembers();
...
}
...
ScalarizationCost *= Accesses;
...
```
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7072-7080
+ InstWidening Decision;
+ if (InterleaveInstCost <= GatherScatterCost &&
+ InterleaveInstCost <= ScalarizationCost)
+ Decision = CM_DECISION_INTERLEAVE;
+ else if (GatherScatterCost < InterleaveInstCost &&
+ GatherScatterCost < ScalarizationCost)
+ Decision = CM_DECISION_GATHER_SCATTER;
----------------
I think we're fairly consistent in other places where we compare costs, that we prefer the scalar version if there's no benefit for vectorization. So if the scalarization cost is <= the other costs, I think we should go with that.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7255-7284
auto Group = Legal->getInterleavedAccessGroup(I);
assert(Group && "Fail to get an interleaved access group.");
// Only calculate the cost once at the insert position.
if (Group->getInsertPos() != I)
return 0;
----------------
Can't this all be replaced by a call to getInterleaveGroupCost()?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7291-7314
unsigned Cost = 0;
Type *PtrTy = ToVectorTy(Ptr->getType(), VF);
// Figure out whether the access is strided and get the stride value
// if it's known in compile time
const SCEV *PtrSCEV = getAddressAccessSCEV(Ptr, Legal, SE, TheLoop);
----------------
Can't this all be replaced by a call to getMemInstScalarizationCost()?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7323-7327
assert(ConsecutiveStride == 0 &&
"Gather/Scatter are not used for consecutive stride");
return Cost +
TTI.getGatherScatterOpCost(I->getOpcode(), VectorTy, Ptr,
Legal->isMaskRequired(I), Alignment);
----------------
Similar comment to the above. I don't think you have a helper for computing gather/scatter cost, but I think it would be nice. It would be easier to keep getInstructionCost in sync with setCostBasedWideningDecision.
Repository:
rL LLVM
https://reviews.llvm.org/D27919
More information about the llvm-commits
mailing list