[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 7 09:27:10 PST 2017
mkuper added inline comments.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1929
+ auto UniformsPerVF = Uniforms.find(VF);
+ return UniformsPerVF->second.count(I);
+ }
----------------
delena wrote:
> mkuper wrote:
> > Why not just "return Uniforms[VF]->second.count(I);"? I don't think the verbosity helps here, and we don't actually care about the difference between find() and operator[] due to the assert.
> > Or, to save a lookup in an asserts build, you could find() and then assert on the result of the find().
> The "const" qualifier does not allow the Uniforms[VF] form.
Ohh, right, missed it's const, sorry.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:2120
+ if (VF == 1 || Uniforms.count(VF))
+ return;
+ setCostBasedWideningDecision(VF);
----------------
delena wrote:
> mkuper wrote:
> > I'm still not sure I understand why this gets called twice for a user-provided VF. Could you explain again?
> We call this function from multiple places. From calculateRegisterUsage(), selectVectorizationFactor() - user-defined-VF and from expectedCost().
> I just prevent recalculation.
I understand, I'm just wondering whether we really need to do that.
Anyway, it's not a new problem, we don't have to solve it here.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5512
auto *Ptr = getPointerOperand(I);
if (LI && isUniform(Ptr))
+ return false;
----------------
mkuper wrote:
> Do we still want this check here?
> I mean:
> 1) An instruction with a uniform pointer *can* be widended.
> 2) That ties in with the way this is used - we check for the uniform case before the widening case, as we should. So, IIUC, we should never actually hit this.
I think you may have missed this comment.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7036
+ if (InterleaveCost < NumAccesses * 2) {
+ // The interleaving cost is good enough.
+ setWideningDecision(Group, VF, CM_DECISION_INTERLEAVE,
----------------
delena wrote:
> mkuper wrote:
> > Why the NumAccesses * 2 cut-off?
> I consider a cost per instruction. In this case InterleaveCost / NumAccesses == 1. (Matthew asked to avoid divisions).
> About 1 inst per access is good enough. I added more comments.
Well, this isn't really "about 1", this is "below 2".
I'd be more conservative here (InerelaveCost <= NumAccesses ? Can this even happen? Or are you trying to catch the cases where the ratio is ~1.1-1.2?). Or maybe remove this altogether. Is getGatherScatterCost() expensive in terms of compile time?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7069
+ Cost = InterleaveCost;
+ } else if (GatherScatterCost < ScalarizationCost) {
+ Decision = CM_GatherScatter;
----------------
So, in case the costs are equal, you prefer scalarization to interleaving, and interleaving to scatter/gather?
(In theory it shouldn't matter what happens when the costs are equal, just making sure this I understand.)
Repository:
rL LLVM
https://reviews.llvm.org/D27919
More information about the llvm-commits
mailing list