[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 6 17:18:16 PST 2017
mkuper added a comment.
The high-level structure of this looks good to me, thanks Elena!
Some minor/style comments inline.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1930
+ enum InstWidening {
+ CM_DECISION_NONE,
+ CM_DECISION_WIDEN,
----------------
delena wrote:
> mssimpso wrote:
> > How are we using CM_DECISION_NONE? Aren't we forced to make some sort of decision? I would think the default would be CM_DECISION_WIDEN unless the cost model points to one of the others.
> I used it while capturing decisions for an interleave group. At this point I can get rid of it. I return "NONE" if no decision, and it is convenient.
> InstWidening getWideningDecision(Instruction *I, unsigned VF) {
> assert(VF >= 2 && "Expected VF >=2");
> std::pair<Instruction *, unsigned> InstOnVF = std::make_pair(I, VF);
> if (!WideningDecisions.count(InstOnVF))
> return CM_DECISION_NONE;
> return WideningDecisions[InstOnVF];
> }
> Then I use it in assertion, to make sure that decision is made.
Maybe rename NONE to UNKNOWN, then? But I'm fine with None if you think that's clearer.
Also, I looked up the naming convention for enums in the coding standard, and I think it should be something like "CM_Widen, CM_Interleave", etc, not all caps.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:330
+static Type *getMemInstValueType(Value *I) {
+ assert((isa<LoadInst>(I) || isa<StoreInst>(I)) &&
----------------
We probably already have several clones of those functions around the code-base. And they are probably all slightly different.
LSR has getAccessType(), LAA has getAddressSpaceOperand(), LoadStoreVectorizer has getPointerAddressSpace(), and I'm sure there are more.
I don't want to make merging them a precondition of this patch, but can you please at least add a FIXME here?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1929
+ auto UniformsPerVF = Uniforms.find(VF);
+ return UniformsPerVF->second.count(I);
+ }
----------------
Why not just "return Uniforms[VF]->second.count(I);"? I don't think the verbosity helps here, and we don't actually care about the difference between find() and operator[] due to the assert.
Or, to save a lookup in an asserts build, you could find() and then assert on the result of the find().
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1937
+ assert(Scalars.count(VF) && "Scalar values are not calculated for VF");
+ auto ScalarsPerVF = Scalars.find(VF);
+ return ScalarsPerVF->second.count(I);
----------------
Same as above.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:1988
+ std::pair<Instruction *, unsigned> InstOnVF = std::make_pair(I, VF);
+ if (!WideningDecisions.count(InstOnVF))
+ return CM_DECISION_NONE;
----------------
Here, on the other hand, I think find() would be better - that way you don't need two lookups.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:2120
+ if (VF == 1 || Uniforms.count(VF))
+ return;
+ setCostBasedWideningDecision(VF);
----------------
I'm still not sure I understand why this gets called twice for a user-provided VF. Could you explain again?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:5512
auto *Ptr = getPointerOperand(I);
if (LI && isUniform(Ptr))
+ return false;
----------------
Do we still want this check here?
I mean:
1) An instruction with a uniform pointer *can* be widended.
2) That ties in with the way this is used - we check for the uniform case before the widening case, as we should. So, IIUC, we should never actually hit this.
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7022
+ // Choose between Interleaving, Gather/Scatter or Scalarization.
+ unsigned InterleaveCost = (unsigned)(-1);
+ unsigned NumAccesses = 1;
----------------
UINT_MAX
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7036
+ if (InterleaveCost < NumAccesses * 2) {
+ // The interleaving cost is good enough.
+ setWideningDecision(Group, VF, CM_DECISION_INTERLEAVE,
----------------
Why the NumAccesses * 2 cut-off?
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7047
+ ? getGatherScatterCost(&I, VF) * NumAccesses
+ : (unsigned)(-1);
+
----------------
UINT_MAX
================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7060
+ Cost = InterleaveCost;
+ } else if (GatherScatterCost < InterleaveCost &&
+ GatherScatterCost < ScalarizationCost) {
----------------
Why do you need the "GatherScatterCost < InterleaveCost" check here?
Repository:
rL LLVM
https://reviews.llvm.org/D27919
More information about the llvm-commits
mailing list