[PATCH] D132892: [LV] Explicily lower uniform load as single load
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 31 03:16:39 PDT 2022
david-arm added a comment.
I think this looks like a nice, logical improvement. Having a new uniform widening decision seems sensible. I have a few mostly minor comments, but I was a little surprised by a couple of test changes.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4262
return WideningDecision == CM_Scalarize;
+
assert(Ptr == getLoadStorePointerOperand(MemAccess) &&
----------------
nit: White space change.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4712
+ // Only try to scalarize the uniform memop itself if legal and we're not
+ // using the direct lowering strategy (which is strictly better).
----------------
I'm not sure what the "if legal" bit refers to here? If we're treating this as uniform then it's because Legal->isUniformMemOp has returned true. I think you could just say:
// Only try to scalarize the uniform memop itself if we're not using the direct lowering ...
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9746
+ // FLAGIT
+ assert(!isUniformMemOp() &&
----------------
Not sure what this comment means?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9783
+ if (isUniformMemOp()) {
+ Value *Addr = State.get(getAddr(), {0, 0});
+ auto *NewLI = Builder.CreateAlignedLoad(ScalarDataTy, Addr,
----------------
I think we normally write this as
Value *Addr = State.get(getAddr(), VPIteration(0, 0));
which is what we've done elsewhere.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9785
+ auto *NewLI = Builder.CreateAlignedLoad(ScalarDataTy, Addr,
+ Alignment);
+ // Add metadata to the load, but setVectorValue to the reverse shuffle.
----------------
Looks like this can fit on the end of the line above?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9786
+ Alignment);
+ // Add metadata to the load, but setVectorValue to the reverse shuffle.
+ State.addMetadata(NewLI, LI);
----------------
Not sure I understand this comment? Why does this involve a reverse shuffle?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1774
+ bool isUniformMemOp() const { return UniformMemOp; }
+
----------------
nit: I think this should have a simple comment like the functions above.
================
Comment at: llvm/test/Transforms/LoopVectorize/X86/cost-model-assert.ll:26
; CHECK: vector.ph:
-; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP2]], 8
+; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP2]], 4
; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP2]], [[N_MOD_VF]]
----------------
This looks a bit unexpected. Do you know why the minimum iteration check has changed here from 8 to 4? It sounds like we've made a different choice of VF and/or interleave count (IC). I can only assume that the cost of some instructions has changed?
================
Comment at: llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll:476
; CHECK-NEXT: loop.0:
-; CHECK-NEXT: CLONE ir<[[L]]> = load ir<%src>
+; CHECK-NEXT: REPLICATE ir<[[L]]> = load ir<%src>
; CHECK-NEXT: EMIT vp<[[SPLICE:%.+]]> = first-order splice ir<%.pn> ir<[[L]]>
----------------
Do you know why this has changed? As far as I understand it, 'CLONE' literally means clone the scalar instruction once, whereas REPLICATE means insert copies into each lane of the vector. So in a sense we're now doing more loads than before I think?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132892/new/
https://reviews.llvm.org/D132892
More information about the llvm-commits
mailing list