[PATCH] D114373: [LV] Fix incorrectly marking a pointer indvar as 'scalar'.
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 25 05:52:41 PST 2021
david-arm added a comment.
Thanks for the fix @sdesmalen! I've not quite finished reviewing all the test changes, but I have a few comments so far ... :)
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4858
bool IsUniform = Cost->isUniformAfterVectorization(P, State.VF);
unsigned Lanes = IsUniform ? 1 : State.VF.getKnownMinValue();
----------------
If we don't expect to widen here anymore (i.e. you removed the `IsUniform && VF.isScalable()` stuff), can we also just assert(IsUniform) and only generate the first lane? If not, I think it's worth having an assert that VF is not scalable here because we're only caching the first known minimum number of lanes here.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5199
- auto isScalarPtrInduction = [&](Instruction *MemAccess, Value *Ptr) {
- if (!isa<PHINode>(Ptr) ||
- !Legal->getInductionVars().count(cast<PHINode>(Ptr)))
- return false;
- auto &Induction = Legal->getInductionVars()[cast<PHINode>(Ptr)];
- if (Induction.getKind() != InductionDescriptor::IK_PtrInduction)
- return false;
- return isScalarUse(MemAccess, Ptr);
- };
-
// A helper that evaluates a memory access's use of a pointer. If the
// pointer is actually the pointer induction of a loop, it is being
----------------
I think the comment needs updating here to remove the bits about inductions.
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:51
+; CHECK-NEXT: [[TMP7:%.*]] = mul i64 [[TMP6]], 2
+; CHECK-NEXT: [[TMP8:%.*]] = mul i64 [[TMP7]], 1
+; CHECK-NEXT: [[TMP9:%.*]] = mul i64 1, [[TMP8]]
----------------
I'm actually surprised these aren't folded away when creating the mul in IRBuilder, but that's not a problem with your patch!
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:64
+; CHECK-NEXT: store <vscale x 2 x i8*> [[TMP14]], <vscale x 2 x i8*>* [[TMP16]], align 8
+; CHECK-NEXT: [[TMP17:%.*]] = extractelement <vscale x 2 x i8*> [[TMP13]], i32 0
+; CHECK-NEXT: [[TMP18:%.*]] = getelementptr i8, i8* [[TMP17]], i32 0
----------------
I think this may potentially lead to poorer code quality here with the lane extract, although that may be a price worth paying for overall correctness. I'm not certain if the extract will get folded with the vector GEP and transformed into a scalar GEP here. Maybe you could run `-instcombine` on this code to double check the output?
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:169
+; CHECK: middle.block:
+; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[CMP_N]], label [[END:%.*]], label [[SCALAR_PH]]
----------------
Do we need all the check lines after this point, since we only really care about the vector loop?
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll:137
+; CHECK-NEXT: [[BC:%.*]] = bitcast <vscale x 2 x i32*> [[TMP7]] to <vscale x 2 x <vscale x 2 x i32>*>
+; CHECK-NEXT: [[TMP8:%.*]] = extractelement <vscale x 2 x <vscale x 2 x i32>*> [[BC]], i32 0
+; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 2 x i32>, <vscale x 2 x i32>* [[TMP8]], align 8
----------------
Again, it's worth checking whether ot not these extracts get folded away?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114373/new/
https://reviews.llvm.org/D114373
More information about the llvm-commits
mailing list