[PATCH] D114373: [LV] Fix incorrectly marking a pointer indvar as 'scalar'.

Thu Nov 25 05:52:41 PST 2021

david-arm added a comment.

Thanks for the fix @sdesmalen! I've not quite finished reviewing all the test changes, but I have a few comments so far ... :)

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4858
       bool IsUniform = Cost->isUniformAfterVectorization(P, State.VF);
       unsigned Lanes = IsUniform ? 1 : State.VF.getKnownMinValue();

----------------
If we don't expect to widen here anymore (i.e. you removed the `IsUniform && VF.isScalable()` stuff), can we also just assert(IsUniform) and only generate the first lane? If not, I think it's worth having an assert that VF is not scalable here because we're only caching the first known minimum number of lanes here.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5199

-  auto isScalarPtrInduction = [&](Instruction *MemAccess, Value *Ptr) {
-    if (!isa<PHINode>(Ptr) ||
-        !Legal->getInductionVars().count(cast<PHINode>(Ptr)))
-      return false;
-    auto &Induction = Legal->getInductionVars()[cast<PHINode>(Ptr)];
-    if (Induction.getKind() != InductionDescriptor::IK_PtrInduction)
-      return false;
-    return isScalarUse(MemAccess, Ptr);
-  };
-
   // A helper that evaluates a memory access's use of a pointer. If the
   // pointer is actually the pointer induction of a loop, it is being
----------------
I think the comment needs updating here to remove the bits about inductions.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:51
+; CHECK-NEXT:    [[TMP7:%.*]] = mul i64 [[TMP6]], 2
+; CHECK-NEXT:    [[TMP8:%.*]] = mul i64 [[TMP7]], 1
+; CHECK-NEXT:    [[TMP9:%.*]] = mul i64 1, [[TMP8]]
----------------
I'm actually surprised these aren't folded away when creating the mul in IRBuilder, but that's not a problem with your patch!

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:64
+; CHECK-NEXT:    store <vscale x 2 x i8*> [[TMP14]], <vscale x 2 x i8*>* [[TMP16]], align 8
+; CHECK-NEXT:    [[TMP17:%.*]] = extractelement <vscale x 2 x i8*> [[TMP13]], i32 0
+; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr i8, i8* [[TMP17]], i32 0
----------------
I think this may potentially lead to poorer code quality here with the lane extract, although that may be a price worth paying for overall correctness. I'm not certain if the extract will get folded with the vector GEP and transformed into a scalar GEP here. Maybe you could run `-instcombine` on this code to double check the output?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll:169
+; CHECK:       middle.block:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label [[END:%.*]], label [[SCALAR_PH]]
----------------
Do we need all the check lines after this point, since we only really care about the vector loop?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll:137
+; CHECK-NEXT:    [[BC:%.*]] = bitcast <vscale x 2 x i32*> [[TMP7]] to <vscale x 2 x <vscale x 2 x i32>*>
+; CHECK-NEXT:    [[TMP8:%.*]] = extractelement <vscale x 2 x <vscale x 2 x i32>*> [[BC]], i32 0
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <vscale x 2 x i32>, <vscale x 2 x i32>* [[TMP8]], align 8
----------------
Again, it's worth checking whether ot not these extracts get folded away?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114373/new/

https://reviews.llvm.org/D114373