[PATCH] D106399: [VectorCombine] Widening of partial vector loads

Wed Jul 21 07:32:45 PDT 2021

spatel added a comment.

Thanks for working on this! I agree that this is useful independent of whatever we can/should do to improve SLP.

================
Comment at: llvm/lib/Transforms/Vectorize/VectorCombine.cpp:290
+
+  // Okay, we currently load less than full worth of the legalized vectors.
+  // If we'd widen the load, would that be more costly than the current load?
----------------
worth -> width

================
Comment at: llvm/lib/Transforms/Vectorize/VectorCombine.cpp:312-313
+  std::iota(Mask.begin(), Mask.end(), 0);
+  Value *SmallVec =
+      Builder.CreateShuffleVector(WideVecLd, PoisonValue::get(WideVecTy), Mask);
+  replaceValue(I, *SmallVec);
----------------
Use the unary variant of `CreateShuffleVector` here.
Consider naming this value `ExtractSubvector`, or say so in the code comment. Can we always assume that the extract op is free, or should we add its potential cost to the profitability check?
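
To make the question concrete, here is a minimal standalone sketch of what I have in mind. It is a model in plain C++, not LLVM code: `unaryShuffle`, `extractSubvectorMask`, and `isWideningProfitable` are hypothetical names, and the cost values are made-up integers rather than anything returned by TTI. It shows that the identity mask built with `std::iota` extracts the low subvector of the wide load, and how the extract cost could be folded into the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Standalone model of a unary shufflevector: Result[i] = Src[Mask[i]].
// This is not LLVM code; it only mirrors the semantics for illustration.
template <typename T>
std::vector<T> unaryShuffle(const std::vector<T> &Src,
                            const std::vector<int> &Mask) {
  std::vector<T> Result;
  Result.reserve(Mask.size());
  for (int Idx : Mask) {
    assert(Idx >= 0 && static_cast<std::size_t>(Idx) < Src.size() &&
           "mask index out of range");
    Result.push_back(Src[static_cast<std::size_t>(Idx)]);
  }
  return Result;
}

// Identity mask <0, 1, ..., NumElts-1>, built with std::iota as in the patch.
// Applied to a wider vector, it extracts the low NumElts elements.
std::vector<int> extractSubvectorMask(std::size_t NumElts) {
  std::vector<int> Mask(NumElts);
  std::iota(Mask.begin(), Mask.end(), 0);
  return Mask;
}

// Hypothetical profitability check (assumed cost model, not the patch's):
// widen only if the wide load plus the extract shuffle is no more expensive
// than the original narrow load.
bool isWideningProfitable(int NarrowLoadCost, int WideLoadCost,
                          int ExtractCost) {
  return WideLoadCost + ExtractCost <= NarrowLoadCost;
}
```

With `ExtractCost` fixed at 0 this reduces to the load-only comparison, so the extra term only matters on targets where the subvector extract is not free.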

================
Comment at: llvm/test/Transforms/VectorCombine/X86/load-inseltpoison.ll:590
 ; CHECK-NEXT:    [[TMP1:%.*]] = bitcast <2 x float>* [[P:%.*]] to <4 x float>*
-; CHECK-NEXT:    [[TMP2:%.*]] = load <4 x float>, <4 x float>* [[TMP1]], align 16
-; CHECK-NEXT:    [[R:%.*]] = shufflevector <4 x float> [[TMP2]], <4 x float> poison, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
+; CHECK-NEXT:    [[TMP2:%.*]] = load <4 x float>, <4 x float>* [[TMP1]], align 4
+; CHECK-NEXT:    [[L:%.*]] = shufflevector <4 x float> [[TMP2]], <4 x float> poison, <2 x i32> <i32 0, i32 1>
----------------
The widened load drops from `align 16` to `align 4` here. Can we preserve the better alignment?
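
One possible shape for that, as a standalone sketch in plain C++ rather than LLVM code: take the larger of the alignment on the original load and whatever alignment is otherwise known for the pointer. `widenedLoadAlign` and `isValidAlign` are hypothetical names, and where the known pointer alignment would come from in the patch is an assumption on my part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// True if A is a non-zero power of two, which is what an alignment must be.
bool isValidAlign(std::uint64_t A) { return A != 0 && (A & (A - 1)) == 0; }

// Hypothetical helper: alignment to put on the widened load.
//   LoadAlign     - alignment recorded on the original narrow load.
//   KnownPtrAlign - alignment otherwise known for the pointer (assumed input).
// Both facts hold for the same address, so the larger one is safe to use.
std::uint64_t widenedLoadAlign(std::uint64_t LoadAlign,
                               std::uint64_t KnownPtrAlign) {
  assert(isValidAlign(LoadAlign) && isValidAlign(KnownPtrAlign) &&
         "alignments must be non-zero powers of two");
  return std::max(LoadAlign, KnownPtrAlign);
}
```

For the case in this test, a narrow load with alignment 4 through a pointer known to be 16-byte aligned would keep `align 16` on the wide load.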

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106399/new/

https://reviews.llvm.org/D106399