[PATCH] D144958: [SLP]Initial support for reshuffling of non-starting buildvector/gather nodes.
Valeriy Dmitriev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 28 18:12:59 PST 2023
vdmitrie added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9435
+ Mask.size() ==
+ cast<FixedVectorType>(V1->getType())->getNumElements() &&
+ all_of(Mask, [=](int Idx) { return Idx < Limit; }) &&
----------------
nit: These two declarations
  unsigned VF1 = cast<FixedVectorType>(V1->getType())->getNumElements();
  unsigned VF2 = cast<FixedVectorType>(V2->getType())->getNumElements();
can be hoisted above the "if" at line 9412, and VF1 can then be reused here.
The "if" condition at 9412 can then be changed to "VF1 != VF2".
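A minimal, self-contained sketch of the hoisting suggested above. It does not use LLVM: `FakeVec` and `maskFitsFirstVector` are hypothetical stand-ins, and returning false when the element counts differ is only an assumption for illustration, since the real body of the "if" at 9412 is not shown in this comment.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a value of fixed vector type (not an LLVM class).
struct FakeVec {
  unsigned NumElements;
};

// Both element counts are computed once up front and reused by every check,
// instead of repeating the cast/getNumElements() expression at each use.
bool maskFitsFirstVector(const FakeVec &V1, const FakeVec &V2,
                         const std::vector<int> &Mask, int Limit) {
  const unsigned VF1 = V1.NumElements;
  const unsigned VF2 = V2.NumElements;
  // Assumption: a mismatch is simply rejected in this sketch.
  if (VF1 != VF2)
    return false;
  // VF1 is reused here rather than being recomputed from V1.
  return Mask.size() == VF1 &&
         std::all_of(Mask.begin(), Mask.end(),
                     [=](int Idx) { return Idx < Limit; });
}
```

The point is only that the two counts are named once and the later comparison reads `Mask.size() == VF1`.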
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9452
+ [=](const TreeEntry *E,
+ ArrayRef<const TreeEntry *> Deps) -> std::optional<Value *> {
+ // No need to delay emission if all deps are ready.
----------------
nit: It looks like std::optional isn't really needed as the return type here. Returning a plain Value * and using nullptr instead of std::nullopt should do the trick (assuming the call site is updated accordingly).
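A rough sketch of the two return conventions being compared. `Value`, `TreeEntry`, `emitIfReadyOptional` and `emitIfReady` are hypothetical stand-ins, not the SLPVectorizer code. The plain-pointer form only works under the assumption that nullptr is never a legitimate result, so that it can mean "emission delayed".

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-ins; the real LLVM classes are far richer.
struct Value {
  int Id;
};
struct TreeEntry {
  Value *Vectorized = nullptr; // non-null once the entry has been emitted
};

// Variant with std::optional: std::nullopt signals "not ready yet".
std::optional<Value *>
emitIfReadyOptional(const std::vector<const TreeEntry *> &Deps, Value *Result) {
  for (const TreeEntry *D : Deps)
    if (!D->Vectorized)
      return std::nullopt;
  return Result;
}

// Variant with a plain pointer: nullptr carries the same "not ready" meaning.
Value *emitIfReady(const std::vector<const TreeEntry *> &Deps, Value *Result) {
  for (const TreeEntry *D : Deps)
    if (!D->Vectorized)
      return nullptr;
  return Result;
}
```

With the second form a caller can write `if (Value *V = emitIfReady(Deps, Res))` instead of testing `has_value()` and then dereferencing the optional.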
================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/PR35865.ll:7
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = extractelement <16 x half> undef, i32 4
-; CHECK-NEXT: [[TMP1:%.*]] = extractelement <16 x half> undef, i32 5
-; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x half> poison, half [[TMP0]], i32 0
-; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x half> [[TMP2]], half [[TMP1]], i32 1
-; CHECK-NEXT: [[TMP4:%.*]] = fpext <2 x half> [[TMP3]] to <2 x float>
-; CHECK-NEXT: [[TMP5:%.*]] = bitcast <2 x float> [[TMP4]] to <2 x i32>
-; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <2 x i32> [[TMP5]], <2 x i32> poison, <8 x i32> <i32 0, i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
-; CHECK-NEXT: [[VECINS_I_5_I1:%.*]] = shufflevector <8 x i32> [[TMP6]], <8 x i32> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 0, i32 1, i32 14, i32 15>
; CHECK-NEXT: ret void
;
----------------
Hm, it looks weird that the SLP vectorizer now acts like a dead code elimination pass here. Maybe update the test so that the code is no longer dead?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144958/new/
https://reviews.llvm.org/D144958