[PATCH] D141512: [SLP]Improve isGatherShuffledEntry by looking deeper through the reused scalars.
Valeriy Dmitriev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 13 10:23:32 PST 2023
vdmitrie added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8113
+ SmallPtrSet<Value *, 4> GatheredScalars(VL.begin(), VL.end());
+ auto CheckOrdering = [&](Instruction *LastEI) {
+ // Check if the user node of the TE comes after user node of EntryPtr,
----------------
please describe
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8306-8307
+ auto *PHI1 = cast<PHINode>(V1);
+ // Check that all incoming values are compatible/from same parent (if they
+ // are instructions).
+ for (int I = 0, E = PHI->getNumIncomingValues(); I < E; ++I) {
----------------
IMO, the comment is not much helpful unless "compatible" is described.
We check for specific properties: incoming values are either both constants (undef and/or poison is probably considered a constant as well) or instructions with same opcode having same parent block.
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8321
+ };
+ auto MightBeIgnored = [=](Value *V) {
+ auto *I = dyn_cast<Instruction>(V);
----------------
please add some description
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8326
+ IgnoredVals.assign(UserIgnoreList->begin(), UserIgnoreList->end());
+ return I && !IsSplat && !ScalarToTreeEntry.count(I) &&
+ !isVectorLikeInstWithConstOps(I) &&
----------------
Please comment the code?
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8330
+ };
+ auto NeighborMightBeIgnored = [&](Value *V, int Idx) {
+ Value *V1 = VL[Idx];
----------------
please add some description
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8336
+ UsedInSameVTE = It->second == UsedValuesEntry.find(V)->second;
+ return V != V1 && MightBeIgnored(V1) && !UsedInSameVTE &&
+ getSameOpcode({V, V1}, *TLI).getOpcode() &&
----------------
Please comment the code?
================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8361-8380
+ SmallVector<const TreeEntry *> TempEntries;
+ for (unsigned I = 0, Sz = Entries.size(); I < Sz; ++I) {
+ if (!UsedIdxs.test(I))
+ continue;
+ for (std::pair<unsigned, int> &Pair : EntryLanes)
+ if (Pair.first == I)
+ Pair.first = TempEntries.size();
----------------
Could you please add some comments describing the code?
================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/commutativity.ll:78-87
+; SSE-NEXT: [[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[C:%.*]], i32 0
+; SSE-NEXT: [[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer
+; SSE-NEXT: [[TMP3:%.*]] = insertelement <4 x i32> poison, i32 [[A:%.*]], i32 0
+; SSE-NEXT: [[TMP4:%.*]] = shufflevector <4 x i32> [[TMP3]], <4 x i32> poison, <4 x i32> zeroinitializer
+; SSE-NEXT: [[TMP5:%.*]] = add <4 x i32> [[TMP2]], [[TMP4]]
+; SSE-NEXT: [[TMP6:%.*]] = insertelement <4 x i32> [[TMP3]], i32 [[B:%.*]], i32 1
+; SSE-NEXT: [[TMP7:%.*]] = insertelement <4 x i32> [[TMP6]], i32 [[C]], i32 2
----------------
Why this test changed behavior?
My expectation would be the original scalar code is likely noticeably faster than vectorized for this case.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141512/new/
https://reviews.llvm.org/D141512
More information about the llvm-commits
mailing list