[PATCH] D141512: [SLP]Improve isGatherShuffledEntry by looking deeper through the reused scalars.

Wed Jan 18 18:25:59 PST 2023

vdmitrie added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6713-6717
+    if (!(E->getOpcode() == Instruction::Load && !E->isAltShuffle() &&
+          !all_of(E->Scalars, [this](Value *V) { return getTreeEntry(V); }) &&
+          !isSplat(E->Scalars)  &&
+          (E->Scalars == GatheredScalars || GatheredScalars.size() > 2)))
+      GatherShuffle = isGatherShuffledEntry(E, GatheredScalars, Mask, Entries);
----------------
I find this "if" condition quite difficult to follow. 
I tried to simplify it by applying logical NOT into expressions in parens
 and got the following (which, if  I'm not mistaken, is equivalent):

     if (E->getOpcode() != Instruction::Load || E->isAltShuffle() ||
         all_of(E->Scalars, [this](Value *V) { return getTreeEntry(V); }) ||
         isSplat(E->Scalars) ||
         (E->Scalars != GatheredScalars && GatheredScalars.size() <= 2))

where I still don't quite follow the second part.
Besides,  what makes me nervous is use of isAltShuffle() interface here which was not originally designed for gather nodes.  We can gather for multiple reasons including inability to schedule a bundle. We just leave this information not cleared when generate a gather node for such bundle. But originally we only used it to emulate binary operations not present in a hardware to vectorize a set of scalars.
As I understand you are trying to pre-filter gather nodes probably trying to find reshuffles of maximum two distinct scalar instructions before doing further analysis and use isAltShuffle() as an indicator for two ops specifically. But shouldn't we clear instructions state for irrelevant cases when we build a tree
if we are not going to use these nodes for further re-analysis?
We probably could write a reason why scalars are being gathered (directly into a node or as a sidecar data). That would likely simplify many places where we deal with re-analysis to restore that information after we have already built a tree. 

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6750
       }
+      if (!all_of(GatheredScalars, UndefValue::classof))
+        GatherCost += getGatherCost(GatheredScalars);
----------------
sound like this should be inside of getGatherCost().

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8123-8128
+    // Gather nodes usually are not scheduled and inserted before their first
+    // user node. So, instead of checking dependency between the gather nodes
+    // themselves, we check the dependency between their user nodes.
+    // If one user node comes before the second one, we cannot use the second
+    // gather node as the source vector for the first gather node, because in
+    // the list of instructions it will be emitted later.
----------------
Thanks. This is very helpful.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:8389
+      continue;
+    // Fix the entry number for the given scalar. If it the first entry, set
+    // Pair.first to 0, otherwise to 1 (currently select at max 2 nodes).
----------------
typo: "If it is the first entry"

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141512/new/

https://reviews.llvm.org/D141512