[PATCH] D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic

Anton Afanasyev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 14 09:49:48 PST 2020


anton-afanasyev added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1556
     /// Do we need to gather this sequence ?
-    enum EntryState { Vectorize, NeedToGather };
+    enum EntryState { Vectorize, ScatterVectorize, NeedToGather };
     EntryState State;
----------------
RKSimon wrote:
> Minor - please add comment explaining the purpose of each enum
Thank you, done!


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2847
+                                     ReuseShuffleIndicies);
+        TE->State = TreeEntry::ScatterVectorize;
+        TE->setOperandsInOrder();
----------------
ABataev wrote:
> anton-afanasyev wrote:
> > ABataev wrote:
> > > anton-afanasyev wrote:
> > > > ABataev wrote:
> > > > > Better to modify `Bundle` parameter in `newTreeEntry` function, something like `Optional<std::pair<EntryState, Bundle>>` and pass `EntryState` directly at function call. You can implement it as a separate NFC patch.
> > > > Ok, I can fix it in a separate patch. But now I doubt that initial memory optimization with `IsScatteredOpds` elimination was worse than this final `EntryState` expansion requiring changes for all `newTreeEntry()` calls.
> > > Yes, it still will be better. Here you will allocate the value on the stack, which will be cleared after function termination. With an extra data member each TreeEntry will retain it till the end of life of the whole tree.
> > Ok
> Looks like you still do not pass the vectorization state as an argument. I thought we agreed to modify `Bundle` related param to be an `Optional` of EntryState and Bundle types? Better to create the entry with the correctly set inner state rather than modify them outside of the class. It breaks encapsulation.
Sorry, missed this comment. Yes, we agreed to do this, but as a separate NFC patch.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3670
+      }
       if (!E->ReorderIndices.empty()) {
         // TODO: Merge this shuffle with the ReuseShuffleCost.
----------------
RKSimon wrote:
> Do we have cases where we end up shuffling scattered ops?
In general, yes, we can end up with reordering, but this could be optimized with preliminary pointers vector shuffling. Though I think it's rather rare case (load-gather plus reordering meeting together), I can prepare separate patch to optimize this after investigation of its impact.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90445/new/

https://reviews.llvm.org/D90445



More information about the llvm-commits mailing list