[PATCH] D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic

Thu Nov 5 08:33:41 PST 2020

anton-afanasyev marked 2 inline comments as done.
anton-afanasyev added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:2848
+        // Vectorizing non-consecutive loads with `llvm.masked.gather`.
+        TreeEntry *TE = newTreeEntry(VL, Bundle /*vectorized*/, S, UserTreeIdx,
+                                     ReuseShuffleIndicies);
----------------
ABataev wrote:
> anton-afanasyev wrote:
> > ABataev wrote:
> > > anton-afanasyev wrote:
> > > > ABataev wrote:
> > > > > anton-afanasyev wrote:
> > > > > > ABataev wrote:
> > > > > > > Why it is `vectorized` if this is actually kind of gather?
> > > > > > Actually I think `vectorized` is more appropriate here than `gathered` in sense it is used throughout SLPVectorizer module. Since we actually modify several instruction to _one_. I can change it to `scatter-vectorized`.
> > > > > Could you give a bit more explanation why it should be treated as `vectorized`?
> > > > Ok.
> > > > For now, `enum EntryState {Vectorize, NeedToGather}` has two states: the first is for the bundle that is to be vectorized, the second is to be gathered, but here "gathered" means this bundle will stay untouched after tree vectorization, we need no replace several scalar instructions with one vector instruction. Also we need no handle users of gathered instruction.
> > > > For the "scattered" entry we have opposite case: several scalar instructios to be replaced with `@llvm.masked.gather.*` and `@llvm.masked.scatter.*` intrinsics and we need to handle its users (at least, for `store` = `@llvm.masked.scatter.*` case (the next patch, also using `IsScattered` field)).
> > > > Am I clear?
> > > Shall we handle the users for loads?
> > No, only for stores, of course. Vectorized `load`s are leaves of vectorized tree, whereas stores are seed points. Scattered stores can also be "vectorized" in the sense of being replaced with the one intrinsic.
> Then for loads it is more just a kind a gather, rather than vectorize. Is it required to mark the scalars as vectorized or just enough to mark as gathered? Do we need all this stuff that is required for the vectorized instructions, like inorder, etc.?
Yes, we need this stuff: at least, separate cost estimation, scheduling. We treat this as generalization of vectorization -- like `load <4 x i32>`, `@llvm.masked.gather` loads to `<4 x i32>`, for instance.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90445/new/

https://reviews.llvm.org/D90445