[PATCH] D125111: [SLP] Make reordering aware of external vectorizable scalar stores.

Fri May 6 16:16:24 PDT 2022

vporpo added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:3528
     if (Optional<OrdersType> CurrentOrder =
             getReorderingData(*TE, /*TopToBottom=*/true)) {
       // Do not include ordering for nodes used in the alt opcode vectorization,
----------------
ABataev wrote:
> vdmitrie wrote:
> > ABataev wrote:
> > > vporpo wrote:
> > > > ABataev wrote:
> > > > > vporpo wrote:
> > > > > > ABataev wrote:
> > > > > > > Maybe just add another analysis somewhere here instead of adding new fields etc.?
> > > > > > We could do the check here, but I think it is a bit more explicit if we have a field in TreeEntry. Also it is very similar in nature to the other reordering data, so I think they should be represented in a similar way. It also helps with debugging because you can actually see it with a `dumpVectorizableTree()` dump just like the other reordering data. Wdyt?
> > > > > I rather doubt that it is wise decision to waste some extra memory just to handle this corner case.
> > > > I don't think that memory is a concern here since the VectorizableTree does not grow too large and we clear it before we build the next one. I think it is not worth making it less explicit just to save some memory. Reordering is already quite complex and without having this explicitly shown in the dump it would just make debugging harder.
> > > > @vdmitrie any thoughts on this? 
> > > > 
> > > The reordering data is not stored in the tree, except for some cases, where this data is also required for correct codegen/cost estimation. I do not like the idea to keep this data in the tree without actually being used for cost/codegen.
> > I do not have strong objections wrt to keeping it in the tree although Alexey's arguments sound very reasonable.  If debug printing is the issue then may be it worth trying to tweak debug printing routine(s) instead? dumpVectorizableTree() for example could collect this data and print it alongside with each tree node.
> Better to teach reordering functions about dumping, rather than put some service data to the tree structure.
It is not just about debugging, it is also about the design.

I agree with Alexey that, we should not keep data in the tree that is purely temporal. But I think in this case it is not temporal data. I believe it is a good design principle to separate phases, just because we can place the analysis in the reorder function, I don't think we should. Please bear in mind that we may decide in the future to extend the analysis to cover more cross-tree cases like this by doing a more extensive search, so the analysis could grow. 

If we do decide to have this analysis as a separate phase, then the natural place for holding the ordering data is the TreeEntry. I don't think this should be restricted to data passed to the cost/codegen phase only. Any data passed from analysis to transformations should qualify, reordering included.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125111/new/

https://reviews.llvm.org/D125111