[PATCH] D49491: [RFC][VPlan, SLP] Add simple SLP analysis on top of VPlan.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 19 06:09:11 PDT 2018


fhahn added a comment.

Thanks for all the feedback! My initial plan was to make VPlan-based loop vectorization SLP-aware, to improve the cases mentioned earlier where LoopVectorize currently does not generate the best code. That should take some load off SLPVectorizer, but would not replace the current SLPVectorizer for now.

I think in the long term we should work towards re-using as much of the VPlan infrastructure in SLPVectorizer as is sensible and possible. Merging #1 and #2 seems like a good approach: we can get some benefits for loop vectorization relatively quickly and, in the meantime, evolve the whole VPlan infrastructure around it, while having SLPVectorizer and VPlanSLP share common infrastructure where it makes sense and is beneficial. That sharing could proceed from the bottom up: initially use VPlan for code generation, then move up from there to bundle scheduling, cost modelling, and auxiliary analyses like interleaved load/store analysis. In the end we might have "competing" SLP algorithms/strategies, which get evaluated against each other before creating (and executing) the final VPlan.

>> I have no strong opinions on the best approach. My concern is that my current work revolves around a number of issues:
>> 
>> 1 - Generalize alternate vectorisation paths (multiple different vector ops + select/shuffle merges).

Representing alternate vectorization paths/strategies in a way that lets them be easily evaluated against each other is one major benefit of VPlan, so I assume it could be helpful here.

>> 2 - Supporting 'copyable' elements (PR30787).
>> 3 - Pull TTI/vectorization costs from scheduling models (PR36550).

This is great; we should use the same API for both the VPlan cost modelling and the SLPVectorizer.

>> 4 - Using dereferenceable_or_null metadata to vectorise loads with missing elements (PR21780).
>> 5 - Revectorisation of 128-bit vector code to 256-bit vector code (making the most of YMM ops now that the Jaguar model treats them nicely).
>> 
>> All of these are large pieces of work, and I don't want to find myself implementing them in SLPVectorizer only for all that work to be lost, leaving us back at a very basic SLP system again.
>> 
>> How quickly do you expect VPlanSLP to get close to current SLPVectorizer codegen? Ideally I'd like to see SLP tests run on both ASAP.

This depends on when we get initial VPlan-native codegen and cost modelling; I hope those land in the next few months. As mentioned earlier, I could put together a VPlanSLP version operating on scalar code outside loops, so we have something more concrete to compare.

> That was my primary concern, too. But I don't think anyone is proposing to just dump the existing SLP unless the new one has *all* of it (and more).

+1

> And that also means either merging or commoning-up the features above (and all others).

Given the amount of tuning that went into SLPVectorizer over the years, it will take a while for a potential VPlan-based replacement to reach parity. But it would be great if we could agree on the general direction and approach, and I would be more than happy to help make sure the issues described above fit well into VPlanSLP. For the problems mentioned above, I suspect the strategies and concepts are trickier to get right than the implementation details, so collaborating would be very valuable IMO.


https://reviews.llvm.org/D49491




