SLP/Loop vectorizer pass ordering

Fri Jul 25 09:15:54 PDT 2014

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Tobias Grosser" <tobias at grosser.es>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Hal Finkel" <hfinkel at anl.gov>, "Nadav Rotem" <nrotem at apple.com>,
> "Erik Eckstein" <eeckstein at apple.com>
> Sent: Friday, July 25, 2014 11:10:12 AM
> Subject: Re: SLP/Loop vectorizer pass ordering
> 
> On Jul 25, 2014, at 8:48 AM, Tobias Grosser <tobias at grosser.es>
> wrote:
> 
> On 25/07/2014 17:41, James Molloy wrote:
> 
> 
> Hi Nadav, Arnold,
> 
> 
> 
> I've come across an interesting optimization problem in one of the
> SPEC benchmarks. There is a loop that can be optimized by both the
> SLP vectorizer and the loop vectorizer (when I patch the loop
> vectorizer to deal with fsub reductions).
> 
> 
> 
> The SLP vectorizer actually makes the performance worse - I think
> this is due to a lack of loop unrolling afterwards. The Loop
> vectorizer can improve the performance.
> 
> 
> 
> However, the loop vectorizer runs after the SLP vectorizer, so it
> never gets a chance. I'd have thought the ideal order would be Loop
> vectorizer -> SLP vectorizer -> BB vectorizer, given that the loop
> vectorizer, if it can run, will probably give a greater speedup than
> SLP.
> 
> 
> 
> The current sequence is SLP vectorizer -> BB vectorizer -> Loop
> vectorizer.
> 
> Even though I was not directly addressed, I'll still reply.
> 
> The proposed new order is what I think makes the most sense. I wonder
> what the reason was for choosing the current order. Nadav, Arnold,
> did you choose this order?
> 
> 
> See my answer to James. We lose noalias information during inlining
> (until recently - thanks Hal for moving forward with the scoped
> noalias feature!).

As of a few minutes ago, you can now test this with -enable-noalias-to-md-conversion -- it is still off by default pending more-comprehensive testing -- please feel free to help provide such testing ;)

Thanks again,
Hal

> 
> 
> We should absolutely evaluate moving the SLPVectorizer out of the
> inliner PM once we have working scoped noalias information.
> 
> 
> Inlining could obscure patterns of parallelism the SLPVectorizer can
> recognize, so there might be some losses - Erik’s work on making the
> SLPVectorizer more robust with respect to recognizing what schedules
> prevent vectorization, plus work on recognizing more (possibly
> mid-tree) patterns, should enable us to deal with some of the fallout,
> I think.

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory