SLP/Loop vectorizer pass ordering

Thu Oct 9 08:48:25 PDT 2014

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Zinovy Nis" <zinovy.nis at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser" <tobias at grosser.es>,
> "Chandler Carruth" <chandlerc at google.com>, "Nadav Rotem" <nrotem at apple.com>
> Sent: Thursday, October 9, 2014 10:07:42 AM
> Subject: Re: SLP/Loop vectorizer pass ordering
> 
> 
> The loop vectorizer now sees this loop:
> 
> define void
> @_Z21ambient_occlusion_vecP6_IsectR5vrandILm8EE(%struct._Isect*
> nocapture %isect, %class.vrand* nocapture readonly
> dereferenceable(32) %rng) #0 {
> entry:
>   br label %for.body
> 
> for.body:                                         ; preds =
> %for.inc.for.body_crit_edge, %entry
>   %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next,
>   %for.inc.for.body_crit_edge ]
>   %occlusion.017 = phi float [ 1.000000e+00, %entry ], [ %phitmp,
>   %for.inc.for.body_crit_edge ]
>   %exitcond = icmp eq i64 %indvars.iv, 63
>   br i1 %exitcond, label %for.end, label %for.inc.for.body_crit_edge
> 
> for.inc.for.body_crit_edge:                       ; preds = %for.body
>   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>   %phitmp = fadd fast float %occlusion.017, 1.000000e+00
>   br label %for.body
> 
> for.end:                                          ; preds = %for.body
>   %occlusion.017.lcssa = phi float [ %occlusion.017, %for.body ]
>   %t5 = getelementptr inbounds %struct._Isect* %isect, i64 0, i32 0
>   store float %occlusion.017.lcssa, float* %t5, align 4, !tbaa !1
>   ret void
> }
> 
> Notice that the loop exit block is the loop header and the latch is
> not guaranteed to be executed. The loop vectorizer assumes such
> loops have been rotated.
> 
> 
> If we send this IR through loop-rotate it will vectorize.
> 
> The farther away we move the loop vectorizer from loop rotate, the
> more likely it is that some optimization will destroy the rotated
> form. We might just want to run loop rotate before the loop
> vectorizer ...
> 

I think that makes sense -- I don't recall loop rotation being expensive, plus it preserves just about everything (and I think it does a reasonable job of cleaning up after itself) ;)

I'd say we run some benchmarks, and barring any issues, we just do it.
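
For anyone following along, here is a rough C++ sketch of the two loop shapes being discussed. It is only an illustration of the control flow in the IR above, not code from aobench.cpp; the function names are made up for this example. The first function tests the exit condition in the header, as in the IR the vectorizer rejected. The second is the shape loop-rotate would be expected to produce: a guard in front and the exit test at the bottom, in the latch.

```cpp
// Unrotated shape: the header holds the exit test, so the latch
// (increment + add) is not guaranteed to run on every trip through
// the header. This mirrors for.body / for.inc.for.body_crit_edge.
float occlusion_unrotated() {
  float occlusion = 1.0f;  // %occlusion.017 starts at 1.0
  long i = 0;              // %indvars.iv starts at 0
  for (;;) {
    if (i == 63)           // header: %exitcond = icmp eq i64 %indvars.iv, 63
      break;
    ++i;                   // latch: %indvars.iv.next
    occlusion += 1.0f;     // latch: %phitmp
  }
  return occlusion;
}

// Rotated shape (assumed result of loop-rotate): a guard protects the
// loop, and the exit test moves to the bottom, so every iteration that
// enters the body also executes the latch.
float occlusion_rotated() {
  float occlusion = 1.0f;
  long i = 0;
  if (i != 63) {           // guard, evaluated once before the loop
    do {
      ++i;
      occlusion += 1.0f;
    } while (i != 63);     // exit test now sits in the latch
  }
  return occlusion;
}
```

Both functions compute the same value; only the position of the exit test differs, which is the property the vectorizer's legality check appears to rely on here.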

 -Hal

> 
> 
> > On Oct 9, 2014, at 1:15 AM, Zinovy Nis <zinovy.nis at gmail.com>
> > wrote:
> > 
> > Hi.
> > 
> > Did you have a chance to look at my reproducer?
> > 
> > 2014-10-07 21:34 GMT+04:00 Zinovy Nis <zinovy.nis at gmail.com>:
> >> Hi.
> >> 
> >> I attached a reduced sample, based on
> >> https://code.google.com/p/aobench/.
> >> 
> >> Run it first with an old SLP order:
> >> 
> >> 1) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
> >> -mllvm -debug-only=loop-vectorize -mllvm
> >> -run-slp-after-loop-vectorization=0
> >> 
> >> and then with a new order:
> >> 
> >> 2) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
> >> -mllvm -debug-only=loop-vectorize -mllvm
> >> -run-slp-after-loop-vectorization=1
> >> 
> >> and see the logs:
> >> 
> >> 1) aobench.cpp:59:9: remark: vectorized loop (vectorization
> >> factor: 8,
> >> unrolling interleave factor: 1) [-Rpass=loop-vectorize]
> >> 2) aobench.cpp:59:9: remark: loop ***not*** vectorized: use
> >> -Rpass-analysis=loop-vectorize for more info
> >> [-Rpass-missed=loop-vectorize]
> >> 
> >> LV: Found an unidentified PHI.  %occlusion.017 = phi float [
> >> 1.000000e+00, %entry ], [ %phitmp, %for.inc.for.body_crit_edge ]
> >> LV: Can't vectorize the instructions or CFG
> >> LV: Not vectorizing: Cannot prove legality.
> >> 
> >> 2014-10-06 17:46 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
> >>> ----- Original Message -----
> >>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
> >>>> To: "Hal Finkel" <hfinkel at anl.gov>
> >>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
> >>>> <tobias at grosser.es>, "Chandler Carruth"
> >>>> <chandlerc at google.com>, "Nadav Rotem" <nrotem at apple.com>,
> >>>> "Arnold Schwaighofer" <aschwaighofer at apple.com>
> >>>> Sent: Monday, October 6, 2014 8:44:28 AM
> >>>> Subject: Re: SLP/Loop vectorizer pass ordering
> >>>> 
> >>>> A bit later. At the very least, GVN creates critical edges
> >>>> which the loop vectorizer then cannot handle.
> >>> 
> >>> Okay, please do (this is fairly important) -- if you can extract
> >>> some relevant IR, filing a bug report would be great. Are you
> >>> saying that running SLP early inhibits GVN from creating
> >>> critical edges that the loop vectorizer does not understand?
> >>> 
> >>> Thanks again,
> >>> Hal
> >>> 
> >>>> 
> >>>> 2014-10-06 17:33 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
> >>>>> ----- Original Message -----
> >>>>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
> >>>>>> To: "Chandler Carruth" <chandlerc at google.com>
> >>>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias
> >>>>>> Grosser"
> >>>>>> <tobias at grosser.es>
> >>>>>> Sent: Monday, October 6, 2014 8:19:24 AM
> >>>>>> Subject: Re: SLP/Loop vectorizer pass ordering
> >>>>>> 
> >>>>>> Please wait a while. I'm using it to revert to the old order,
> >>>>>> as the new order introduces a regression in our internal
> >>>>>> benchmark: SLP was creating loop vectorization opportunities
> >>>>>> when it was called before LV. Now no such opportunities are
> >>>>>> available, so we've got a regression.
> >>>>> 
> >>>>> Interesting. Can you provide any further details?
> >>>>> 
> >>>>> -Hal
> >>>>> 
> >>>>>> 
> >>>>>> 2014-10-06 3:28 GMT+04:00 Chandler Carruth
> >>>>>> <chandlerc at google.com>:
> >>>>>>> 
> >>>>>>> On Thu, Sep 4, 2014 at 6:32 AM, James Molloy
> >>>>>>> <james at jamesmolloy.co.uk>
> >>>>>>> wrote:
> >>>>>>>> 
> >>>>>>>> Hi Hal, Chandler,
> >>>>>>>> 
> >>>>>>>> r217144.
> >>>>>>> 
> >>>>>>> 
> >>>>>>> Is anyone still using the option to disable this? If I don't
> >>>>>>> hear
> >>>>>>> anything,
> >>>>>>> I'll remove this option entirely in the next week.
> >>>>>>> 
> >>>>>>> _______________________________________________
> >>>>>>> llvm-commits mailing list
> >>>>>>> llvm-commits at cs.uiuc.edu
> >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>>>>>> 
> >>>>>> 
> >>>>> 
> >>>>> --
> >>>>> Hal Finkel
> >>>>> Assistant Computational Scientist
> >>>>> Leadership Computing Facility
> >>>>> Argonne National Laboratory
> >>>> 
> >>> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory