SLP/Loop vectorizer pass ordering

Mon Oct 13 17:44:36 PDT 2014

(and the flag went in in r219644 and is '-extra-vectorizer-passes', sorry
for failing to mention that)

On Mon, Oct 13, 2014 at 5:44 PM, Chandler Carruth <chandlerc at google.com>
wrote:

> So, I've added a flag (off by default naturally) which adds several passes
> that folks have suggested either here or in other conversations around the
> vectorizers.
>
> The theory behind these suggestions makes a lot of sense to me, but this
> will be one of the hard things to benchmark, so I wanted to just get an
> easy switch in place that anyone could try out and report back results.
>
> I'll start a general discussion about this on a new thread. I think at
> least loop-rotate makes perfect sense, and we'll see if the others seem
> worth their compile-time cost.
>
> On Thu, Oct 9, 2014 at 1:51 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>> On Thu, Oct 9, 2014 at 1:46 PM, Gerolf Hoflehner <ghoflehner at apple.com>
>> wrote:
>>
>>> Are you going to test ARM and x86? Otherwise could you send out your
>>> patch even though it is preliminary?
>>>
>>
>> Only x86 sadly. I'll send it out later today hopefully.
>>
>>>
>>> Thanks
>>> Gerolf
>>>
>>> On Oct 9, 2014, at 12:44 PM, Chandler Carruth <chandlerc at google.com>
>>> wrote:
>>>
>>> I have a patch I've been testing to clean up a lot of the passes around
>>> the vectorizers. I'll add this in and finish testing it, then send it out
>>> with numbers.
>>> On Oct 9, 2014 12:40 PM, "Andrew Trick" <atrick at apple.com> wrote:
>>>
>>>>
>>>> On Oct 9, 2014, at 8:48 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>
>>>> ----- Original Message -----
>>>>
>>>> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
>>>> To: "Zinovy Nis" <zinovy.nis at gmail.com>
>>>> Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVM Commits" <
>>>> llvm-commits at cs.uiuc.edu>, "Tobias Grosser" <tobias at grosser.es>,
>>>> "Chandler Carruth" <chandlerc at google.com>, "Nadav Rotem" <
>>>> nrotem at apple.com>
>>>> Sent: Thursday, October 9, 2014 10:07:42 AM
>>>> Subject: Re: SLP/Loop vectorizer pass ordering
>>>>
>>>>
>>>> The loop vectorizer now sees this loop:
>>>>
>>>> define void @_Z21ambient_occlusion_vecP6_IsectR5vrandILm8EE(%struct._Isect* nocapture %isect, %class.vrand* nocapture readonly dereferenceable(32) %rng) #0 {
>>>> entry:
>>>>  br label %for.body
>>>>
>>>> for.body:                                         ; preds = %for.inc.for.body_crit_edge, %entry
>>>>  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc.for.body_crit_edge ]
>>>>  %occlusion.017 = phi float [ 1.000000e+00, %entry ], [ %phitmp, %for.inc.for.body_crit_edge ]
>>>>  %exitcond = icmp eq i64 %indvars.iv, 63
>>>>  br i1 %exitcond, label %for.end, label %for.inc.for.body_crit_edge
>>>>
>>>> for.inc.for.body_crit_edge:                       ; preds = %for.body
>>>>  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
>>>>  %phitmp = fadd fast float %occlusion.017, 1.000000e+00
>>>>  br label %for.body
>>>>
>>>> for.end:                                          ; preds = %for.body
>>>>  %occlusion.017.lcssa = phi float [ %occlusion.017, %for.body ]
>>>>  %t5 = getelementptr inbounds %struct._Isect* %isect, i64 0, i32 0
>>>>  store float %occlusion.017.lcssa, float* %t5, align 4, !tbaa !1
>>>>  ret void
>>>> }
>>>>
>>>> Notice that the loop exit block is the loop header and the latch is
>>>> not guaranteed to be executed. The loop vectorizer assumes such
>>>> loops have been rotated.
>>>>
>>>>
>>>> If we send this IR through loop-rotate it will vectorize.
>>>>
>>>> The farther away we move the loop vectorizer from loop rotate the
>>>> likelier some optimization will destroy the rotated form. We might
>>>> just want to run loop rotate before the loop vectorizer ...
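
As a source-level picture of what rotation changes, here is a minimal C++ sketch. It is not the output of loop-rotate itself. The function names are made up for illustration, and the constants simply mirror the IR above (start at 1.0, exit when the induction variable reaches 63). The first function keeps the exit test in the header, as in the IR; the second has the guarded, bottom-tested shape that the loop vectorizer is said above to expect.

```cpp
#include <cstdint>

// Unrotated shape (hypothetical name): the exit test sits in the header,
// so the latch (the increment block) is not guaranteed to run each time
// the header is entered.
float occlusion_unrotated() {
    float occlusion = 1.0f;
    std::int64_t iv = 0;
    for (;;) {
        if (iv == 63) break;   // header is the exiting block
        iv += 1;               // latch: induction variable update
        occlusion += 1.0f;     // latch: reduction update
    }
    return occlusion;
}

// Rotated shape (hypothetical name): a guard runs once up front, then the
// exit test moves to the bottom, so the latch is the exiting block and
// every iteration that enters the body also executes the latch.
float occlusion_rotated() {
    float occlusion = 1.0f;
    std::int64_t iv = 0;
    if (iv != 63) {            // guard replaces the first header test
        do {
            iv += 1;
            occlusion += 1.0f;
        } while (iv != 63);    // exit test now lives in the latch
    }
    return occlusion;
}
```

Both functions compute the same value; only the position of the exit test differs. An optimizing compiler may well fold either one to a constant, so this is only meant to show the control-flow shape.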
>>>>
>>>>
>>>> I think that makes sense -- and I don't recall loop rotation being
>>>> expensive, plus it preserves just about everything (and I think does a
>>>> reasonable job cleaning up after itself) ;)
>>>>
>>>> I'd say we run some benchmarks, and barring any issues, we just do it.
>>>>
>>>>
>>>> Well, that is a classic candidate for rotate. So assuming whatever GVN
>>>> is doing is sane, then I’d say it makes sense to rerun rotation.
>>>>
>>>> -Andy
>>>>
>>>>
>>>>
>>>>
>>>> On Oct 9, 2014, at 1:15 AM, Zinovy Nis <zinovy.nis at gmail.com>
>>>> wrote:
>>>>
>>>> Hi.
>>>>
>>>> Did you have a chance to look at my reproducer?
>>>>
>>>> 2014-10-07 21:34 GMT+04:00 Zinovy Nis <zinovy.nis at gmail.com>:
>>>>
>>>> Hi.
>>>>
>>>> I attached a reduced sample, based on
>>>> https://code.google.com/p/aobench/.
>>>>
>>>> Run it first with an old SLP order:
>>>>
>>>> 1) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
>>>> -mllvm -debug-only=loop-vectorize -mllvm
>>>> -run-slp-after-loop-vectorization=0
>>>>
>>>> and then with a new order:
>>>>
>>>> 2) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
>>>> -mllvm -debug-only=loop-vectorize -mllvm
>>>> -run-slp-after-loop-vectorization=1
>>>>
>>>> and see the logs:
>>>>
>>>> 1) aobench.cpp:59:9: remark: vectorized loop (vectorization factor: 8,
>>>> unrolling interleave factor: 1) [-Rpass=loop-vectorize]
>>>> 2) aobench.cpp:59:9: remark: loop ***not*** vectorized: use
>>>> -Rpass-analysis=loop-vectorize for more info [-Rpass-missed=loop-vectorize]
>>>>
>>>> LV: Found an unidentified PHI.  %occlusion.017 = phi float [ 1.000000e+00, %entry ], [ %phitmp, %for.inc.for.body_crit_edge ]
>>>> LV: Can't vectorize the instructions or CFG
>>>> LV: Not vectorizing: Cannot prove legality.
>>>>
>>>> 2014-10-06 17:46 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
>>>>
>>>> ----- Original Message -----
>>>>
>>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
>>>> <tobias at grosser.es>, "Chandler Carruth"
>>>> <chandlerc at google.com>, "Nadav Rotem" <nrotem at apple.com>,
>>>> "Arnold Schwaighofer" <aschwaighofer at apple.com>
>>>> Sent: Monday, October 6, 2014 8:44:28 AM
>>>> Subject: Re: SLP/Loop vectorizer pass ordering
>>>>
>>>> A bit later. At least GVN creates critical edges, which the loop
>>>> vectorizer then does not handle.
>>>>
>>>>
>>>> Okay, please do (this is fairly important) -- if you can extract
>>>> some relevant IR, filing a bug report would be great. Are you
>>>> saying that running SLP early inhibits GVN from creating
>>>> critical edges that the loop vectorizer does not understand?
>>>>
>>>> Thanks again,
>>>> Hal
>>>>
>>>>
>>>> 2014-10-06 17:33 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
>>>>
>>>> ----- Original Message -----
>>>>
>>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
>>>> To: "Chandler Carruth" <chandlerc at google.com>
>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
>>>> <tobias at grosser.es>
>>>> Sent: Monday, October 6, 2014 8:19:24 AM
>>>> Subject: Re: SLP/Loop vectorizer pass ordering
>>>>
>>>> Please wait a while, I'm using it to revert the new order, as it
>>>> introduces a regression in our internal benchmark: SLP was creating loop
>>>> vectorization opportunities when it was called before LV. Now no such
>>>> opportunities are available, so we've got a regression.
>>>>
>>>>
>>>> Interesting. Can you provide any further details?
>>>>
>>>> -Hal
>>>>
>>>>
>>>> 2014-10-06 3:28 GMT+04:00 Chandler Carruth
>>>> <chandlerc at google.com>:
>>>>
>>>>
>>>> On Thu, Sep 4, 2014 at 6:32 AM, James Molloy
>>>> <james at jamesmolloy.co.uk>
>>>> wrote:
>>>>
>>>>
>>>> Hi Hal, Chandler,
>>>>
>>>> r217144.
>>>>
>>>>
>>>>
>>>> Is anyone still using the option to disable this? If I don't hear
>>>> anything, I'll remove this option entirely in the next week.
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>>
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>