SLP/Loop vectorizer pass ordering

Thu Oct 9 08:07:42 PDT 2014

The loop vectorizer now sees this loop:

define void @_Z21ambient_occlusion_vecP6_IsectR5vrandILm8EE(%struct._Isect* nocapture %isect, %class.vrand* nocapture readonly dereferenceable(32) %rng) #0 {
entry:
  br label %for.body

for.body:                                         ; preds = %for.inc.for.body_crit_edge, %entry
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.inc.for.body_crit_edge ]
  %occlusion.017 = phi float [ 1.000000e+00, %entry ], [ %phitmp, %for.inc.for.body_crit_edge ]
  %exitcond = icmp eq i64 %indvars.iv, 63
  br i1 %exitcond, label %for.end, label %for.inc.for.body_crit_edge

for.inc.for.body_crit_edge:                       ; preds = %for.body
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %phitmp = fadd fast float %occlusion.017, 1.000000e+00
  br label %for.body

for.end:                                          ; preds = %for.body
  %occlusion.017.lcssa = phi float [ %occlusion.017, %for.body ]
  %t5 = getelementptr inbounds %struct._Isect* %isect, i64 0, i32 0
  store float %occlusion.017.lcssa, float* %t5, align 4, !tbaa !1
  ret void
}

Notice that the loop's exiting block is the loop header, so the latch is not guaranteed to be executed. The loop vectorizer assumes that loops have been rotated.

If we send this IR through loop-rotate it will vectorize.
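
To make the two shapes concrete, here is a rough C++ analogue of the IR above. It is only a sketch: the function names are made up for illustration and none of this is taken from aobench. The first function mirrors what the vectorizer sees now (exit test in the header), and the second mirrors the guarded do-while shape that loop rotation typically produces (exit test in the latch).

```cpp
#include <cassert>

// Hypothetical names; a hand-written analogue of the IR, not aobench code.

// Unrotated shape: the exit test sits in the header, so the header is the
// exiting block and, in general, the latch may run zero times.
float occlusionUnrotated() {
  float occlusion = 1.0f;
  long i = 0;
  for (;;) {
    if (i == 63)        // for.body: exit test in the header
      break;
    ++i;                // for.inc.for.body_crit_edge: the latch
    occlusion += 1.0f;
  }
  return occlusion;
}

// Rotated shape: a guard covers the zero-trip case and the exit test moves
// into the latch, so the latch becomes the exiting block.
float occlusionRotated() {
  float occlusion = 1.0f;
  long i = 0;
  if (i != 63) {        // guard (trivially true for these constants)
    do {
      ++i;
      occlusion += 1.0f;
    } while (i != 63);  // exit test in the latch
  }
  return occlusion;
}

// Both shapes perform the same number of increments and agree on the result.
void checkShapesAgree() {
  assert(occlusionUnrotated() == occlusionRotated());
}
```

Both functions compute the same value; only the position of the exit test differs, and that position is what the legality check depends on.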

The farther we move the loop vectorizer from loop rotate, the likelier it is that some optimization will destroy the rotated form. We might just want to run loop rotate before the loop vectorizer ...

> On Oct 9, 2014, at 1:15 AM, Zinovy Nis <zinovy.nis at gmail.com> wrote:
> 
> Hi.
> 
> Did you have a chance to look at my reproducer?
> 
> 2014-10-07 21:34 GMT+04:00 Zinovy Nis <zinovy.nis at gmail.com>:
>> Hi.
>> 
>> I attached a reduced sample, based on https://code.google.com/p/aobench/.
>> 
>> Run it first with an old SLP order:
>> 
>> 1) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
>> -mllvm -debug-only=loop-vectorize -mllvm
>> -run-slp-after-loop-vectorization=0
>> 
>> and then with a new order:
>> 
>> 2) clang -c -Ofast -static -march=core-avx2 aobench.cpp -Rpass=.
>> -mllvm -debug-only=loop-vectorize -mllvm
>> -run-slp-after-loop-vectorization=1
>> 
>> and see the logs:
>> 
>> 1) aobench.cpp:59:9: remark: vectorized loop (vectorization factor: 8,
>> unrolling interleave factor: 1) [-Rpass=loop-vectorize]
>> 2) aobench.cpp:59:9: remark: loop ***not*** vectorized: use
>> -Rpass-analysis=loop-vectorize for more info
>> [-Rpass-missed=loop-vectorize]
>> 
>> LV: Found an unidentified PHI.  %occlusion.017 = phi float [
>> 1.000000e+00, %entry ], [ %phitmp, %for.inc.for.body_crit_edge ]
>> LV: Can't vectorize the instructions or CFG
>> LV: Not vectorizing: Cannot prove legality.
>> 
>> 2014-10-06 17:46 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
>>> ----- Original Message -----
>>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser" <tobias at grosser.es>, "Chandler Carruth"
>>>> <chandlerc at google.com>, "Nadav Rotem" <nrotem at apple.com>, "Arnold Schwaighofer" <aschwaighofer at apple.com>
>>>> Sent: Monday, October 6, 2014 8:44:28 AM
>>>> Subject: Re: SLP/Loop vectorizer pass ordering
>>>> 
>>>> A bit later. At least GVN creates critical edges, which the loop
>>>> vectorizer then fails to handle.
>>> 
>>> Okay, please do (this is fairly important) -- if you can extract some relevant IR, filing a bug report would be great. Are you saying that running SLP early inhibits GVN from creating critical edges that the loop vectorizer does not understand?
>>> 
>>> Thanks again,
>>> Hal
>>> 
>>>> 
>>>> 2014-10-06 17:33 GMT+04:00 Hal Finkel <hfinkel at anl.gov>:
>>>>> ----- Original Message -----
>>>>>> From: "Zinovy Nis" <zinovy.nis at gmail.com>
>>>>>> To: "Chandler Carruth" <chandlerc at google.com>
>>>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>, "Tobias Grosser"
>>>>>> <tobias at grosser.es>
>>>>>> Sent: Monday, October 6, 2014 8:19:24 AM
>>>>>> Subject: Re: SLP/Loop vectorizer pass ordering
>>>>>> 
>>>>>> Please wait a while, I'm using it to revert the new order as it
>>>>>> introduces a regression in our internal benchmark: SLP was creating
>>>>>> loop vectorization opportunities when it was called before LV. Now no
>>>>>> such opportunities are available, so we've got a regression.
>>>>> 
>>>>> Interesting. Can you provide any further details?
>>>>> 
>>>>> -Hal
>>>>> 
>>>>>> 
>>>>>> 2014-10-06 3:28 GMT+04:00 Chandler Carruth <chandlerc at google.com>:
>>>>>>> 
>>>>>>> On Thu, Sep 4, 2014 at 6:32 AM, James Molloy
>>>>>>> <james at jamesmolloy.co.uk>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Hi Hal, Chandler,
>>>>>>>> 
>>>>>>>> r217144.
>>>>>>> 
>>>>>>> 
>>>>>>> Is anyone still using the option to disable this? If I don't hear
>>>>>>> anything, I'll remove this option entirely in the next week.
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> Hal Finkel
>>>>> Assistant Computational Scientist
>>>>> Leadership Computing Facility
>>>>> Argonne National Laboratory
>>>> 
>>> 