[PATCH] Calculate vectorization factor using the narrowest type instead of widest type

Tue Apr 14 08:46:23 PDT 2015

On Tue, Apr 14, 2015 at 8:35 AM James Molloy <james at jamesmolloy.co.uk>
wrote:

> Hi Sanjay,
>
> Actually Chandler had the opposite opinion, and I at least have been
> pushing the ARM backends into a shape that they can deal gracefully with
> oversize vectors. The work I've done should affect all backends equally,
> actually.
>

Yes, and I'm really excited about this.

I think oversized vectors, provided they're formed reasonably carefully,
are a good way to express the generic "widening" of operations.

At some point, I think we should be able to remove the unrolling logic from
the loop vectorizer and replace it with over-wide vectors, but that
requires truly superb lowering.

>
> I'd be interested in what others' opinions are here, because I've heard
> both arguments and both seem valid to me!
>
> Cheers,
>
> James
> On Tue, 14 Apr 2015 at 05:33, Sanjay Patel <spatel at rotateright.com> wrote:
>
>> > If I understand this correctly, this will cause us to potentially
>> generate wider vectors than we have underlying vector registers...
>>
>> This reminded me of a question I asked on the dev list a few weeks ago
>> related to this bug:
>> https://llvm.org/bugs/show_bug.cgi?id=20225
>>
>> One of the conclusions I came to was:
>> "The loop vectorizer shouldn't be so eager to generate a
>> larger-than-legal type."
>>
>> I don't think there's been much effort trying to optimize super-wide
>> shuffles.
>>
>> On Sat, Apr 11, 2015 at 10:51 PM, Nadav Rotem <nrotem at apple.com> wrote:
>>
>>>
>>> > On Apr 11, 2015, at 6:41 PM, hfinkel at anl.gov wrote:
>>> >
>>> > [+Arnold, Nadav,Chandler]
>>>
>>> The loop vectorizer chooses a vectorization factor using a cost model
>>> and we stop the search at the widest type to limit cross-register shuffles.
>>> It is very difficult to model the performance impact of cross-register
>>> shuffles (both the quality of the shuffles that we generate and the
>>> performance impact of using multiple registers on register pressure).
>>>
>>>  Using the narrowest element type would increase the in-register
>>> utilization but decrease the utilization of execution units due to
>>> unrolling, so I am not sure what we are gaining here (except for extra
>>> shuffles).
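
To make the trade-off above concrete, here is a minimal sketch, assuming
128-bit vector registers and a loop that mixes i8 and i32 elements. The
helper names maxVF and registersNeeded are hypothetical and are not LLVM
APIs; the real cost model weighs many more factors.

```cpp
#include <cassert>

// Hypothetical helper (not an LLVM API): the largest vectorization factor,
// in lanes, such that a vector of `elementBits`-wide elements still fits
// in one register of `registerBits` bits.
unsigned maxVF(unsigned registerBits, unsigned elementBits) {
  assert(elementBits != 0 && registerBits % elementBits == 0);
  return registerBits / elementBits;
}

// Hypothetical helper: how many physical registers a vector of `vf` lanes
// of `elementBits`-wide elements occupies (rounded up).
unsigned registersNeeded(unsigned vf, unsigned elementBits,
                         unsigned registerBits) {
  assert(registerBits != 0);
  unsigned totalBits = vf * elementBits;
  return (totalBits + registerBits - 1) / registerBits;
}
```

Under these assumptions, bounding the search by the widest type (i32) gives
VF = 4, so the i8 values fill only a quarter of a register. Bounding it by
the narrowest type (i8) gives VF = 16, so each i32 vector spans four
registers and its extends/truncates turn into cross-register shuffles.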
>>>
>>> >
>>> > If I understand this correctly, this will cause us to potentially
>>> generate wider vectors than we have underlying vector registers, and I
>>> think that, generically, this makes sense. Now that our X86 shuffle
>>> handling is sane, the splitting of wide vectors, and shuffling that you get
>>> from vector extends/truncates is hopefully not too bad. Other opinions?
>>> >
>>> > Did you see any performance changes on the test suite?
>>> >
>>> > We might need to update the register-pressure heuristic
>>> (LoopVectorizationCostModel::calculateRegisterUsage()) to understand that
>>> very-wide vectors use multiple vector registers.
>>>
>>> I agree.  Additionally, I think that we should rewrite this code. I made
>>> the mistake of calculating register pressure by scanning the code forward
>>> (start to end) instead of backwards, which made the code unnecessarily
>>> complicated (liveness should be calculated by scanning the basic block
>>> backwards).
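
As a rough illustration of the backward scan described above, here is a
minimal sketch for a single straight-line block. The Inst struct and
maxLiveValues function are hypothetical and are not the code in
calculateRegisterUsage(); the sketch assumes nothing is live out of the
block and counts every value as one register.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical straight-line instruction: defines at most one value and
// reads zero or more previously defined values.
struct Inst {
  int def;                // id of the value defined, or -1 if none
  std::vector<int> uses;  // ids of the values read
};

// Walk the block from the last instruction to the first, maintaining the
// set of live values, and return the largest live-set size seen.
// Assumption: no value is live out of the block.
unsigned maxLiveValues(const std::vector<Inst> &block) {
  std::set<int> live;
  unsigned maxLive = 0;
  for (auto it = block.rbegin(); it != block.rend(); ++it) {
    assert(it->def >= -1 && "def must be a value id or -1");
    live.erase(it->def);        // a value is dead above its definition
    for (int u : it->uses)
      live.insert(u);           // operands are live above their use
    maxLive = std::max<unsigned>(maxLive, live.size());
  }
  return maxLive;
}
```

Going backwards, a value becomes live at its last use and dies at its
definition, so no separate pass is needed to find where each live range
ends. To account for very-wide vectors, each live value could be weighted
by the number of registers it occupies instead of counting as one.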
>>>
>>> >
>>> >
>>> > http://reviews.llvm.org/D8943
>>> >
>>> >
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>