[llvm-dev] IR canonicalization: vector select or shufflevector?

Sat Sep 17 13:16:48 PDT 2016

We're canonicalizing to shufflevector with:
https://reviews.llvm.org/rL281787

Please let me know if you see any performance regressions from this change.
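
For readers skimming the thread, here is a minimal sketch of the equivalence under discussion, written as plain Python rather than against any LLVM API. The function names are hypothetical and exist only for illustration. It assumes two operands with the same number of lanes and ignores undef mask elements. Lane i of a select with a constant condition comes from the first operand when the condition is true and from the second operand otherwise; in shufflevector terms, those are mask indices i and i + N.

```python
def select_mask_to_shuffle_mask(cond):
    """Map a constant select condition to an equivalent shuffle mask.

    Hypothetical helper for illustration; not part of LLVM.
    """
    n = len(cond)
    # True  -> lane i of the first operand  (mask index i)
    # False -> lane i of the second operand (mask index i + n)
    return [i if c else i + n for i, c in enumerate(cond)]


def shuffle_mask_to_select_mask(mask, n):
    """Return the select condition equivalent to a shuffle mask over two
    n-lane operands, or None when no select can express the shuffle.

    Hypothetical helper for illustration; not part of LLVM.
    """
    if len(mask) != n:
        return None  # size-changing shuffle: a select cannot change width
    cond = []
    for i, m in enumerate(mask):
        if m == i:
            cond.append(True)    # lane stays in place, from first operand
        elif m == i + n:
            cond.append(False)   # lane stays in place, from second operand
        else:
            return None          # lane moves to a different position
    return cond


# The condition from the example in the original message quoted below.
print(select_mask_to_shuffle_mask([True, False, False, True]))
```

The second helper shows why the mapping is one-way: every constant-condition select has a shuffle form, but a shuffle that changes the vector width or moves a lane to a different position has no select form.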

On Mon, Aug 29, 2016 at 4:57 PM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> From: "Sanjay Patel via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Martin ORiordan" <Martin.ORiordan at movidius.com>
> Cc: "LLVM Developers" <llvm-dev at lists.llvm.org>
> Sent: Monday, August 29, 2016 5:45:51 PM
> Subject: Re: [llvm-dev] IR canonicalization: vector select or shufflevector?
>
> x86 has also put a lot of effort into shuffle lowering...so much so that
> it is its own life-form and brings most online codeviewer apps to their
> knees when you try to open X86ISelLowering.cpp. :)
>
> Given that:
> 1. There are at least 2 targets that lean towards shuffle (Martin's
> comment + x86 uses lowerVSELECTtoVectorShuffle() for all cases like the
> example posted here)
>
> This is irrelevant, as such. We can always transform these into shuffle
> SDAG nodes regardless of how they look in the IR.
>
> That having been said, I'm fine with choosing shuffles as the canonical
> form, over selects with constant vector conditions. If we don't, we'd need
> some utility to abstract away the difference regardless.
>
>  -Hal
>
> 2. Size-changing shuffles are easier to reason about with other shuffles
> (Michael's comment)
> 3. Insert/extract are easier to reason about with shuffles (Eli's comment
> in D22114)
>
> ...we should probably go with shuffle as the canonical encoding. Like
> Philip, I think the select is easier to read in IR (and mentally translate
> to an x86 'blend'), but there's no other advantage for select?
>
> I'll give this thread some more time before posting a patch...in case
> we've missed something.
>
>
>
> On Mon, Aug 29, 2016 at 1:34 PM, Martin J. O'Riordan via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I must admit, I prefer the shuffle canonicalization, but mainly because
>> we have put a lot of effort into finding optimal instruction sequences for
>> obscure shuffle patterns.  But we could refactor easily enough to use
>> either.
>>
>>
>>
>> I don’t know which makes the most logical sense in this case though.
>> Certainly choosing the select pattern better matches OpenCL’s native select
>> interface.
>>
>>
>>
>>             MartinO
>>
>>
>>
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>> Michael Kuperstein via llvm-dev
>> Sent: 29 August 2016 19:28
>> To: Philip Reames <listmail at philipreames.com>
>> Cc: llvm-dev <llvm-dev at lists.llvm.org>
>> Subject: Re: [llvm-dev] IR canonicalization: vector select or shufflevector?
>>
>>
>>
>> I have a slight preference towards shufflevector. It makes sequences of
>> shuffles simpler to reason about when only some of the shuffles can be
>> converted into selects (because the input and output vector sizes of the
>> others don't match).
>>
>>
>>
>> I'm not sure this is a particularly good reason, though.
>>
>>
>>
>> On Mon, Aug 29, 2016 at 8:19 AM, Philip Reames via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> I don't have a strong preference, though it is clear we should pick one.
>> I'd mildly prefer the select form for readability.  From an optimization
>> standpoint, I see reasonable arguments for either.
>>
>> Philip
>>
>>
>>
>> On 08/28/2016 12:37 PM, Sanjay Patel via llvm-dev wrote:
>>
>> A vector select with a constant vector condition operand:
>>
>> define <4 x i32> @foo(<4 x i32> %a, <4 x i32> %b) {
>>   %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
>>   ret <4 x i32> %sel
>> }
>>
>>
>> ...is equivalent to a shufflevector:
>>
>> define <4 x i32> @goo(<4 x i32> %a, <4 x i32> %b) {
>>   %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
>>   ret <4 x i32> %shuf
>> }
>>
>>
>> For the goal of canonicalization in IR, which of these should we prefer?
>> Some backend / lowering differences for AArch64 and PPC are noted in:
>> https://llvm.org/bugs/show_bug.cgi?id=28530
>> https://llvm.org/bugs/show_bug.cgi?id=28531
>>
>> x86 converts either form optimally in all cases I've looked at.
>>
>>
>> This question first came up in D22114 (https://reviews.llvm.org/D22114)
>> and is extended in D23886 (https://reviews.llvm.org/D23886) with a
>> constant value example.
>>
>>
>>
>> _______________________________________________
>>
>> LLVM Developers mailing list
>>
>> llvm-dev at lists.llvm.org
>>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>