[llvm-dev] IR canonicalization: vector select or shufflevector?

Martin J. O'Riordan via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 29 12:34:47 PDT 2016


I must admit I prefer the shuffle canonicalization, mainly because we have put a lot of effort into finding optimal instruction sequences for obscure shuffle patterns, though we could refactor easily enough to use either.

 

I don’t know which makes the most logical sense in this case though.  Certainly choosing the select pattern better matches OpenCL’s native select interface.

 

            MartinO

 

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Michael Kuperstein via llvm-dev
Sent: 29 August 2016 19:28
To: Philip Reames <listmail at philipreames.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] IR canonicalization: vector select or shufflevector?

 

I have a slight preference for shufflevector. In a sequence of shuffles where only some can be converted into selects (the others have input and output vector sizes that don't match), keeping every step as a shuffle makes the sequence simpler to reason about.

 

I'm not sure this is a particularly good reason, though.
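
As a rough illustration of which shuffles in such a sequence could be rewritten as selects, here is a minimal Python sketch. It is not LLVM code: the helper name is_select_shuffle is hypothetical, and the sketch assumes a fully defined mask (no undef elements). It only encodes the condition described above: the result must have as many lanes as each input, and every lane must take the same-numbered lane from one of the two inputs.

```python
def is_select_shuffle(mask, num_input_elts):
    """Return True if a shufflevector mask could be expressed as a select.

    Hypothetical helper for illustration only. `mask` holds the shuffle
    indices into the concatenation of the two inputs, and `num_input_elts`
    is the lane count of each input vector. Undef mask elements are not
    modelled here.
    """
    n = num_input_elts
    # A select cannot change the vector width.
    if len(mask) != n:
        return False
    # Lane i must read lane i of the first input (index i) or lane i of
    # the second input (index n + i); anything else moves data across lanes.
    return all(idx in (lane, lane + n) for lane, idx in enumerate(mask))


# Same width, no lane crossing: convertible to a select.
print(is_select_shuffle([0, 5, 6, 3], 4))
# Narrowing shuffle (4 lanes in, 2 lanes out): not a select.
print(is_select_shuffle([0, 1], 4))
# Same width, but lanes are swapped: not a select.
print(is_select_shuffle([1, 0, 3, 2], 4))
```

Under these assumptions only the first mask qualifies, which is the mixed situation the preference above is about.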

 

On Mon, Aug 29, 2016 at 8:19 AM, Philip Reames via llvm-dev <llvm-dev at lists.llvm.org> wrote:

I don't have a strong preference, though it is clear we should pick one.  I'd mildly prefer the select form for readability.  From an optimization standpoint, I see reasonable arguments for either.  

Philip

 

On 08/28/2016 12:37 PM, Sanjay Patel via llvm-dev wrote:

A vector select with a constant vector condition operand:

define <4 x i32> @foo(<4 x i32> %a, <4 x i32> %b) {
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  ret <4 x i32> %sel
}


...is equivalent to a shufflevector:

define <4 x i32> @goo(<4 x i32> %a, <4 x i32> %b) {
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
  ret <4 x i32> %shuf
}
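
The correspondence between the two forms can be sketched in Python. This is a model of the semantics for illustration, not LLVM code; the function names are hypothetical and the sketch assumes a constant condition with no undef lanes. A true lane i keeps index i (first operand), and a false lane i becomes index n + i (second operand), where n is the lane count.

```python
def select_mask_to_shuffle_mask(cond):
    """Convert a constant select condition into shufflevector indices.

    Lane i takes element i of the first operand when cond[i] is true,
    otherwise element i of the second operand, which shufflevector
    addresses as index n + i in the concatenation of both operands.
    """
    n = len(cond)
    return [i if c else n + i for i, c in enumerate(cond)]


def vector_select(cond, a, b):
    """Model of a lane-wise select on equal-length vectors."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


def shufflevector(a, b, mask):
    """Model of shufflevector with a fully defined (no undef) mask."""
    concat = list(a) + list(b)
    return [concat[i] for i in mask]


# The condition from @foo above.
cond = [True, False, False, True]
mask = select_mask_to_shuffle_mask(cond)
a, b = [10, 11, 12, 13], [20, 21, 22, 23]

# The derived mask matches the one used in @goo.
assert mask == [0, 5, 6, 3]
# Both forms pick the same elements.
assert vector_select(cond, a, b) == shufflevector(a, b, mask)
```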


For the goal of canonicalization in IR, which of these should we prefer? Some backend / lowering differences for AArch64 and PPC are noted in:
https://llvm.org/bugs/show_bug.cgi?id=28530
https://llvm.org/bugs/show_bug.cgi?id=28531

x86 converts either form optimally in all cases I've looked at.


This question first came up in D22114 ( https://reviews.llvm.org/D22114 ) and is extended in D23886 ( https://reviews.llvm.org/D23886 ) with a constant value example.

 

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

 



 
