[llvm-dev] IR canonicalization: shufflevector or vector trunc?

Sanjay Patel via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 26 09:22:56 PST 2017


On Thu, Jan 26, 2017 at 6:11 AM, Rackover, Zvi <zvi.rackover at intel.com>
wrote:

> Thanks for providing the example, Sanjay.
>
>
>
> Here’s a case that may be worth mentioning:
>
>
>
> For the following equivalent (little-endian) functions:
>
>
>
> define i32 @A(<8 x i32> %V) {
>
>    %S = shufflevector <8 x i32> %V, <8 x i32> undef, <4 x i32> <i32 0, i32
> 2, i32 4, i32 6>
>
>    %E = extractelement <4 x i32> %S, i32 1
>
>    ret i32 %E
>
> }
>
>
>
> define i32 @B(<8 x i32> %V) {
>
>    %B = bitcast <8 x i32> %V to <4 x i64>
>
>    %T = trunc <4 x i64> %B to <4 x i32>
>
>    %E = extractelement <4 x i32> %T, i32 1
>
>    ret i32 %E
>
> }
>
>
>
> ‘opt –instcombine’ gives:
>
>
>
> define i32 @A(<8 x i32> %V) {
>
>   %E = extractelement <8 x i32> %V, i32 2
>
>   ret i32 %E
>
> }
>
>
>
> define i32 @B(<8 x i32> %V) {
>
>   %B = bitcast <8 x i32> %V to <4 x i64>
>
>   %1 = extractelement <4 x i64> %B, i32 1
>
>   %E = trunc i64 %1 to i32
>
>   ret i32 %E
>
> }
>
>
>
> Was the ‘truncate’ form in function @B combined to lesser preferable form?
>

Yes, I don't think there's any question that @A is the canonical form of
those 2 options. Excellent counter-example. :)
Filed as:
https://llvm.org/bugs/show_bug.cgi?id=31770


> Not sure if this is true because running both combined function through
> ‘llc’ on X86-64 results with identical generated code.
>

I think this is irrelevant. If there was a DAG canonicalization providing a
backstop, then we might just say, "Who cares?", but there isn't:


$ ./opt -instcombine shtr.ll -S | ./llc -o - -mtriple=powerpc64
A:                                      # @A
    mr 3, 5
    blr
B:                                      # @B
    rotldi     3, 5, 32
    rlwimi 3, 6, 0, 0, 31
    blr





> --Zvi
>
>
>
> *From:* Sanjay Patel [mailto:spatel at rotateright.com]
> *Sent:* Saturday, January 21, 2017 21:30
>
> *To:* Rackover, Zvi <zvi.rackover at intel.com>
> *Cc:* Friedman, Eli <efriedma at codeaurora.org>; llvm-dev <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] IR canonicalization: shufflevector or vector
> trunc?
>
>
>
>
>
>
>
> On Thu, Jan 19, 2017 at 9:17 AM, Rackover, Zvi <zvi.rackover at intel.com>
> wrote:
>
> Hi Sanjay,
>
>
>
> I agree we should also discuss **if** this canonicalization is beneficial.
>
> For starters, do we have a concrete case where we would benefit from
> canonicalizing shuffles <-> truncates in LLVM IR?
>
> IMO, we should not count benefits for codegen because that alone does not
> justify transforming the IR ; we could always do this on the SelectionDAG.
>
>
>
> Agreed. If we're just talking about IR benefits, then it's easy to
> demonstrate a win for trunc/zext based on value tracking:
>
> target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" ;
> little-endian
>
> define <4 x i32> @shuffle(<4 x i64> %x) {
>   %y = shl <4 x i64> %x, <i64 32, i64 32, i64 32, i64 32> ; low half of
> each elt is zero
>   %bc = bitcast <4 x i64> %y to <8 x i32> ; even index elements are all
> zero
>   %trunc = shufflevector <8 x i32> %bc, <8 x i32> undef, <4 x i32> <i32 0,
> i32 2, i32 4, i32 6>
>   ret <4 x i32> %trunc
> }
>
> define <4 x i32> @trunc(<4 x i64> %x) {
>   %y = shl <4 x i64> %x <i64 32, i64 32, i64 32, i64 32> ; low half of
> each elt is zero
>   %trunc = trunc <4 x i64> %y to <4 x i32> ; so this must be zero...
>   ret <4 x i32> %trunc
> }
>
>
> $ ./opt -instsimplify 31551.ll -S
> ...
> define <4 x i32> @shuffle(<4 x i64> %x) {
>   %y = shl <4 x i64> %x, <i64 32, i64 32, i64 32, i64 32>
>   %bc = bitcast <4 x i64> %y to <8 x i32>
>   %trunc = shufflevector <8 x i32> %bc, <8 x i32> undef, <4 x i32> <i32 0,
> i32 2, i32 4, i32 6>
>   ret <4 x i32> %trunc
> }
>
> define <4 x i32> @trunc(<4 x i64> %x) {
>   ret <4 x i32> zeroinitializer
> }
>
> Of course, this is something I invented as an example, but AFAIK we have
> better value tracking for trunc/zext than shuffle, so we'll have an easier
> time folding the IR if that is possible.
>
>
>
>
>
>
> --Zvi
>
>
>
> *From:* Sanjay Patel [mailto:spatel at rotateright.com]
> *Sent:* Tuesday, January 17, 2017 18:38
>
>
> *To:* Rackover, Zvi <zvi.rackover at intel.com>
> *Cc:* Friedman, Eli <efriedma at codeaurora.org>; llvm-dev <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] IR canonicalization: shufflevector or vector
> trunc?
>
>
>
> We use InstCombiner::ShouldChangeType() to prevent transforms to illegal
> integer types, but I'm not sure how that would apply to vector types.
>
> Ie, let's say v256 is a legal type in your example. DataLayout doesn't
> appear to specify what configurations of a 256-bit vector are legal, so I
> don't think we can currently use that to say v2i128 should be treated
> differently than v16i16.
>
> Is this a valid argument to not canonicalize the IR?
>
>
>
> On Mon, Jan 16, 2017 at 10:16 AM, Rackover, Zvi <zvi.rackover at intel.com>
> wrote:
>
> Suppose we prefer the ‘trunc’ form, then what about cases such as:
>
> define <2 x i16> @shuffle(<16 x i16> %x) {
>
>   %shuf = shufflevector <16 x i16> %x, <16 x i16> undef, <2 x i32> <i32 0,
> i32 8>
>
>   ret <2 x i16> %shuf
>
> }
>
>
>
> Will the ‘shufflevector’ be canonicalized to a ‘trunc’ of a vector of i128?
>
> define <2 x i16> @trunc(<16 x i16> %x) {
>
>   %bc = bitcast <16 x i16> %x to <2 x i128>
>
>   %tr = trunc <2 x i128> %bc to <2 x i16>
>
>   ret <2 x i16> %tr
>
> }
>
> This may challenge the Legalizer downstream.
>
>
>
> --Zvi
>
>
>
> *From:* Sanjay Patel [mailto:spatel at rotateright.com]
> *Sent:* Friday, January 13, 2017 18:19
> *To:* Rackover, Zvi <zvi.rackover at intel.com>
> *Cc:* Friedman, Eli <efriedma at codeaurora.org>; llvm-dev <
> llvm-dev at lists.llvm.org>
>
>
> *Subject:* Re: [llvm-dev] IR canonicalization: shufflevector or vector
> trunc?
>
>
>
> Right - I think that case looks like this for little endian:
>
> define <2 x i32> @zextshuffle(<2 x i16> %x) {
>   %zext_shuffle = shufflevector <2 x i16> %x, <2 x i16> zeroinitializer,
> <4 x i32> <i32 0, i32 2, i32 1, i32 2>
>   %bc = bitcast <4 x i16> %zext_shuffle to <2 x i32>
>   ret <2 x i32> %bc
> }
>
> define <2 x i32> @zextvec(<2 x i16> %x) {
>   %zext = zext <2 x i16> %x to <2 x i32>
>   ret <2 x i32> %zext
> }
>
> IMO, the fact that we have to take endianness into account with the
> shuffles makes the trunc/zext forms the better choice. That way, we limit
> the endian dependency to one place in InstCombine, and other transforms
> don't have to worry about it. We also have lots of existing folds for
> trunc/zext and hardly any for shuffles.
>
>
>
>
>
> On Thu, Jan 12, 2017 at 1:14 PM, Rackover, Zvi <zvi.rackover at intel.com>
> wrote:
>
> Just to add, there is also the ‘zext’ – ‘shuffle with zero’ duality which
> can broaden the discussion.
>
>
>
> --Zvi
>
>
>
> *From:* Sanjay Patel [mailto:spatel at rotateright.com]
> *Sent:* Thursday, January 12, 2017 20:19
> *To:* Friedman, Eli <efriedma at codeaurora.org>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>; Rackover, Zvi <
> zvi.rackover at intel.com>
> *Subject:* Re: [llvm-dev] IR canonicalization: shufflevector or vector
> trunc?
>
>
>
>
>
>
>
> On Thu, Jan 12, 2017 at 11:06 AM, Friedman, Eli <efriedma at codeaurora.org>
> wrote:
>
> On 1/12/2017 9:04 AM, Sanjay Patel via llvm-dev wrote:
>
> It's time for another round of "What is the canonical IR?"
>
> Credit for this episode to Zvi and PR31551. :)
> https://llvm.org/bugs/show_bug.cgi?id=31551
>
> define <4 x i16> @shuffle(<16 x i16> %x) {
>
>   %shuf = shufflevector <16 x i16> %x, <16 x i16> undef, <4 x i32> <i32 0, i32 4, i32 8, i32 12>
>
>   ret <4 x i16> %shuf
>
> }
>
>
>
> define <4 x i16> @trunc(<16 x i16> %x) {
>
>   %bc = bitcast <16 x i16> %x to <4 x i64>
>
>   %tr = trunc <4 x i64> %bc to <4 x i16>
>
>   ret <4 x i16> %tr
>
> }
>
>
>
> Potential reasons to prefer one or the other:
> 1. Shuffle is the most compact.
> 2. Trunc is easier to read.
> 3. One of these is easier for value tracking.
> 4. Compatibility with existing IR transforms (eg, InterleavedAccess
> recognizes the shuffle form).
>
> 5. We don't create arbitrary shuffle masks in IR because that's bad for a
> lot of targets (but maybe this mask pattern should always be recognized as
> special?).
>
>
>
> Hmm... not sure what the right answer is, but a couple more observations:
> 1. If we're going to canonicalize, we should probably canonicalize the
> same way independent of the original argument type (so we would introduce
> bitcasts either way).
>
>
>
> Ah, right - kill #1 in my list.
>
>
>
> 2. Those two functions are only equivalent on little-endian platforms.
>
>
>
> I was wondering about that. So yes, if we do want to canonicalize (until
> the recent compile-time complaints, I always thought this was the objective
> of InstCombine...maybe it still is), then the masks we're matching or
> generating will differ based on endianness.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170126/4c21f7c6/attachment.html>


More information about the llvm-dev mailing list