[llvm] r180802 - InstCombine: Fold more shuffles of shuffles.
Hal Finkel
hfinkel at anl.gov
Tue Apr 30 14:11:17 PDT 2013
Nadav,
Can this be dealt with in the backend? The single folded shuffle seems to be the better canonical form.
-Hal
----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Jim Grosbach" <grosbach at apple.com>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Tuesday, April 30, 2013 4:00:57 PM
> Subject: Re: [llvm] r180802 - InstCombine: Fold more shuffles of shuffles.
>
>
> Hi Jim,
>
>
> AVX does not allow cross-lane shuffles within a single vector. Your
> change can potentially introduce regressions for hand-written code
> that assumes the current LLVM behavior.
>
>
> Thanks,
> Nadav
>
>
>
> On Apr 30, 2013, at 1:43 PM, Jim Grosbach < grosbach at apple.com >
> wrote:
>
>
>
> Author: grosbach
> Date: Tue Apr 30 15:43:52 2013
> New Revision: 180802
>
> URL: http://llvm.org/viewvc/llvm-project?rev=180802&view=rev
> Log:
> InstCombine: Fold more shuffles of shuffles.
>
> Always fold a shuffle-of-shuffle into a single shuffle when there's only one
> input vector in the first place. Continue to be more conservative when there's
> multiple inputs.
>
> rdar://13402653
> PR15866
>
> Modified:
> llvm/trunk/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
> llvm/trunk/test/Transforms/BBVectorize/simple.ll
> llvm/trunk/test/Transforms/InstCombine/vec_shuffle.ll
>
> Modified:
> llvm/trunk/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineVectorOps.cpp?rev=180802&r1=180801&r2=180802&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/InstCombine/InstCombineVectorOps.cpp (original)
> +++ llvm/trunk/lib/Transforms/InstCombine/InstCombineVectorOps.cpp Tue Apr 30 15:43:52 2013
> @@ -614,11 +614,16 @@ Instruction *InstCombiner::visitShuffleV
> // we are absolutely afraid of producing a shuffle mask not in the input
> // program, because the code gen may not be smart enough to turn a merged
> // shuffle into two specific shuffles: it may produce worse code. As such,
> - // we only merge two shuffles if the result is either a splat or one of the
> - // input shuffle masks. In this case, merging the shuffles just removes
> - // one instruction, which we know is safe. This is good for things like
> + // we only merge two shuffles if the result is a splat, one of the input
> + // input shuffle masks, or if there's only one input to the shuffle.
> + // In this case, merging the shuffles just removes one instruction, which
> + // we know is safe. This is good for things like
> // turning: (splat(splat)) -> splat, or
> // merge(V[0..n], V[n+1..2n]) -> V[0..2n]
> + //
> + // FIXME: This is almost certainly far, far too conservative. We should
> + // have a better model. Perhaps a TargetTransformInfo hook to ask whether
> + // a shuffle is considered OK?
> ShuffleVectorInst* LHSShuffle = dyn_cast<ShuffleVectorInst>(LHS);
> ShuffleVectorInst* RHSShuffle = dyn_cast<ShuffleVectorInst>(RHS);
> if (LHSShuffle)
> if (LHSShuffle)
> @@ -743,8 +748,10 @@ Instruction *InstCombiner::visitShuffleV
> }
>
> // If the result mask is equal to one of the original shuffle masks,
> - // or is a splat, do the replacement.
> - if (isSplat || newMask == LHSMask || newMask == RHSMask || newMask == Mask) {
> + // or is a splat, do the replacement. Similarly, if there is only one
> + // input vector, go ahead and do the folding.
> + if (isSplat || newMask == LHSMask || newMask == RHSMask || newMask == Mask ||
> + isa<UndefValue>(RHS)) {
> SmallVector<Constant*, 16> Elts;
> Type *Int32Ty = Type::getInt32Ty(SVI.getContext());
> for (unsigned i = 0, e = newMask.size(); i != e; ++i) {
>
> Modified: llvm/trunk/test/Transforms/BBVectorize/simple.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/BBVectorize/simple.ll?rev=180802&r1=180801&r2=180802&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/BBVectorize/simple.ll (original)
> +++ llvm/trunk/test/Transforms/BBVectorize/simple.ll Tue Apr 30 15:43:52 2013
> @@ -139,11 +139,10 @@ define <8 x i8> @test6(<8 x i8> %A1, <8
> ; CHECK: %Z1 = add <16 x i8> %Y1, %X1.v.i1
> %Q1 = shufflevector <8 x i8> %Z1, <8 x i8> %Z2, <8 x i32> <i32 15, i32 8, i32 6, i32 1, i32 13, i32 10, i32 4, i32 3>
> %Q2 = shufflevector <8 x i8> %Z2, <8 x i8> %Z2, <8 x i32> <i32 6, i32 7, i32 0, i32 1, i32 2, i32 4, i32 4, i32 1>
> -; CHECK: %Q1.v.i1 = shufflevector <16 x i8> %Z1, <16 x i8> undef, <16 x i32> <i32 8, i32 undef, i32 10, i32 undef, i32 undef, i32 13, i32 undef, i32 15, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
> -; CHECK: %Q1 = shufflevector <16 x i8> %Z1, <16 x i8> %Q1.v.i1, <16 x i32> <i32 23, i32 16, i32 6, i32 1, i32 21, i32 18, i32 4, i32 3, i32 14, i32 15, i32 8, i32 9, i32 10, i32 12, i32 12, i32 9>
> - %R = mul <8 x i8> %Q1, %Q2
> -; CHECK: %Q1.v.r1 = shufflevector <16 x i8> %Q1, <16 x i8> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
> -; CHECK: %Q1.v.r2 = shufflevector <16 x i8> %Q1, <16 x i8> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
> + %R = mul <8 x i8> %Q1, %Q2
> +; CHECK: %Q1.v.i1 = shufflevector <16 x i8> %Z1, <16 x i8> undef, <16 x i32> <i32 8, i32 undef, i32 10, i32 undef, i32 undef, i32 13, i32 undef, i32 15, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
> +; CHECK: %Q1.v.r1 = shufflevector <16 x i8> %Z1, <16 x i8> %Q1.v.i1, <8 x i32> <i32 23, i32 16, i32 6, i32 1, i32 21, i32 18, i32 4, i32 3>
> +; CHECK: %Q1.v.r2 = shufflevector <16 x i8> %Z1, <16 x i8> undef, <8 x i32> <i32 14, i32 15, i32 8, i32 9, i32 10, i32 12, i32 12, i32 9>
> ; CHECK: %R = mul <8 x i8> %Q1.v.r1, %Q1.v.r2
> ret <8 x i8> %R
> ; CHECK: ret <8 x i8> %R
>
> Modified: llvm/trunk/test/Transforms/InstCombine/vec_shuffle.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/vec_shuffle.ll?rev=180802&r1=180801&r2=180802&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/InstCombine/vec_shuffle.ll (original)
> +++ llvm/trunk/test/Transforms/InstCombine/vec_shuffle.ll Tue Apr 30 15:43:52 2013
> @@ -86,14 +86,14 @@ define <4 x i8> @test9(<16 x i8> %tmp6)
> }
>
> ; Same as test9, but make sure that "undef" mask values are not confused with
> -; mask values of 2*N, where N is the mask length. These shuffles should not
> -; be folded (because [8,9,4,8] may not be a mask supported by the target).
> -define <4 x i8> @test9a(<16 x i8> %tmp6) nounwind {
> +; mask values of 2*N, where N is the mask length of the result. Make sure when
> +; folding these shuffles that 'undef' mask values stay that way in the result
> +; instead of getting mapped to the 2*N'th entry of the source.
> +define <4 x i8> @test9a(<16 x i8> %in, <16 x i8> %in2) nounwind {
> ; CHECK: @test9a
> -; CHECK-NEXT: shufflevector
> -; CHECK-NEXT: shufflevector
> +; CHECK-NEXT: shufflevector <16 x i8> %in, <16 x i8> %in2, <4 x i32> <i32 16, i32 9, i32 4, i32 undef>
> ; CHECK-NEXT: ret
> - %tmp7 = shufflevector <16 x i8> %tmp6, <16 x i8> undef, <4 x i32> < i32 undef, i32 9, i32 4, i32 8 > ; <<4 x i8>> [#uses=1]
> + %tmp7 = shufflevector <16 x i8> %in, <16 x i8> %in2, <4 x i32> < i32 undef, i32 9, i32 4, i32 16 > ; <<4 x i8>> [#uses=1]
> %tmp9 = shufflevector <4 x i8> %tmp7, <4 x i8> undef, <4 x i32> < i32 3, i32 1, i32 2, i32 0 > ; <<4 x i8>> [#uses=1]
> ret <4 x i8> %tmp9
> }
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
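For reference, the single-input fold that the commit log describes can be illustrated outside LLVM. The sketch below is not the InstCombine code: `composeMasks` is a hypothetical helper, and -1 is used here as a stand-in for an undef mask element. It assumes the outer shuffle's second operand is undef (the `isa<UndefValue>(RHS)` case in the diff), so an outer index that is undef, or that points past the inner shuffle's result, produces undef.

```cpp
#include <vector>

// Hypothetical helper (not an LLVM API): fold outer(inner(V...), undef) into
// one mask over the inner shuffle's operands. -1 denotes an undef element.
std::vector<int> composeMasks(const std::vector<int> &inner,
                              const std::vector<int> &outer) {
  const int n = static_cast<int>(inner.size());
  std::vector<int> result;
  result.reserve(outer.size());
  for (int idx : outer) {
    if (idx < 0 || idx >= n) {
      // Undef index, or an index into the undef second operand of the outer
      // shuffle: the folded element stays undef.
      result.push_back(-1);
    } else {
      // Otherwise look through the inner shuffle. inner[idx] may itself be
      // -1, in which case undef is preserved rather than remapped.
      result.push_back(inner[idx]);
    }
  }
  return result;
}
```

With the masks from test9a (inner <undef, 9, 4, 16>, outer <3, 1, 2, 0>), this yields <16, 9, 4, undef>, which is the mask in the new CHECK-NEXT line. Whether a target can lower the folded mask efficiently is a separate question that this sketch does not address.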