[llvm-commits] Please review - One more shuffle optimization for AVX

Craig Topper craig.topper at gmail.com
Sun Jun 24 14:24:17 PDT 2012


On Sun, Jun 24, 2012 at 7:10 AM, Rotem, Nadav <nadav.rotem at intel.com> wrote:

> CHECK: test18
> +; CHECK: vshufps
> +; CHECK: vshufps
> +; CHECK: vunpcklps
>
> Check for 'ret' at the end of the test.
>
>                                    DebugLoc dl) {
> -  SDValue V = Insert128BitVector(DAG.getUNDEF(VT), V1, 0, DAG, dl);
> -  return Insert128BitVector(V, V2, NumElems/2, DAG, dl);
> +  SDValue V = DAG.getNode(ISD::UNDEF, dl, VT);
> +
> +  if (V1.getOpcode() != ISD::UNDEF)
> +    V = Insert128BitVector(V, V1, 0, DAG, dl);
> +
> +  if (V2.getOpcode() != ISD::UNDEF)
> +    V = Insert128BitVector(V, V2, NumElems/2, DAG, dl);
> +
> +  return V;
>  }
>
> No need to do this. Craig changed Insert128BitVector so that it checks for
> undef values.
>
> +//
> +// Some special combinations that can be optimized
> +//
>
> What is special about these combinations? Add a period at the end of the
> sentence. Why is this function called Compact8x32ShuffleNode?
>
> +  if (VT.is256BitVector() && (NumElts == 8)) {
>
> You can check that VT == MVT::v8i32.
>
> +    ArrayRef<int> Mask = SVOp->getMask();
> +    if (isUndefOrEqual(Mask[0], 0) &&
> +        isUndefOrEqual(Mask[1], 8) &&
> +        isUndefOrEqual(Mask[2], 2) &&
> +        isUndefOrEqual(Mask[3], 10) &&
> +        isUndefOrEqual(Mask[4], 4) &&
> +        isUndefOrEqual(Mask[5], 12) &&
> +        isUndefOrEqual(Mask[6], 6) &&
> +        isUndefOrEqual(Mask[7], 14)) {
>
> Please create a local array and iterate over it in a loop. Calling a
> function 16 times bloats the code.
>
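A minimal sketch of the loop-based check suggested above, written as standalone C++ rather than against the real SelectionDAG types. The local isUndefOrEqual is a stand-in that assumes a negative mask entry means undef, and matchesPattern and isInterleaveEvenMask are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for the helper used in the patch. Assumption: a negative mask
// entry means "undef" and therefore matches any expected value.
static bool isUndefOrEqual(int Val, int CmpVal) {
  return Val < 0 || Val == CmpVal;
}

// Hypothetical helper: returns true if every mask element is either undef
// or equal to the corresponding element of the expected pattern.
template <std::size_t N>
static bool matchesPattern(const std::array<int, N> &Mask,
                           const int (&Expected)[N]) {
  for (std::size_t i = 0; i != N; ++i)
    if (!isUndefOrEqual(Mask[i], Expected[i]))
      return false;
  return true;
}

// The pattern from the quoted patch, kept in a local read-only array so the
// check is a single loop instead of one call per element.
static bool isInterleaveEvenMask(const std::array<int, 8> &Mask) {
  static const int Expected[8] = {0, 8, 2, 10, 4, 12, 6, 14};
  return matchesPattern(Mask, Expected);
}
```

With this shape, adding another pattern only needs another expected array, not another chain of eight calls.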
> +      int CompactionMask[] = {0, 2, -1, -1, 4, 6, -1, -1};
> +      SDValue Op0 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(0),
> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
> +      SDValue Op1 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(1),
> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
> +      int UnpackMask[] = {0, 8, 1, 9, 4, 12, 5, 13};
>
> Undef can be created once, not 4 times.
>

Also mark those arrays as static const.
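As a rough illustration of both points (one undef value, static const masks), here is a standalone model of the two-step lowering. It assumes the two compacted operands are then combined with UnpackMask, which the quoted hunk does not show. The shuffle function is a simplified stand-in for a two-input vector shuffle, not the DAG.getVectorShuffle API, and compactAndUnpack is a hypothetical name.

```cpp
#include <array>
#include <cassert>

using Vec8 = std::array<int, 8>;

// Simplified model of a two-input shuffle: index M < 8 selects V1[M],
// index M >= 8 selects V2[M - 8], and a negative index yields undef,
// modelled here as the value -1.
static Vec8 shuffle(const Vec8 &V1, const Vec8 &V2, const int (&Mask)[8]) {
  Vec8 R{};
  for (int i = 0; i != 8; ++i) {
    int M = Mask[i];
    R[i] = M < 0 ? -1 : (M < 8 ? V1[M] : V2[M - 8]);
  }
  return R;
}

// Hypothetical model of the lowering: compact the even elements of each
// operand into the low half of each 128-bit lane, then interleave them.
static Vec8 compactAndUnpack(const Vec8 &A, const Vec8 &B) {
  // Masks copied from the quoted patch, marked static const.
  static const int CompactionMask[8] = {0, 2, -1, -1, 4, 6, -1, -1};
  static const int UnpackMask[8] = {0, 8, 1, 9, 4, 12, 5, 13};

  // The undef operand is created once and reused for both shuffles.
  Vec8 Undef;
  Undef.fill(-1);

  Vec8 Op0 = shuffle(A, Undef, CompactionMask);
  Vec8 Op1 = shuffle(B, Undef, CompactionMask);
  return shuffle(Op0, Op1, UnpackMask);
}
```

Under these assumptions the composition selects elements 0, 8, 2, 10, 4, 12, 6, 14 of the concatenated inputs, which is the mask the patch matches.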


>
>
> +}
> +
> +
>
> Remove the extra line breaks.
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Demikhovsky, Elena
> Sent: Sunday, June 24, 2012 15:30
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] Please review - One more shuffle optimization for
> AVX
>
> Hi,
>
> I have a bunch of optimizations for AVX and AVX2 code that I recently did.
> Most of them show a significant speedup on real workloads.
> I'll send them for review one by one, accompanied by appropriate tests.
>
> The current patch optimizes frequently used shuffle patterns and yields
> the following reduction in the instruction sequence.
> Before:
>       vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
>        vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
>        vextractf128    $1, %ymm1, %xmm1
>        vextractf128    $1, %ymm0, %xmm0
>        vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
>        vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
>        vinsertf128     $1, %xmm0, %ymm2, %ymm0
> After:
>       vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
>       vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
>       vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
>
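The element selections in the assembly comments above can be checked by hand. The sketch below models both sequences on plain integer arrays, taking the indices directly from those comments, with A standing for the original ymm0 and B for the original ymm1. The function names are illustrative only, and this models the data movement, not the instructions' encodings.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<int, 4>;
using V8 = std::array<int, 8>;

// "Before": for each 128-bit half, vshufps picks {A1, A3, B1, B3} and
// vpermilps reorders it with [0,2,1,3]; vinsertf128 joins the two halves.
static V8 beforeSequence(const V8 &A, const V8 &B) {
  V8 R{};
  const int Perm[4] = {0, 2, 1, 3};
  for (int Half = 0; Half != 2; ++Half) {
    int Base = Half * 4;
    V4 Shuf = {A[Base + 1], A[Base + 3], B[Base + 1], B[Base + 3]};
    for (int i = 0; i != 4; ++i)
      R[Base + i] = Shuf[Perm[i]];
  }
  return R;
}

// "After": two 256-bit vshufps compactions followed by vunpcklps.
static V8 afterSequence(const V8 &A, const V8 &B) {
  // ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
  V8 B1 = {B[1], B[3], A[0], A[0], B[5], B[7], A[4], A[4]};
  // ymm0 = ymm0[1,3,0,0,5,7,4,4]
  V8 A1 = {A[1], A[3], A[0], A[0], A[5], A[7], A[4], A[4]};
  // ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
  return {A1[0], B1[0], A1[1], B1[1], A1[4], B1[4], A1[5], B1[5]};
}
```

Both models interleave the odd elements of the two inputs, so the three-instruction sequence computes the same result as the original seven.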
>  Thank you
>
> - Elena
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for the
> sole use of the intended recipient(s). Any review or distribution by others
> is strictly prohibited. If you are not the intended recipient, please
> contact the sender and delete all copies.
>
>



-- 
~Craig