[llvm-commits] Please review - One more shuffle optimization for AVX

Craig Topper craig.topper at gmail.com
Mon Jun 25 21:42:31 PDT 2012


+static
+SDValue Compact8x32ShuffleNode(ShuffleVectorSDNode *SVOp,
+                            SelectionDAG &DAG) {

Please align the second line with the parentheses on the first. Also remove
trailing space after static.

+  if (VT == MVT::v8i32 || VT == MVT::v8f32) {

Probably better to reverse the condition and early out instead of making
almost the entire function be inside of an if statement.

Other than those things. LGTM.


On Mon, Jun 25, 2012 at 2:32 AM, Demikhovsky, Elena <
elena.demikhovsky at intel.com> wrote:

>  This is the new patch. All my comments/answers bellow.****
>
> ** **
>
> Thank you for the review.****
>
> ** **
>
> *- Elena*****
>
> *From:* Craig Topper [mailto:craig.topper at gmail.com]
> *Sent:* Monday, June 25, 2012 00:24
> *To:* Rotem, Nadav
> *Cc:* Demikhovsky, Elena; llvm-commits at cs.uiuc.edu
> *Subject:* Re: [llvm-commits] Please review - One more shuffle
> optimization for AVX****
>
> ** **
>
> ** **
>
> On Sun, Jun 24, 2012 at 7:10 AM, Rotem, Nadav <nadav.rotem at intel.com>
> wrote:****
>
> CHECK: test18
> +; CHECK: vshufps
> +; CHECK: vshufps
> +; CHECK: vunpcklps
>
> Check for 'ret' at the end of the test.****
>
> *[Demikhovsky, Elena] Done*
>
>
>
>                                    DebugLoc dl) {
> -  SDValue V = Insert128BitVector(DAG.getUNDEF(VT), V1, 0, DAG, dl);
> -  return Insert128BitVector(V, V2, NumElems/2, DAG, dl);
> +  SDValue V = DAG.getNode(ISD::UNDEF, dl, VT);
> +
> +  if (V1.getOpcode() != ISD::UNDEF)
> +    V = Insert128BitVector(V, V1, 0, DAG, dl);
> +
> +  if (V2.getOpcode() != ISD::UNDEF)
> +    V = Insert128BitVector(V, V2, NumElems/2, DAG, dl);
> +
> +  return V;
>  }
>
> No need to do this. Craig changed Insert128BitVector so that it checks for
> undef values.
>
> ****
>
> *[Demikhovsky, Elena] Done*
>
>
> +//
> +// Some special combinations that can be optimized
> +//
>
> What is special about these combinations ? ****
>
> *[Demikhovsky, Elena] These combinations are frequently seen in many
> benchmarks. This optimization gives 10-20% on many WOLF benchmarks.*
>
> Period at the end of the sentence.  ****
>
> *[Demikhovsky, Elena] Done*
>
> Why is this function called Compact8x32ShuffleNode ?****
>
> *[Demikhovsky, Elena] You can suggest another name.*
>
>
>
> +  if (VT.is256BitVector() && (NumElts == 8)) {
>
> You can check that VT = v8i32;****
>
> *[Demikhovsky, Elena] I changed to (VT==v8i32 || VT==v8f32)*
>
>
>
> +    ArrayRef<int> Mask = SVOp->getMask();
> +    if (isUndefOrEqual(Mask[0], 0) &&
> +        isUndefOrEqual(Mask[1], 8) &&
> +        isUndefOrEqual(Mask[2], 2) &&
> +        isUndefOrEqual(Mask[3], 10) &&
> +        isUndefOrEqual(Mask[4], 4) &&
> +        isUndefOrEqual(Mask[5], 12) &&
> +        isUndefOrEqual(Mask[6], 6) &&
> +        isUndefOrEqual(Mask[7], 14)) {
>
> Please create a local array and iterate over it in a loop. Calling a
> function 16 times bloats the code.****
>
> *[Demikhovsky, Elena] Done*
>
>
>
> +      int CompactionMask[] = {0, 2, -1, -1, 4, 6, -1, -1};
> +      SDValue Op0 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(0),
> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
> +      SDValue Op1 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(1),
> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
> +      int UnpackMask[] = {0, 8, 1, 9, 4, 12, 5, 13};
>
> Undef can be created once, not 4 times.****
>
> *[Demikhovsky, Elena] Done*****
>
>
> Also mark those arrays as static const.****
>
> *[Demikhovsky, Elena] Done*
>
>
>  ****
>
>
>
> +}
> +
> +
>
> Remove the extra line breaks.****
>
> *[Demikhovsky, Elena] Done*****
>
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:
> llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Demikhovsky, Elena
> Sent: Sunday, June 24, 2012 15:30
> To: llvm-commits at cs.uiuc.edu
> Subject: [llvm-commits] Please review - One more shuffle optimization for
> AVX
>
> Hi,
>
> I have a bunch of optimizations for AVX and AVX2 code that I recently did.
> Most of them show significant performance speedup on real workloads.
> I'll send it to review one by one accompanied with appropriate tests.
>
> The current patch optimizes frequently used shuffle patterns and gives
> these instruction sequence reduction.
> Before:
>       vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
>        vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
>        vextractf128    $1, %ymm1, %xmm1
>        vextractf128    $1, %ymm0, %xmm0
>        vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
>        vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
>        vinsertf128     $1, %xmm0, %ymm2, %ymm0
> After:
>       vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 =
> ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
>       vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
>       vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 =
> ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
>
>  Thank you
>
> - Elena
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for the
> sole use of the intended recipient(s). Any review or distribution by others
> is strictly prohibited. If you are not the intended recipient, please
> contact the sender and delete all copies.
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
> ****
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits****
>
>
>
>
> --
> ~Craig****
>  ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>



-- 
~Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120625/460c68a2/attachment.html>


More information about the llvm-commits mailing list