[llvm-commits] Please review - One more shuffle optimization for AVX

Craig Topper craig.topper at gmail.com
Mon Jun 25 21:54:35 PDT 2012


I suppose MaskToOptimize_Even should have the underscore removed to match
coding standards.

On Mon, Jun 25, 2012 at 9:42 PM, Craig Topper <craig.topper at gmail.com>wrote:

> +static
> +SDValue Compact8x32ShuffleNode(ShuffleVectorSDNode *SVOp,
> +                            SelectionDAG &DAG) {
>
> Please align the second line with the parentheses on the first. Also
> remove trailing space after static.
>
> +  if (VT == MVT::v8i32 || VT == MVT::v8f32) {
>
> Probably better to reverse the condition and early out instead of making
> almost the entire function be inside of an if statement.
>
> Other than those things. LGTM.
>
>
>
> On Mon, Jun 25, 2012 at 2:32 AM, Demikhovsky, Elena <
> elena.demikhovsky at intel.com> wrote:
>
>>  This is the new patch. All my comments/answers bellow.****
>>
>> ** **
>>
>> Thank you for the review.****
>>
>> ** **
>>
>> *- Elena*****
>>
>> *From:* Craig Topper [mailto:craig.topper at gmail.com]
>> *Sent:* Monday, June 25, 2012 00:24
>> *To:* Rotem, Nadav
>> *Cc:* Demikhovsky, Elena; llvm-commits at cs.uiuc.edu
>> *Subject:* Re: [llvm-commits] Please review - One more shuffle
>> optimization for AVX****
>>
>> ** **
>>
>> ** **
>>
>> On Sun, Jun 24, 2012 at 7:10 AM, Rotem, Nadav <nadav.rotem at intel.com>
>> wrote:****
>>
>> CHECK: test18
>> +; CHECK: vshufps
>> +; CHECK: vshufps
>> +; CHECK: vunpcklps
>>
>> Check for 'ret' at the end of the test.****
>>
>> *[Demikhovsky, Elena] Done*
>>
>>
>>
>>                                    DebugLoc dl) {
>> -  SDValue V = Insert128BitVector(DAG.getUNDEF(VT), V1, 0, DAG, dl);
>> -  return Insert128BitVector(V, V2, NumElems/2, DAG, dl);
>> +  SDValue V = DAG.getNode(ISD::UNDEF, dl, VT);
>> +
>> +  if (V1.getOpcode() != ISD::UNDEF)
>> +    V = Insert128BitVector(V, V1, 0, DAG, dl);
>> +
>> +  if (V2.getOpcode() != ISD::UNDEF)
>> +    V = Insert128BitVector(V, V2, NumElems/2, DAG, dl);
>> +
>> +  return V;
>>  }
>>
>> No need to do this. Craig changed Insert128BitVector so that it checks
>> for undef values.
>>
>> ****
>>
>> *[Demikhovsky, Elena] Done*
>>
>>
>> +//
>> +// Some special combinations that can be optimized
>> +//
>>
>> What is special about these combinations ? ****
>>
>> *[Demikhovsky, Elena] These combinations are frequently seen in many
>> benchmarks. This optimization gives 10-20% on many WOLF benchmarks.*
>>
>> Period at the end of the sentence.  ****
>>
>> *[Demikhovsky, Elena] Done*
>>
>> Why is this function called Compact8x32ShuffleNode ?****
>>
>> *[Demikhovsky, Elena] You can suggest another name.*
>>
>>
>>
>> +  if (VT.is256BitVector() && (NumElts == 8)) {
>>
>> You can check that VT = v8i32;****
>>
>> *[Demikhovsky, Elena] I changed to (VT==v8i32 || VT==v8f32)*
>>
>>
>>
>> +    ArrayRef<int> Mask = SVOp->getMask();
>> +    if (isUndefOrEqual(Mask[0], 0) &&
>> +        isUndefOrEqual(Mask[1], 8) &&
>> +        isUndefOrEqual(Mask[2], 2) &&
>> +        isUndefOrEqual(Mask[3], 10) &&
>> +        isUndefOrEqual(Mask[4], 4) &&
>> +        isUndefOrEqual(Mask[5], 12) &&
>> +        isUndefOrEqual(Mask[6], 6) &&
>> +        isUndefOrEqual(Mask[7], 14)) {
>>
>> Please create a local array and iterate over it in a loop. Calling a
>> function 16 times bloats the code.****
>>
>> *[Demikhovsky, Elena] Done*
>>
>>
>>
>> +      int CompactionMask[] = {0, 2, -1, -1, 4, 6, -1, -1};
>> +      SDValue Op0 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(0),
>> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
>> +      SDValue Op1 = DAG.getVectorShuffle(VT, dl, SVOp->getOperand(1),
>> +        DAG.getNode(ISD::UNDEF, dl, VT), CompactionMask);
>> +      int UnpackMask[] = {0, 8, 1, 9, 4, 12, 5, 13};
>>
>> Undef can be created once, not 4 times.****
>>
>> *[Demikhovsky, Elena] Done*****
>>
>>
>> Also mark those arrays as static const.****
>>
>> *[Demikhovsky, Elena] Done*
>>
>>
>>  ****
>>
>>
>>
>> +}
>> +
>> +
>>
>> Remove the extra line breaks.****
>>
>> *[Demikhovsky, Elena] Done*****
>>
>>
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:
>> llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Demikhovsky, Elena
>> Sent: Sunday, June 24, 2012 15:30
>> To: llvm-commits at cs.uiuc.edu
>> Subject: [llvm-commits] Please review - One more shuffle optimization for
>> AVX
>>
>> Hi,
>>
>> I have a bunch of optimizations for AVX and AVX2 code that I recently
>> did. Most of them show significant performance speedup on real workloads.
>> I'll send it to review one by one accompanied with appropriate tests.
>>
>> The current patch optimizes frequently used shuffle patterns and gives
>> these instruction sequence reduction.
>> Before:
>>       vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
>>        vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
>>        vextractf128    $1, %ymm1, %xmm1
>>        vextractf128    $1, %ymm0, %xmm0
>>        vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
>>        vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
>>        vinsertf128     $1, %xmm0, %ymm2, %ymm0
>> After:
>>       vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 =
>> ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
>>       vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
>>       vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 =
>> ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
>>
>>  Thank you
>>
>> - Elena
>>
>>
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for the
>> sole use of the intended recipient(s). Any review or distribution by others
>> is strictly prohibited. If you are not the intended recipient, please
>> contact the sender and delete all copies.
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>> ****
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits****
>>
>>
>>
>>
>> --
>> ~Craig****
>>  ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>
>
>
> --
> ~Craig
>



-- 
~Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120625/90bafd4c/attachment.html>


More information about the llvm-commits mailing list