[llvm-commits] Review Request: VPERM optimization for AVX2

Demikhovsky, Elena elena.demikhovsky at intel.com
Wed Apr 11 12:12:21 PDT 2012


Craig, thank you for the review.
CL means cross-lane.

- Elena
From: Craig Topper [mailto:craig.topper at gmail.com]
Sent: Wednesday, April 11, 2012 08:52
To: Demikhovsky, Elena
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] Review Request: VPERM optimization for AVX2

Functionality wise, this looks fine. Some style comments below.


This comment is stale

+  // Handle 128 and 256-bit vector lengths. AVX defines PSHUF/SHUFP to operate
+  // independently on 128-bit lanes.
+  unsigned NumElts = VT.getVectorNumElements();


80 columns

+    SDValue res = DAG.getNode(VT.isInteger()? X86ISD::VPERMD : X86ISD::VPERMPS, dl, VT,
+      DAG.getNode(ISD::BUILD_VECTOR, dl, MVT::v8i32, &permclMask[0], 8), V1);
+    return res;

+  }
+  if (V2IsUndef && HasAVX2 && (VT == MVT::v4i64 || VT == MVT::v4f64)) {
+    return getTargetShuffleNode(VT.isInteger()? X86ISD::VPERMQ : X86ISD::VPERMPD, dl, VT, V1,
+                                getShuffleCLImmediate(SVOp), DAG);
+  }


Dangling space after the else. What is the 0x80 value? The instruction only uses the lower 3-bits of each value. Probably cleaner codewise to make the value part conditional and not repeat the the push_back and getConstant calls twice.

+      if (M[i] < 0)
+        permclMask.push_back(DAG.getConstant(0x80, MVT::i32));
+      else
+        permclMask.push_back(DAG.getConstant(M[i], MVT::i32));
+    }


What does the "CL" here stand for?

+/// getShuffleCLImmediate - Return the appropriate immediate to shuffle
+/// the specified VECTOR_SHUFFLE mask with VPERMQ and VPERMPD instructions.
+/// Handles 256-bit.
+static unsigned getShuffleCLImmediate(ShuffleVectorSDNode *N) {


Either align the SDTShuff2Opl part across all 5 rows or remove the extra spaces from VPERMD/VPERMPS/VPERMQ

 def X86VPermilp  : SDNode<"X86ISD::VPERMILP", SDTShuff2OpI>;
+def X86VPermd    : SDNode<"X86ISD::VPERMD",     SDTShuff2Op>;
+def X86VPermps   : SDNode<"X86ISD::VPERMPS",    SDTShuff2Op>;
+def X86VPermq    : SDNode<"X86ISD::VPERMQ",  SDTShuff2OpI>;
+def X86VPermpd   : SDNode<"X86ISD::VPERMPD", SDTShuff2OpI>;

~Craig
On Tue, Apr 10, 2012 at 4:22 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com<mailto:elena.demikhovsky at intel.com>> wrote:
I added VPERMQ/VPERMD/VPERMPD/VPERMPS patterns. Please review.


- Elena


---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu<mailto:llvm-commits at cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits



--
~Craig
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120411/fe857993/attachment.html>


More information about the llvm-commits mailing list