[PATCH] Optimize sext 4xi8 to 4xi64

Muhammad Tauqir Ahmad muhammad.t.ahmad at intel.com
Mon Mar 4 12:10:23 PST 2013


Optimize sext 4xi8 to 4xi64.

This produces better code for sext v4i8 -> v4i64 by generating vpmovsxbd instead of a shift-left + arithmetic-shift-right pair.
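
For readers unfamiliar with the shift pair being replaced, here is a minimal scalar sketch of what it computes in a single 32-bit lane. It is not code from the patch. The function names and the fixed shift amount of 24 (32 - 8) are assumptions chosen for illustration. The sketch also assumes two's-complement conversions and an arithmetic right shift on signed integers, which C++20 guarantees and mainstream compilers provide in earlier standards.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low 8 bits of a 32-bit lane
// using the shift-left + arithmetic-shift-right idiom.
inline int32_t sext_inreg_shift(uint32_t lane) {
    const int sh = 24;                           // 32 - 8 bits
    uint32_t shifted = lane << sh;               // move bit 7 into the sign position
    return static_cast<int32_t>(shifted) >> sh;  // arithmetic shift copies the sign bit down
}

// Hypothetical helper: the same result computed as a direct sign extension,
// which is what a single sign-extending move does per lane.
inline int32_t sext_direct(uint32_t lane) {
    return static_cast<int32_t>(static_cast<int8_t>(lane & 0xFFu));
}
```

Both helpers agree for every 8-bit input when the upper bits of the lane are zero (the zero-extended case the patch looks for), which is why a single sign-extending instruction can stand in for the pair.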

Test included.

http://llvm-reviews.chandlerc.com/D491

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/avx-sext.ll

Index: lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- lib/Target/X86/X86ISelLowering.cpp
+++ lib/Target/X86/X86ISelLowering.cpp
@@ -11812,8 +11812,23 @@
       // fall through
     case MVT::v4i32:
     case MVT::v8i16: {
-      SDValue Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT,
-                                         Op.getOperand(0), ShAmt, DAG);
+      // (sext (vzext x)) -> (vsext x)
+      SDValue Op0 = Op.getOperand(0);
+      SDValue Op00 = Op0.getOperand(0);
+      SDValue Tmp1;
+      // Hopefully, this VECTOR_SHUFFLE is just a VZEXT.
+      if (Op0.getOpcode() == ISD::BITCAST &&
+          Op00.getOpcode() == ISD::VECTOR_SHUFFLE)
+        Tmp1 = LowerVectorIntExtend(Op00, DAG);
+      if (Tmp1.getNode()) {
+        SDValue Tmp1Op0 = Tmp1.getOperand(0);
+        assert(Tmp1Op0.getOpcode() == X86ISD::VZEXT &&
+               "This optimization is invalid without a VZEXT.");
+        return DAG.getNode(X86ISD::VSEXT, dl, VT, Tmp1Op0.getOperand(0));
+      }
+
+      // If the above didn't work, then just use Shift-Left + Shift-Right.
+      Tmp1 = getTargetVShiftNode(X86ISD::VSHLI, dl, VT, Op0, ShAmt, DAG);
       return getTargetVShiftNode(X86ISD::VSRAI, dl, VT, Tmp1, ShAmt, DAG);
     }
   }
Index: test/CodeGen/X86/avx-sext.ll
===================================================================
--- test/CodeGen/X86/avx-sext.ll
+++ test/CodeGen/X86/avx-sext.ll
@@ -165,3 +165,13 @@
   ret <4 x i64> %extmask
 }
 
+; AVX: sext_4i8_to_4i64
+; AVX: vpmovsxbd
+; AVX: vpmovsxdq
+; AVX: vpmovsxdq
+; AVX: ret
+define <4 x i64> @load_sext_4i8_to_4i64(<4 x i8> *%ptr) {
+ %X = load <4 x i8>* %ptr
+ %Y = sext <4 x i8> %X to <4 x i64>
+ ret <4 x i64>%Y
+}