[PATCH] D8690: [X86][XOP] Added support for the lowering of 128-bit vector shifts to XOP shift instructions

Tue Sep 29 03:19:10 PDT 2015

andreadb accepted this revision.
andreadb added a comment.
This revision is now accepted and ready to land.

Hi Simon,

Nice patch! LGTM.


================
Comment at: lib/Target/X86/X86TargetTransformInfo.cpp:227-234
@@ -168,10 +226,10 @@
     // Vectorizing division is a bad idea. See the SSE2 table for more comments.
-    { ISD::SDIV,  MVT::v32i8,  32*20 },
-    { ISD::SDIV,  MVT::v16i16, 16*20 },
-    { ISD::SDIV,  MVT::v8i32,  8*20 },
-    { ISD::SDIV,  MVT::v4i64,  4*20 },
-    { ISD::UDIV,  MVT::v32i8,  32*20 },
-    { ISD::UDIV,  MVT::v16i16, 16*20 },
-    { ISD::UDIV,  MVT::v8i32,  8*20 },
-    { ISD::UDIV,  MVT::v4i64,  4*20 },
+    { ISD::SDIV,  MVT::v32i8,  32 * 20 },
+    { ISD::SDIV,  MVT::v16i16, 16 * 20 },
+    { ISD::SDIV,  MVT::v8i32,  8 * 20 },
+    { ISD::SDIV,  MVT::v4i64,  4 * 20 },
+    { ISD::UDIV,  MVT::v32i8,  32 * 20 },
+    { ISD::UDIV,  MVT::v16i16, 16 * 20 },
+    { ISD::UDIV,  MVT::v8i32,  8 * 20 },
+    { ISD::UDIV,  MVT::v4i64,  4 * 20 },
   };
----------------
This change is not needed by your patch. I suggest that you just leave this code as it is (there are at least other three places in this same file where we do the same thing). 

================
Comment at: test/Analysis/CostModel/X86/vshift-cost.ll:5-9
@@ -4,5 +4,7 @@
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -mcpu=core-avx2 -cost-model -analyze | FileCheck %s -check-prefix=CHECK -check-prefix=AVX2
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -mcpu=bdver2 -cost-model -analyze | FileCheck %s -check-prefix=CHECK -check-prefix=XOP -check-prefix=XOPAVX
+; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -mcpu=bdver4 -cost-model -analyze | FileCheck %s -check-prefix=CHECK -check-prefix=XOP -check-prefix=XOPAVX2
 
 
 ; Verify the cost of vector shift left instructions.
 
----------------
As a TODO, we should add extra test coverage in the cost model for packed arithmetic/logical shifts with variable shift count.  In this file I only see tests for packed logical shift left with variable count.

I think we should extend this file and add extra tests for packed logical/arithmetic shifts and how it changes for AVX2 and XOP. This can be done in a separate patch.


Repository:
  rL LLVM

http://reviews.llvm.org/D8690