[PATCH] D14840: [X86] Detect SAD patterns and emit psadbw instructions on X86.

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 8 03:17:42 PST 2016


RKSimon added a comment.

The checks in the tests are quite weak - I understand that the full update_llc_test_checks.py output might be too much, but please see if you can expand the checks to give a better idea of context.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:28751-28763
@@ -28750,1 +28750,15 @@
 
+static SDValue detectSADPattern(SDNode *N, SelectionDAG &DAG,
+                                const X86Subtarget &Subtarget) {
+  SDLoc DL(N);
+  EVT VT = N->getValueType(0);
+  SDValue Op0 = N->getOperand(0);
+  SDValue Op1 = N->getOperand(1);
+
+  if (!VT.isVector() || !VT.isSimple() ||
+      !(VT.getVectorElementType() == MVT::i32))
+    return SDValue();
+
+  unsigned RegSize = 128;
+  if (Subtarget.hasAVX512())
+    RegSize = 512;
----------------
512-bit PSAD requires AVX512BW (Subtarget.hasBWI()) - plain AVX512 can't handle 512-bit vectors with elements smaller than 32 bits. Please ensure the tests run with avx512f (which will use the AVX2 path) as well as avx512bw.
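
To illustrate the register-size selection suggested above, here is a minimal standalone sketch. It is not the code from the patch: `SubtargetFeatures` and `sadRegisterSizeInBits` are hypothetical names standing in for the real `X86Subtarget` queries, and the 256-bit AVX2 case is an assumption based on the remark that avx512f falls back to the AVX2 path.

```cpp
// Hypothetical stand-in for the feature queries on X86Subtarget.
// The field names are illustrative only; they are not LLVM APIs.
struct SubtargetFeatures {
  bool HasSSE2;
  bool HasAVX2;
  bool HasAVX512F;
  bool HasBWI; // AVX512BW: byte/word operations on 512-bit vectors
};

// Widest register (in bits) on which a byte-wise SAD could be emitted.
// AVX512F alone is deliberately not consulted: without BWI there are no
// 512-bit byte-element operations, so such a target takes the AVX2 path.
constexpr unsigned sadRegisterSizeInBits(const SubtargetFeatures &ST) {
  return ST.HasBWI    ? 512u
         : ST.HasAVX2 ? 256u
         : ST.HasSSE2 ? 128u
                      : 0u; // no suitable PSADBW form available
}
```

With this shape, a target that has avx512f but not avx512bw gets 256 rather than 512, which is the case the extra RUN line is meant to cover.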

================
Comment at: test/CodeGen/X86/sad.ll:2-7
@@ +1,8 @@
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+sse2 -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=SSE2
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx2 -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=AVX2
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=AVX512BW
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+sse2 -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=SSE2-UNROLL
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx2 -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=AVX2-UNROLL
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=AVX512BW-UNROLL
+
+@a = global [1024 x i8] zeroinitializer, align 16
----------------
congh wrote:
> ab wrote:
> > Does it need to be huge? After all, this is only testing the PSAD ISel, and the pattern seems pretty minimal; would it work to test variants of your example in detectSADPattern?
> I need to compose the test so that the reduction vector operations can be detected. But I think I can still write small tests.
> 
> To test on different targets, I have to split the test files into three for testing sse2/avx2/avx512 as I found llc cannot parse opt -mattr=+avx512bw generated IR on AVX2.
This sounds like a bug - the test case has no target specific intrinsics so we should be able to handle it on every target.


http://reviews.llvm.org/D14840