[PATCH] D14840: [X86] Detect SAD patterns and emit psadbw instructions on X86.
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 8 03:17:42 PST 2016
RKSimon added a comment.
The checks on the tests are quite poor - I understand that update_llc_test_checks.py output might be too much but please see if you can expand the checks to give a better idea of context.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:28751-28763
@@ -28750,1 +28750,15 @@
+static SDValue detectSADPattern(SDNode *N, SelectionDAG &DAG,
+ const X86Subtarget &Subtarget) {
+ SDLoc DL(N);
+ EVT VT = N->getValueType(0);
+ SDValue Op0 = N->getOperand(0);
+ SDValue Op1 = N->getOperand(1);
+
+ if (!VT.isVector() || !VT.isSimple() ||
+ !(VT.getVectorElementType() == MVT::i32))
+ return SDValue();
+
+ unsigned RegSize = 128;
+ if (Subtarget.hasAVX512())
+ RegSize = 512;
----------------
512-bit PSAD requires AVX512BW (Subtarget.hasBWI()) - plain AVX512 can't handle 512-bit vector elements smaller than 32-bits - please ensure the tests run with avx512f (which will use the AVX2 path) as well as avx512bw.
================
Comment at: test/CodeGen/X86/sad.ll:2-7
@@ +1,8 @@
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+sse2 -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=SSE2
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx2 -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=AVX2
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw -force-target-max-vector-interleave=1 -unroll-count=1 | llc | FileCheck %s --check-prefix=AVX512BW
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+sse2 -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=SSE2-UNROLL
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx2 -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=AVX2-UNROLL
+; RUN: opt < %s -O2 -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -mattr=+avx512bw -force-target-max-vector-interleave=1 | llc | FileCheck %s --check-prefix=AVX512BW-UNROLL
+
+ at a = global [1024 x i8] zeroinitializer, align 16
----------------
congh wrote:
> ab wrote:
> > Does it need to be huge? After all, this is only testing the PSAD ISel, and the pattern seems pretty minimal; would it work to test variants of your example in detectSADPattern?
> I need to compose the test so that the reduction vector operations can be detected. But I think I can still write small tests.
>
> To test on different targets, I have to split the test files into threes for testing sse2/avx2/avx512 as I found llc cannot parse opt -mattr=+avx512bw generated IR on AVX2.
This sounds like a bug - the test case has no target specific intrinsics so we should be able to handle it on every target.
http://reviews.llvm.org/D14840
More information about the llvm-commits
mailing list