[PATCH] D20598: [X86] Detect SAD patterns and emit psadbw instructions on X86 redux
Wei Mi via llvm-commits
llvm-commits at lists.llvm.org
Thu May 26 09:52:18 PDT 2016
wmi added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29475
@@ +29474,3 @@
+ // just take the low part of the sad without losing any elements.
+ if (VT.getSizeInBits() < ResVT.getSizeInBits())
+ Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
----------------
The condition can be represented using NumConcat == 0
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29475
@@ +29474,3 @@
+ // just take the low part of the sad without losing any elements.
+ if (VT.getSizeInBits() < ResVT.getSizeInBits())
+ Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
----------------
wmi wrote:
> The condition can be represented using NumConcat == 0
Can be represented as NumConcat == 0?
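
A minimal sketch of why the two conditions could be interchangeable, assuming NumConcat is computed as the unsigned integer quotient VT.getSizeInBits() / ResVT.getSizeInBits() (the patch may define it differently). The helpers numConcat and needsExtract are hypothetical and work on plain bit widths rather than EVTs:

```cpp
// Assumed definition: how many ResVT-sized vectors fit into VT.
// Unsigned division truncates, so the quotient is 0 exactly when
// VT is narrower than ResVT.
unsigned numConcat(unsigned VTBits, unsigned ResVTBits) {
  return VTBits / ResVTBits;
}

// The condition as written in the patch, on plain bit widths.
bool needsExtract(unsigned VTBits, unsigned ResVTBits) {
  return VTBits < ResVTBits;
}
```

Under that assumption, a 64-bit VT with a 128-bit ResVT gives NumConcat == 0 and takes the EXTRACT_SUBVECTOR path, while 128-bit and 512-bit VTs give 1 and 4 and do not.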
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29476
@@ +29475,3 @@
+ if (VT.getSizeInBits() < ResVT.getSizeInBits())
+ Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
+ DAG.getIntPtrConstant(0, DL));
----------------
The EXTRACT_SUBVECTOR generated a useless pshufd for the sad_2i8 test.
================
Comment at: test/CodeGen/X86/sad.ll:990
@@ +989,3 @@
+; SSE2-NEXT: psadbw %xmm3, %xmm2
+; SSE2-NEXT: pshufd {{.*#+}} xmm2 = xmm2[0,1,1,3]
+; SSE2-NEXT: paddq %xmm2, %xmm0
----------------
The pshufd is generated because of the EXTRACT_SUBVECTOR that takes the lower part of the sad result. However, since we know that all bits of the sad result except the lower 16 are 0, the pshufd is useless.
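
To illustrate the bound behind this claim, here is a scalar sketch of one 64-bit lane of psadbw (not the code in the patch; sadLane and maxSadLane are hypothetical names). Each lane sums eight absolute byte differences, so the largest possible value is 8 * 255 = 2040, which fits in the low 16 bits:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Scalar model of a single 64-bit psadbw lane: the sum of absolute
// differences of eight unsigned bytes, zero-extended to 64 bits.
std::uint64_t sadLane(const std::array<std::uint8_t, 8> &A,
                      const std::array<std::uint8_t, 8> &B) {
  std::uint64_t Sum = 0;
  for (std::size_t I = 0; I < 8; ++I)
    Sum += static_cast<std::uint64_t>(std::abs(int(A[I]) - int(B[I])));
  return Sum;
}

// Worst case: every byte pair differs by 255.
std::uint64_t maxSadLane() {
  std::array<std::uint8_t, 8> Zeros{};
  std::array<std::uint8_t, 8> Ones;
  Ones.fill(255);
  return sadLane(Zeros, Ones);
}
```

Since even the worst case leaves bits 16 and above of the lane clear, a shuffle that only rearranges those zero bits does not change the value that feeds the paddq.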
http://reviews.llvm.org/D20598