[PATCH] D20598: [X86] Detect SAD patterns and emit psadbw instructions on X86 redux

Wei Mi via llvm-commits llvm-commits at lists.llvm.org
Thu May 26 09:52:18 PDT 2016


wmi added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29475
@@ +29474,3 @@
+    // just take the low part of the sad without losing any elements.
+    if (VT.getSizeInBits() < ResVT.getSizeInBits())
+      Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
----------------
The condition can be represented using NumConcat == 0

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29475
@@ +29474,3 @@
+    // just take the low part of the sad without losing any elements.
+    if (VT.getSizeInBits() < ResVT.getSizeInBits())
+      Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
----------------
wmi wrote:
> The condition can be represented using NumConcat == 0
Can this condition be represented as NumConcat == 0?
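
For illustration, here is a minimal sketch of why the two conditions could be interchangeable. It assumes NumConcat is computed as the integer quotient of the two bit widths, which the quoted hunk does not show; the helper names below are hypothetical and not from the patch.

```cpp
// Hypothetical stand-in for how NumConcat might be computed in the patch
// (an assumption; the definition is not part of the quoted diff).
unsigned numConcat(unsigned VTBits, unsigned ResVTBits) {
  return VTBits / ResVTBits; // unsigned division truncates toward zero
}

// The condition as currently written in the patch, on plain bit widths.
bool needsExtract(unsigned VTBits, unsigned ResVTBits) {
  return VTBits < ResVTBits;
}
```

Under that assumption, for positive widths the quotient is 0 exactly when VTBits < ResVTBits, so the two checks agree.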

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29476
@@ +29475,3 @@
+    if (VT.getSizeInBits() < ResVT.getSizeInBits())
+      Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
+                        DAG.getIntPtrConstant(0, DL));
----------------
The EXTRACT_SUBVECTOR generated a useless pshufd for the sad_2i8 test.

================
Comment at: test/CodeGen/X86/sad.ll:990
@@ +989,3 @@
+; SSE2-NEXT:    psadbw %xmm3, %xmm2
+; SSE2-NEXT:    pshufd {{.*#+}} xmm2 = xmm2[0,1,1,3]
+; SSE2-NEXT:    paddq %xmm2, %xmm0
----------------
The pshufd is generated because the EXTRACT_SUBVECTOR takes the lower part of the sad result. However, since we know that all bits of the sad result except the lower 16 are 0, the pshufd is useless.
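
To make the argument concrete, here is a small scalar model of the two instructions. It is only a sketch: psadbwModel, pshufdModel and shuffleIsNoOp are hypothetical names, and the claim that the upper eight bytes of both operands are equal (e.g. zero) in sad_2i8 is an assumption about how the narrow inputs are widened.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

using Bytes16 = std::array<uint8_t, 16>;
using Dwords4 = std::array<uint32_t, 4>;

// Scalar model of 128-bit psadbw: each 64-bit lane receives the sum of
// absolute differences of its 8 byte pairs in its low 16 bits; the rest of
// the lane is zero. The result is viewed here as four 32-bit elements.
Dwords4 psadbwModel(const Bytes16 &A, const Bytes16 &B) {
  Dwords4 R{};
  for (int Lane = 0; Lane < 2; ++Lane) {
    uint32_t Sum = 0;
    for (int I = 0; I < 8; ++I)
      Sum += std::abs(int(A[Lane * 8 + I]) - int(B[Lane * 8 + I]));
    R[Lane * 2] = Sum; // at most 8 * 255 = 2040, so it fits in 16 bits
  }
  return R;
}

// Scalar model of pshufd: destination element i is source element Idx[i].
Dwords4 pshufdModel(const Dwords4 &V, const std::array<int, 4> &Idx) {
  return {V[Idx[0]], V[Idx[1]], V[Idx[2]], V[Idx[3]]};
}

// True when the [0,1,1,3] shuffle from the test leaves the sad unchanged.
bool shuffleIsNoOp(const Bytes16 &A, const Bytes16 &B) {
  Dwords4 Sad = psadbwModel(A, B);
  return pshufdModel(Sad, {0, 1, 1, 3}) == Sad;
}
```

In this model the shuffle only changes element 2, which holds the upper lane's sum. When the upper lanes of the operands match, that sum is 0 and the shuffle is the identity, which is the case the comment describes.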


http://reviews.llvm.org/D20598




