[PATCH] D20598: [X86] Detect SAD patterns and emit psadbw instructions on X86 redux

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Thu May 26 10:43:44 PDT 2016

mkuper added a comment.

Thanks, Wei!

Comment at: lib/Target/X86/X86ISelLowering.cpp:29475
@@ +29474,3 @@
+    // just take the low part of the sad without losing any elements.
+    if (VT.getSizeInBits() < ResVT.getSizeInBits())
+      Sad = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Sad,
wmi wrote:
> wmi wrote:
> > The condition can be represented using NumConcat == 0
> Can be represented as NumConcat == 0?
Right, I thought it was clearer this way. I can change it to NumConcat == 0 if you prefer. To be honest, though, I'd rather get rid of NumConcat in the condition above as well, and make that an explicit comparison too.
What do you think?

Comment at: test/CodeGen/X86/sad.ll:990
@@ +989,3 @@
+; SSE2-NEXT:    psadbw %xmm3, %xmm2
+; SSE2-NEXT:    pshufd {{.*#+}} xmm2 = xmm2[0,1,1,3]
+; SSE2-NEXT:    paddq %xmm2, %xmm0
wmi wrote:
> The pshufd is generated because of the EXTRACT_SUBVECTOR that takes the lower part of the sad result. However, since we know that all bits of the sad result except the lower 16 are 0, the pshufd is useless.
Right, because of the way we legalize v2i32, the extract becomes an anyext instead of a nop.
I'll see if I can get rid of it. Thanks!
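
For reference, here is a rough scalar sketch of why the upper bits are known to be zero. It models a single 64-bit lane of psadbw as the sum of absolute differences of eight unsigned bytes, so the result is at most 8 * 255 = 2040 and fits comfortably in the low 16 bits. The function name psadbwLane is made up for illustration and is not something from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one 64-bit lane of psadbw (illustration only,
// not code from the patch): sum the absolute differences of eight unsigned
// bytes and return the sum zero-extended to 64 bits.
constexpr std::uint64_t psadbwLane(const std::array<std::uint8_t, 8> &A,
                                   const std::array<std::uint8_t, 8> &B) {
  std::uint64_t Sum = 0;
  for (std::size_t I = 0; I < 8; ++I)
    // Each term is in [0, 255], so the total is at most 8 * 255 = 2040.
    Sum += A[I] > B[I] ? A[I] - B[I] : B[I] - A[I];
  return Sum;
}
```

Since 2040 < 65536, every bit above the low 16 in each lane is zero, which is the property that should let us drop the shuffle.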

