[PATCH] D22889: [X86] Match PSADBW in straight-line code

Fri Jul 29 10:09:48 PDT 2016

mkuper added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26450
@@ +26449,3 @@
+  if (!VT.isSimple() || !(VT.getVectorElementType() == MVT::i32))
+    return SDValue();
+
----------------
RKSimon wrote:
> Doesn't PSADBW generate unsigned i16 results? IIRC the horizontal sum of absdiff v16i8 / v32i8 / v64i8 should fit into a single i16 correct? Really there's nothing stopping us supporting any result integer type >= 16-bits  no?
You're right, it's another constraint the original loop code had, and removing it seems trivial. I'll add a TODO and fix this in a separate patch.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26481
@@ +26480,3 @@
+  // <1,u,u,u,u,u,u,u>
+  for (unsigned i = 0; i < Stages; ++i) {
+    if (Op.getOpcode() != ISD::ADD)
----------------
RKSimon wrote:
> This looks like it could be useful for general matching of reduction/horizontal ops - possibly pull it out? I have in mind detecting any_of/all_of tests for vector comparison results that I'd like to use MOVMSK for instead  - in that case it'd be the same code but we'd match against OR / AND instead of ADD.
Sure, I'll pull it out.

TBH, I wouldn't be surprised if we already had something similar, but I didn't see it. There's isHorizontalBinOp(), but it's different - it matches the x86 "horizontal" operations (like PHADDD).

https://reviews.llvm.org/D22889