[PATCH] D31679: Support PMADDWD and PMADDUBSW

Tue Apr 4 15:46:56 PDT 2017

wmi added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34588-34599
+  SDValue Op0 = N->getOperand(0);
+  SDValue Op1 = N->getOperand(1);
+
+  SDValue MulOp, Phi;
+  if (Op0.getOpcode() == ISD::MUL) {
+    MulOp = Op0;
+    Phi = Op1;
----------------
Maybe use std::swap, so that Op0 and Op1 are unnecessary:
%%%
MulOp = N->getOperand(0);
Phi = N->getOperand(1);
if (MulOp.getOpcode() != ISD::MUL) {
    std::swap(MulOp, Phi);
    if (MulOp.getOpcode() != ISD::MUL)
        return SDValue();
}%%%

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34605-34607
+  // SSSE3 has 8bit PMADDUBSW support, otherwise use 16bit PMADDWD
+  if (!Subtarget.hasSSSE3() && (Mode == MULS8 || Mode == MULU8))
+    Mode = (Mode == MULS8) ? MULS16 : MULU16;
----------------
I just realized that we may need to prove there is no wrap before using PMADDUBSW. The sum i8*i8 + i8*i8 can overflow i16, in which case PMADDUBSW produces a saturated result. The original i32 operations, however, do not overflow.
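
To make the concern concrete, here is a minimal scalar sketch of one 16-bit output lane. It is not code from the patch: the helper names (exactPairSum, saturateToI16, pmaddubswLane) are hypothetical, and it only assumes the documented PMADDUBSW behavior (unsigned bytes from the first operand times signed bytes from the second, adjacent products added with signed saturation to i16).

```cpp
#include <cstdint>

// What the original IR computes: both products and their sum are done in
// i32, so nothing can overflow.
constexpr int32_t exactPairSum(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
}

// Signed saturation of an i32 value to the i16 range.
constexpr int16_t saturateToI16(int32_t v) {
  return v > INT16_MAX ? INT16_MAX : (v < INT16_MIN ? INT16_MIN : int16_t(v));
}

// Hypothetical scalar model of a single PMADDUBSW output lane:
// (unsigned a0 * signed b0) + (unsigned a1 * signed b1), saturated to i16.
constexpr int16_t pmaddubswLane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return saturateToI16(exactPairSum(a0, b0, a1, b1));
}
```

With a0 = a1 = 255 and b0 = b1 = 127, the exact i32 sum is 64770, while the modeled lane clamps to 32767. So the two only agree when the pair sum is known to fit in i16.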

https://reviews.llvm.org/D31679