[PATCH] D31679: Support PMADDWD and PMADDUBSW
Wei Mi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 4 15:46:56 PDT 2017
wmi added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34588-34599
+ SDValue Op0 = N->getOperand(0);
+ SDValue Op1 = N->getOperand(1);
+
+ SDValue MulOp, Phi;
+ if (Op0.getOpcode() == ISD::MUL) {
+ MulOp = Op0;
+ Phi = Op1;
----------------
%%% Maybe use std::swap, so that Op0 and Op1 are unnecessary.
MulOp = N->getOperand(0);
Phi = N->getOperand(1);
if (MulOp.getOpcode() != ISD::MUL) {
std::swap(MulOp, Phi);
if (MulOp.getOpcode() != ISD::MUL)
return SDValue();
}%%%
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34605-34607
+ // SSSE3 has 8bit PMADDUBSW support, otherwise use 16bit PMADDWD
+ if (!Subtarget.hasSSSE3() && (Mode == MULS8 || Mode == MULU8))
+ Mode = (Mode == MULS8) ? MULS16 : MULU16;
----------------
I just realized that we may need to prove there is no wrap before using PMADDUBSW. The sum i8*i8 + i8*i8 can overflow i16, in which case PMADDUBSW produces a saturated result, whereas the original i32 operations have no overflow issue.
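
To make the concern concrete, here is a minimal scalar sketch of one 16-bit output lane of PMADDUBSW next to the widened i32 computation. It is only a model, not code from the patch. The helper names pmaddubswLane and widenedMulAdd are hypothetical. It assumes the documented instruction semantics: unsigned bytes from the first operand, signed bytes from the second, and the adjacent products added with signed saturation to 16 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of a single 16-bit lane of PMADDUBSW.
// a0/a1 are unsigned bytes (first operand), b0/b1 are signed bytes
// (second operand). The two products are summed and the sum is
// saturated to the signed 16-bit range.
inline int16_t pmaddubswLane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  int32_t Sum = int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
  const int32_t Lo = INT16_MIN, Hi = INT16_MAX;
  Sum = std::min(std::max(Sum, Lo), Hi); // signed saturation
  return static_cast<int16_t>(Sum);
}

// Hypothetical reference: the same multiply-add done entirely in i32,
// as the original (pre-combine) operations would compute it.
inline int32_t widenedMulAdd(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
}
```

With a0 = a1 = 255 and b0 = b1 = 127, widenedMulAdd returns 64770 while pmaddubswLane returns 32767, so the two disagree unless the operands are known to keep the pairwise sum within the i16 range.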
https://reviews.llvm.org/D31679