[PATCH] D31679: Support PMADDWD and PMADDUBSW

Tue Apr 4 17:00:44 PDT 2017

danielcdh added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34605-34607
+  // SSSE3 has 8bit PMADDUBSW support, otherwise use 16bit PMADDWD
+  if (!Subtarget.hasSSSE3() && (Mode == MULS8 || Mode == MULU8))
+    Mode = (Mode == MULS8) ? MULS16 : MULU16;
----------------
wmi wrote:
> I just realized that we may need to prove there is no wrap before using PMADDUBSW. The sum i8*i8+i8*i8 can overflow i16, in which case PMADDUBSW produces a saturated result, whereas the original i32 operations do not overflow.
Good catch!

It looks like we should not have the compiler generate PMADDUBSW directly. Patch updated.
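
To make the overflow concern concrete, here is a minimal scalar sketch of a single 16-bit output lane of PMADDUBSW (unsigned bytes from one operand times signed bytes from the other, adjacent products added with signed saturation). The helper names exactPairSum, saturateToI16 and pmaddubswLane are hypothetical and are not part of the patch; this models only the arithmetic, not the lowering code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the pair sum as the original i32 IR would compute it.
// |a*b| <= 255*128, so two products always fit in i32 without wrapping.
constexpr int32_t exactPairSum(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
}

// Hypothetical helper: clamp a 32-bit value to the signed 16-bit range.
constexpr int16_t saturateToI16(int32_t v) {
  return v > INT16_MAX ? int16_t(INT16_MAX)
       : v < INT16_MIN ? int16_t(INT16_MIN)
                       : int16_t(v);
}

// Scalar model of one PMADDUBSW output lane: the same pair sum,
// but saturated to i16 instead of being kept in a wider type.
constexpr int16_t pmaddubswLane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return saturateToI16(exactPairSum(a0, b0, a1, b1));
}

// 255*127 + 255*127 = 64770 does not fit in i16, so the lane saturates.
static_assert(exactPairSum(255, 127, 255, 127) == 64770, "exact i32 sum");
static_assert(pmaddubswLane(255, 127, 255, 127) == 32767, "saturated lane");
```

For inputs such as (255, 127, 255, 127) the two results differ, so the substitution is only valid when the pair sum is known to stay within the i16 range.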

https://reviews.llvm.org/D31679