[PATCH] D31679: Support PMADDWD and PMADDUBSW

Dehao Chen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 4 17:00:44 PDT 2017


danielcdh added inline comments.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34605-34607
+  // SSSE3 has 8bit PMADDUBSW support, otherwise use 16bit PMADDWD
+  if (!Subtarget.hasSSSE3() && (Mode == MULS8 || Mode == MULU8))
+    Mode = (Mode == MULS8) ? MULS16 : MULU16;
----------------
wmi wrote:
> I just realized we may need to prove there is no wrap before using PMADDUBSW. It is possible that i8*i8+i8*i8 overflows i16, in which case PMADDUBSW generates a saturated result. The original i32 operations, however, have no overflow issue.
Good catch!

Looks like we should not have the compiler generate PMADDUBSW directly. Patch updated.
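
To make the saturation concern concrete, here is a minimal scalar sketch of a single 16-bit lane. It is not code from the patch: the helper names pmaddubsw_lane and widened_i32 are hypothetical, and the model assumes the documented PMADDUBSW semantics (unsigned bytes from the first operand times signed bytes from the second, with adjacent products added using signed 16-bit saturation).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar model of one 16-bit output lane of PMADDUBSW:
// a0/a1 are treated as unsigned bytes, b0/b1 as signed bytes, and the
// sum of the two products is saturated to the signed 16-bit range.
constexpr int16_t pmaddubsw_lane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  const int32_t sum = int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
  const int32_t clamped =
      std::max<int32_t>(INT16_MIN, std::min<int32_t>(sum, INT16_MAX));
  return static_cast<int16_t>(clamped);
}

// The same multiply-add carried out in i32, as in the original IR.
// The result always fits, so nothing is saturated.
constexpr int32_t widened_i32(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
}

// 255*127 + 255*127 = 64770 does not fit in i16, so the two forms disagree.
static_assert(widened_i32(255, 127, 255, 127) == 64770, "exact i32 result");
static_assert(pmaddubsw_lane(255, 127, 255, 127) == 32767, "saturated high");

// 255*(-128) + 255*(-128) = -65280 saturates at the low end.
static_assert(pmaddubsw_lane(255, -128, 255, -128) == -32768, "saturated low");

// Small inputs agree, which is why a no-wrap proof would be required.
static_assert(pmaddubsw_lane(10, 3, 20, 4) == widened_i32(10, 3, 20, 4),
              "no saturation for small inputs");
```

The two forms agree only when the pairwise sum is known to stay within [-32768, 32767], so a transform to PMADDUBSW would need that range proven first.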


https://reviews.llvm.org/D31679




