[PATCH] D31679: Support PMADDWD and PMADDUBSW
Dehao Chen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 4 17:00:44 PDT 2017
danielcdh added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:34605-34607
+ // SSSE3 has 8bit PMADDUBSW support, otherwise use 16bit PMADDWD
+ if (!Subtarget.hasSSSE3() && (Mode == MULS8 || Mode == MULU8))
+ Mode = (Mode == MULS8) ? MULS16 : MULU16;
----------------
wmi wrote:
> I just realized that we may need to prove there is no wrap before using PMADDUBSW. It is possible for i8*i8+i8*i8 to overflow i16, in which case PMADDUBSW produces a saturated result. The original i32 operations, however, do not have this overflow issue.
Good catch!
Looks like we should not have the compiler generate PMADDUBSW directly. Patch updated.
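To make the overflow concern concrete, here is a minimal scalar sketch of a single 16-bit output lane of PMADDUBSW (unsigned byte times signed byte, adjacent pairs summed with signed saturation) next to the same sum done in i32. The function names pmaddubsw_lane and exact_i32 are hypothetical helpers for illustration only; they are not part of the patch or of LLVM.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar model of one i16 output lane of PMADDUBSW.
// a0/a1 come from the unsigned-byte operand, b0/b1 from the signed-byte operand.
int16_t pmaddubsw_lane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  // Each product fits in i16, but the sum of two products may not.
  int32_t sum = int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
  // The instruction saturates the sum to the signed 16-bit range.
  sum = std::max<int32_t>(INT16_MIN, std::min<int32_t>(INT16_MAX, sum));
  return static_cast<int16_t>(sum);
}

// The same multiply-add as the original i32 code computes it: no saturation.
int32_t exact_i32(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * int32_t(b0) + int32_t(a1) * int32_t(b1);
}
```

With a0 = a1 = 255 and b0 = b1 = 127, the exact sum is 64770, while the modeled lane saturates to 32767, so the two results diverge unless the inputs are known to stay in range.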
https://reviews.llvm.org/D31679