[PATCH] D42258: [X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more leading zeros
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 18 11:44:25 PST 2018
RKSimon created this revision.
RKSimon added reviewers: craig.topper, pcordes, zvi, andreadb, spatel.
As discussed in https://reviews.llvm.org/D41484, PMADDWD for 'zero extended' vXi32 is nearly always a better option than PMULLD:
On SNB the resulting code is no faster, but no slower either, so we may as well keep it.
On KNL it has only half the throughput, so I've disabled it there; ideally there'd be a better way to do this.
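
For reference, here is a minimal scalar sketch of why the two instructions agree once each i32 lane has 17 or more leading zeros. It is not code from the patch: the helper names (sext16, pmulld_lane, pmaddwd_lane, has_17_leading_zeros) are hypothetical, and the functions only model a single 32-bit lane of each instruction. With 17 leading zeros the upper 16-bit half of the lane is zero and the lower half is a non-negative signed 16-bit value, so the PMADDWD sum collapses to lo(a) * lo(b), which is the full product.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low 16 bits of v to 32 bits.
constexpr int32_t sext16(uint32_t v) {
  return (v & 0x8000u) ? static_cast<int32_t>(v & 0xFFFFu) - 0x10000
                       : static_cast<int32_t>(v & 0xFFFFu);
}

// Scalar model of one i32 lane of PMULLD: low 32 bits of a * b.
constexpr uint32_t pmulld_lane(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// Scalar model of one i32 lane of PMADDWD: split each lane into two
// signed 16-bit halves, multiply matching halves, add the two products.
constexpr uint32_t pmaddwd_lane(uint32_t a, uint32_t b) {
  int64_t lo = static_cast<int64_t>(sext16(a)) * sext16(b);
  int64_t hi = static_cast<int64_t>(sext16(a >> 16)) * sext16(b >> 16);
  return static_cast<uint32_t>(lo + hi);  // wraps to 32 bits
}

// True when bits 15..31 are zero, i.e. at least 17 leading zeros.
constexpr bool has_17_leading_zeros(uint32_t v) { return (v >> 15) == 0; }

// Both operands satisfy the precondition: the results match.
static_assert(has_17_leading_zeros(12345u) && has_17_leading_zeros(321u), "");
static_assert(pmaddwd_lane(12345u, 321u) == pmulld_lane(12345u, 321u), "");

// With only 16 leading zeros, bit 15 is read as a sign bit and they differ.
static_assert(!has_17_leading_zeros(0x8000u), "");
static_assert(pmaddwd_lane(0x8000u, 2u) != pmulld_lane(0x8000u, 2u), "");
```

The last pair of checks shows why 16 leading zeros would not be enough: PMADDWD treats 0x8000 as -32768, so the lane result is no longer the unsigned product.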
Repository:
rL LLVM
https://reviews.llvm.org/D42258
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/promote.ll
test/CodeGen/X86/shrink_vmul.ll
test/CodeGen/X86/slow-pmulld.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D42258.130468.patch
Type: text/x-patch
Size: 42745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180118/173ef298/attachment-0001.bin>