[PATCH] D42258: [X86][SSE] Aggressively use PMADDWD for v4i32 multiplies with 17 or more leading zeros

Thu Jan 18 11:44:25 PST 2018

RKSimon created this revision.
RKSimon added reviewers: craig.topper, pcordes, zvi, andreadb, spatel.

As discussed in https://reviews.llvm.org/D41484, PMADDWD on 'zero extended' vXi32 operands is nearly always a better option than PMULLD:
On SNB the resulting code is no faster, but no slower either, so we may as well keep it.
On KNL PMADDWD only has half the throughput, so I've disabled it there - ideally there'd be a better way than this.
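
To illustrate why the "17 or more leading zeros" condition in the title makes the substitution safe, here is a minimal scalar sketch of one 32-bit lane. It is not the code from the patch: pmaddwdLane, pmulldLane, hasAtLeast17LeadingZeros and selfCheck are hypothetical helper names that only model the instruction semantics. PMADDWD multiplies matching signed 16-bit halves and adds the two products, so when both operands fit in 15 bits the high halves contribute zero and the low halves are non-negative, leaving the same result as a 32-bit multiply.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one 32-bit lane of PMADDWD: split each operand
// into two signed 16-bit halves, multiply matching halves, add the products.
uint32_t pmaddwdLane(uint32_t a, uint32_t b) {
  int64_t aLo = static_cast<int16_t>(static_cast<uint16_t>(a));
  int64_t aHi = static_cast<int16_t>(static_cast<uint16_t>(a >> 16));
  int64_t bLo = static_cast<int16_t>(static_cast<uint16_t>(b));
  int64_t bHi = static_cast<int16_t>(static_cast<uint16_t>(b >> 16));
  // 64-bit intermediates avoid signed overflow; the result wraps to 32 bits.
  return static_cast<uint32_t>(aLo * bLo + aHi * bHi);
}

// Hypothetical scalar model of one lane of PMULLD: low 32 bits of the product.
uint32_t pmulldLane(uint32_t a, uint32_t b) {
  return a * b;
}

// True when the top 17 bits are zero, i.e. the value fits in 15 bits.
bool hasAtLeast17LeadingZeros(uint32_t x) {
  return (x >> 15) == 0;
}

// Spot-checks the equivalence on a few operands that satisfy the condition.
void selfCheck() {
  const uint32_t samples[] = {0u, 1u, 3u, 255u, 12345u, 0x7FFFu};
  for (uint32_t a : samples) {
    for (uint32_t b : samples) {
      assert(hasAtLeast17LeadingZeros(a) && hasAtLeast17LeadingZeros(b));
      assert(pmaddwdLane(a, b) == pmulldLane(a, b));
    }
  }
}
```

With only 16 leading zeros the low half can have bit 15 set, which PMADDWD reads as a negative value, so the two models diverge (for example on 0x8000 * 2).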

Repository:
  rL LLVM

https://reviews.llvm.org/D42258

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/promote.ll
  test/CodeGen/X86/shrink_vmul.ll
  test/CodeGen/X86/slow-pmulld.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D42258.130468.patch
Type: text/x-patch
Size: 42745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180118/173ef298/attachment-0001.bin>