[PATCH] D41484: [X86][SSE] Use PMADDWD for v4i32 multiplies with 17 or more leading zeros

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 21 04:26:44 PST 2017


RKSimon created this revision.
RKSimon added reviewers: craig.topper, pcordes, zvi, spatel.

If each v4i32 element has 17 or more leading zero bits, then we can use PMADDWD for the integer multiply when PMULLD is unavailable or slow.

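The upper 17 bits need to be zero because PMADDWD performs a v8i16 signed multiply-extend followed by a pairwise add. The upper 16 bits must be zero so that the high i16 of each pair contributes a zero product to the add, and the 17th bit (the sign bit of the low i16) must be zero so that the low half is not incorrectly sign-extended.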
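To illustrate the reasoning above, here is a minimal scalar sketch of a single 32-bit lane. It is not code from the patch: pmaddwdLane, hasSeventeenLeadingZeros and checkSamples are hypothetical names, and the sketch assumes that both multiply operands satisfy the leading-zeros condition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one 32-bit result lane of PMADDWD: split each
// 32-bit input into two signed 16-bit halves, multiply matching halves as
// signed values, then add the two products (wrapping to 32 bits).
uint32_t pmaddwdLane(uint32_t a, uint32_t b) {
  int16_t aLo = static_cast<int16_t>(a & 0xFFFF);
  int16_t aHi = static_cast<int16_t>(a >> 16);
  int16_t bLo = static_cast<int16_t>(b & 0xFFFF);
  int16_t bHi = static_cast<int16_t>(b >> 16);
  // Use 64-bit arithmetic so the sum of the two products cannot overflow.
  int64_t sum = static_cast<int64_t>(aLo) * bLo +
                static_cast<int64_t>(aHi) * bHi;
  return static_cast<uint32_t>(sum);
}

// True if the top 17 bits of x are zero, i.e. x fits in 15 bits.
bool hasSeventeenLeadingZeros(uint32_t x) { return (x >> 15) == 0; }

// Sample checks: the lane result matches an ordinary 32-bit multiply when
// both inputs have 17 leading zeros, and can differ otherwise.
void checkSamples() {
  // Largest inputs that still satisfy the condition.
  assert(hasSeventeenLeadingZeros(0x7FFFu));
  assert(pmaddwdLane(0x7FFFu, 0x7FFFu) == 0x7FFFu * 0x7FFFu);

  // Bit 15 set (only 16 leading zeros): the low half is read as negative.
  assert(!hasSeventeenLeadingZeros(0x8000u));
  assert(pmaddwdLane(0x8000u, 1u) != 0x8000u * 1u);

  // Non-zero upper halves: the pairwise add mixes in an unwanted product.
  assert(pmaddwdLane(0x10001u, 0x10001u) != 0x10001u * 0x10001u);
}
```

With both inputs below 0x8000, the high halves are zero and the low halves are non-negative, so the lane result is just the product of the low halves.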

If people want, I can try to incorporate this into the ShrinkMode enum returned by canReduceVMulWidth.


Repository:
  rL LLVM

https://reviews.llvm.org/D41484

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/shrink_vmul.ll
  test/CodeGen/X86/slow-pmulld.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D41484.127859.patch
Type: text/x-patch
Size: 7641 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171221/d160b36a/attachment.bin>

