[PATCH] D46498: [X86] Enable reciprocal estimates for v16f32 vectors by using VRCP14PS/VRSQRT14PS

Sat May 5 16:16:34 PDT 2018

craig.topper created this revision.
craig.topper added a reviewer: spatel.
Herald added a subscriber: mehdi_amini.

The legacy VRCPPS/VRSQRTPS instructions have no 512-bit forms, but the newer, higher-precision VRCP14PS/VRSQRT14PS instructions do. So we can use those to implement v16f32 reciprocal and reciprocal square root estimates.

For KNL CPUs we can probably use VRCP28PS/VRSQRT28PS and avoid the Newton-Raphson (NR) refinement step altogether, but I'll leave that for a future patch.
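
As a rough illustration of why the NR step matters, here is a minimal scalar sketch in C++. It is not the code in this patch: approxRecip, refineRecip, refineRsqrt and relError are hypothetical helper names, and approxRecip only models an estimate with a chosen relative error rather than executing VRCP14PS. One NR step roughly squares the relative error, so an estimate good to about 2^-14 becomes good to about 2^-28, which is beyond the 24-bit significand of a float. That is also why an estimate that already starts near 2^-28 would likely not need the extra step.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal estimate: returns 1/a
// scaled by (1 + relErr). This only models the error; it is not VRCP14PS.
inline double approxRecip(double a, double relErr) {
  assert(a != 0.0 && "reciprocal of zero is not modeled here");
  return (1.0 / a) * (1.0 + relErr);
}

// One Newton-Raphson step for 1/a: x1 = x0 * (2 - a * x0).
// If x0 has relative error e, x1 has relative error of about e^2.
inline double refineRecip(double a, double x0) {
  return x0 * (2.0 - a * x0);
}

// One Newton-Raphson step for 1/sqrt(a): y1 = y0 * (1.5 - 0.5 * a * y0 * y0).
// If y0 has relative error e, y1 has relative error of about 1.5 * e^2.
inline double refineRsqrt(double a, double y0) {
  return y0 * (1.5 - 0.5 * a * y0 * y0);
}

// Relative error of x as an approximation of 1/a.
inline double relError(double a, double x) {
  return std::fabs(x * a - 1.0);
}
```

For example, with e = std::ldexp(1.0, -14), relError(3.0, approxRecip(3.0, e)) is about 2^-14, while relError(3.0, refineRecip(3.0, approxRecip(3.0, e))) drops to about 2^-28.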

https://reviews.llvm.org/D46498

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/recip-fastmath.ll
  test/CodeGen/X86/recip-fastmath2.ll
  test/CodeGen/X86/sqrt-fastmath.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D46498.145388.patch
Type: text/x-patch
Size: 14745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180505/884769f1/attachment.bin>