[PATCH] D46498: [X86] Enable reciprocal estimates for v16f32 vectors by using VRCP14PS/VRSQRT14PS
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat May 5 16:16:34 PDT 2018
craig.topper created this revision.
craig.topper added a reviewer: spatel.
Herald added a subscriber: mehdi_amini.
The legacy VRCPPS/VRSQRTPS instructions have no 512-bit versions, but the newer, higher-precision VRCP14PS/VRSQRT14PS instructions do. So we can use those to implement v16f32 reciprocal estimates.
For KNL CPUs we can probably use VRCP28PS/VRSQRT28PS and avoid the Newton-Raphson (NR) refinement step altogether, but I'll leave that for a future patch.
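
For readers unfamiliar with the estimate-plus-NR-step pattern, here is a minimal scalar sketch of the idea. It is not the code from the patch and does not use the actual instructions: the helper names are hypothetical, and the hardware estimate is only simulated by truncating an exact result to 14 mantissa bits, assuming a relative error below 2^-14 for the 14-bit instructions. One NR step roughly squares the relative error, which is why a 14-bit estimate needs one step to approach float precision.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep only the top 14 of the 23 mantissa bits, to mimic
// an estimate whose relative error is below 2^-14 (an assumption used here to
// stand in for a hardware estimate instruction).
static float truncateTo14Bits(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  bits &= 0xFFFFFE00u; // clear the low 9 mantissa bits
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}

// Simulated low-precision estimates of 1/a and 1/sqrt(a).
float approxRecip(float a) { return truncateTo14Bits(1.0f / a); }

float approxRsqrt(float a) {
  assert(a > 0.0f && "this sketch only handles positive inputs");
  return truncateTo14Bits(1.0f / std::sqrt(a));
}

// One Newton-Raphson step for 1/a: x1 = x0 * (2 - a * x0).
float refineRecip(float a, float x0) { return x0 * (2.0f - a * x0); }

// One Newton-Raphson step for 1/sqrt(a): x1 = x0 * (1.5 - 0.5 * a * x0 * x0).
float refineRsqrt(float a, float x0) {
  return x0 * (1.5f - 0.5f * a * x0 * x0);
}
```

Calling refineRecip(a, approxRecip(a)) or refineRsqrt(a, approxRsqrt(a)) gives the refined value. A higher-precision starting estimate, such as the 28-bit one mentioned above, would already be at roughly float precision, which is why the NR step could be dropped in that case.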
https://reviews.llvm.org/D46498
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/recip-fastmath.ll
test/CodeGen/X86/recip-fastmath2.ll
test/CodeGen/X86/sqrt-fastmath.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D46498.145388.patch
Type: text/x-patch
Size: 14745 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180505/884769f1/attachment.bin>