[PATCH] D60037: [PowerPC] Use the two-constant NR algorithm for refining estimates
Nemanja Ivanovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Mar 30 16:40:41 PDT 2019
nemanjai created this revision.
nemanjai added reviewers: hfinkel, renenkel, jsji, stefanp.
Herald added subscribers: jdoerfert, kbarton.
Herald added a project: LLVM.
The single-constant algorithm produces infinities for many denormal inputs. The precision of the two-constant algorithm is sufficient across the range of denormals, so we will switch to that algorithm for now to avoid the infinities. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC.
Example:
$ cat a.c
#include <stdio.h>
#include <math.h>
float __attribute__((noinline)) test(float f) { return sqrtf(f); }
int main(void) {
return printf("sqrt(0.49e-43): %g\n", test(0.49e-43));
}
$ clang -Ofast a.c
$ ./a.out
sqrt(0.49e-43): -inf
Desired output (and output with this patch applied):
$ ./a.out
sqrt(0.49e-43): 2.21462e-22
We have also tested this over a reasonable approximation of the full input range (1,000,000 tests per exponent over the full single-precision range, compared against the precise HW instruction). Here are the results (courtesy of @renenkel):
 0 ulps: 72%
 1 ulp:  27%
 2 ulps: 0.032%
 3 ulps: 0%
>3 ulps: 0.35%
max error = 2 ulps over the full range, except that the result is NaN for +Inf
Repository:
rL LLVM
https://reviews.llvm.org/D60037
Files:
lib/Target/PowerPC/PPC.td
lib/Target/PowerPC/PPCISelLowering.cpp
lib/Target/PowerPC/PPCSubtarget.h
test/CodeGen/PowerPC/fma-mutate.ll
test/CodeGen/PowerPC/fmf-propagation.ll
test/CodeGen/PowerPC/recipest.ll
test/CodeGen/PowerPC/vsx-fma-mutate-trivial-copy.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D60037.192994.patch
Type: text/x-patch
Size: 8827 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190330/a49d0ca9/attachment.bin>