[PATCH] D60037: [PowerPC] Use the two-constant NR algorithm for refining estimates

Sat Mar 30 16:40:41 PDT 2019

nemanjai created this revision.
nemanjai added reviewers: hfinkel, renenkel, jsji, stefanp.
Herald added subscribers: jdoerfert, kbarton.
Herald added a project: LLVM.

The single-constant algorithm produces infinities for many denormal inputs. The two-constant algorithm is precise enough across the whole denormal range, so we switch to it for now to avoid those infinities. In the future, we will re-evaluate both to find the optimal algorithm for PowerPC.

Example:

  $ cat a.c 
  #include <stdio.h>
  #include <math.h>
  float __attribute__((noinline)) test(float f) { return sqrtf(f); }
  int main(void) {
    return printf("sqrt(0.49e-43): %g\n", test(0.49e-43));
  }

  $ clang -Ofast a.c
  $ ./a.out 
  sqrt(0.49e-43): -inf

Desired output (and output with this patch applied):

  $ ./a.out 
  sqrt(0.49e-43): 2.21462e-22

We have also run this through a broad sweep of inputs (1,000,000 tests per exponent over the full single-precision range, compared against the precise HW instruction). Here are the results from this test (courtesy of @renenkel):

    0 ulps:  72 %
    1 ulps:  27 %
    2 ulps:  0.032 %
    3 ulps:  0 %
   >3 ulps:  0.35 %

  max error = 2 ulps over full range
  except returns NaN for +Inf

Repository:
  rL LLVM

https://reviews.llvm.org/D60037

Files:
  lib/Target/PowerPC/PPC.td
  lib/Target/PowerPC/PPCISelLowering.cpp
  lib/Target/PowerPC/PPCSubtarget.h
  test/CodeGen/PowerPC/fma-mutate.ll
  test/CodeGen/PowerPC/fmf-propagation.ll
  test/CodeGen/PowerPC/recipest.ll
  test/CodeGen/PowerPC/vsx-fma-mutate-trivial-copy.ll
