[llvm-bugs] [Bug 34994] New: sqrt(denormal float) gives -infinity with fast-math

Wed Oct 18 12:34:23 PDT 2017

https://bugs.llvm.org/show_bug.cgi?id=34994

            Bug ID: 34994
           Summary: sqrt(denormal float) gives -infinity with fast-math
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: newtonallen3 at gmail.com
                CC: llvm-bugs at lists.llvm.org

Created attachment 19314
  --> https://bugs.llvm.org/attachment.cgi?id=19314&action=edit
repro case

$ cat sqrt_denormal_fastmath.cc

#include <cmath>
#include <iostream>

__attribute__((noinline)) void print_sqrt(float val) {
  const float root = sqrt(val);
  std::cout << "sqrt(" << val << ") = " << root << std::endl;
}

int main(int argc, char** argv) {
  print_sqrt(1e-34);
  print_sqrt(1e-36);
  print_sqrt(1e-38);
  print_sqrt(1e-40);
  print_sqrt(1e-42);
  print_sqrt(1e-44);
  print_sqrt(1e-46);
}

$ clang++ sqrt_denormal_fastmath.cc -O2 -std=c++11 -ffast-math
$ ./a.out 
sqrt(1e-34) = 1e-17
sqrt(1e-36) = 1e-18
sqrt(1e-38) = -inf
sqrt(9.99995e-41) = -inf
sqrt(1.00053e-42) = -inf
sqrt(9.80909e-45) = -inf
sqrt(0) = 0

The computed square root is correct for normalized floats and for zero, but for
denormal floats the result is negative infinity, which is completely wrong.

The square root for denormal floats should either be approximately correct, or
perhaps just rounded to zero.
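
To make it easier to see which of the repro inputs fall into the denormal
range, here is a small sketch using std::fpclassify. The helper names
is_denormal and check_repro_inputs are made up for illustration and are not
part of the attached repro case. It assumes a build without -ffast-math.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Returns true if `val` is a denormal (subnormal) float: nonzero, but smaller
// in magnitude than FLT_MIN, the smallest normalized float (about 1.17549e-38).
bool is_denormal(float val) {
  return std::fpclassify(val) == FP_SUBNORMAL;
}

// Sanity checks against the inputs used in the repro case.
void check_repro_inputs() {
  assert(!is_denormal(1e-34f));  // normalized
  assert(!is_denormal(1e-36f));  // normalized
  assert(is_denormal(1e-38f));   // just below FLT_MIN
  assert(is_denormal(1e-40f));
  assert(is_denormal(1e-42f));
  assert(is_denormal(1e-44f));
  assert(!is_denormal(0.0f));    // zero is its own class
}
```

This lines up with the output above: the first input that prints -inf (1e-38)
is also the first one below FLT_MIN.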

The problem here is that sqrt is computed in fast-math mode on x86 using the
reciprocal square root instruction (rsqrtss), which returns infinity for an
input value of zero *or* any denormal float. The instructions after rsqrtss fix
up the input=zero case, but don't handle the input=denormal case, so the
infinity propagates through the remaining arithmetic and comes out as negative
infinity.

Here's the generated assembly for the sqrt call above
(https://godbolt.org/g/hvbKPQ)

        rsqrtss xmm3, xmm0
        movaps  xmm1, xmm0
        movaps  xmm4, xmm0
        movss   dword ptr [rsp + 12], xmm4 # 4-byte Spill
        mulss   xmm1, xmm3
        movss   xmm2, dword ptr [rip + .LCPI0_0] # xmm2 = mem[0],zero,zero,zero
        mulss   xmm2, xmm1
        mulss   xmm1, xmm3
        addss   xmm1, dword ptr [rip + .LCPI0_1]
        mulss   xmm1, xmm2
        xorps   xmm0, xmm0
        cmpeqss xmm0, xmm4
        andnps  xmm0, xmm1

The last three instructions (xorps, cmpeqss, andnps) check if the input was
zero and set the output to zero if so. However, there's no check for whether
the input was denormal.
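
The sequence above can be modeled in plain C++ to see where the -inf comes
from. This is only a sketch under assumptions: model_rsqrtss is a software
stand-in for the instruction (exact rather than approximate for normalized
inputs), and the constants -0.5 and -3.0 are assumed values for .LCPI0_0 and
.LCPI0_1, which are not shown in the listing. Build it without -ffast-math so
that infinity and NaN behave as in IEEE 754.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Software stand-in for rsqrtss (assumption, not the real instruction).
// As described in the report, zero and denormal inputs yield infinity.
float model_rsqrtss(float x) {
  assert(x >= 0.0f);  // this sketch only covers non-negative inputs
  const int cls = std::fpclassify(x);
  if (cls == FP_ZERO || cls == FP_SUBNORMAL) {
    return std::numeric_limits<float>::infinity();
  }
  return 1.0f / std::sqrt(x);
}

// Mirrors the instruction sequence: estimate, refinement arithmetic, and
// finally the zero fix-up.
float model_fast_sqrt(float x) {
  const float kMinusHalf = -0.5f;   // assumed value of .LCPI0_0
  const float kMinusThree = -3.0f;  // assumed value of .LCPI0_1
  const float est = model_rsqrtss(x);  // rsqrtss xmm3, xmm0
  float t = x * est;                   // mulss   xmm1, xmm3
  const float h = kMinusHalf * t;      // mulss   xmm2, xmm1
  t = t * est;                         // mulss   xmm1, xmm3
  t = t + kMinusThree;                 // addss   xmm1, [.LCPI0_1]
  t = t * h;                           // mulss   xmm1, xmm2
  return (x == 0.0f) ? 0.0f : t;       // xorps / cmpeqss / andnps
}
```

For a denormal input the estimate is +inf, so x * est is +inf, the value scaled
by -0.5 is -inf, and the final product is -inf. For a zero input the
intermediate values are NaN, but the last line replaces them with zero. A
denormal input does not compare equal to zero, so it slips past that check.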

Relevant code:
 - lib/Target/X86/X86ISelLowering.cpp: X86TargetLowering::getSqrtEstimate()
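
One possible shape for a fix, sketched in plain C++ rather than in terms of
getSqrtEstimate(): widen the zero check so that it also covers denormals, by
comparing the magnitude of the input against FLT_MIN. This takes the "rounded
to zero" option mentioned above. The function name guarded_fast_sqrt is
hypothetical, the constants -0.5 and -3.0 are assumed, and the exact reciprocal
square root stands in for rsqrtss.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical guarded estimate: returns 0 for zero *and* denormal inputs.
// Compiled code would more likely build a compare mask and apply it with
// andnps after the arithmetic; the early return here has the same effect.
float guarded_fast_sqrt(float x) {
  assert(x >= 0.0f);  // this sketch only covers non-negative inputs
  if (std::fabs(x) < FLT_MIN) {
    return 0.0f;  // zero or denormal: flush the result to zero
  }
  const float est = 1.0f / std::sqrt(x);  // stand-in for rsqrtss
  const float t = x * est;                // roughly sqrt(x)
  return (-0.5f * t) * (t * est + -3.0f); // refinement arithmetic
}
```

With this guard, the denormal inputs from the repro case would print 0 instead
of -inf, while normalized inputs are unaffected.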
