[llvm-bugs] [Bug 34994] New: sqrt(denormal float) gives -infinity with fast-math
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Oct 18 12:34:23 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=34994
Bug ID: 34994
Summary: sqrt(denormal float) gives -infinity with fast-math
Product: libraries
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: newtonallen3 at gmail.com
CC: llvm-bugs at lists.llvm.org
Created attachment 19314
--> https://bugs.llvm.org/attachment.cgi?id=19314&action=edit
repro case
$ cat sqrt_denormal_fastmath.cc
#include <cmath>
#include <iostream>

__attribute__((noinline)) void print_sqrt(float val) {
  const float root = sqrt(val);
  std::cout << "sqrt(" << val << ") = " << root << std::endl;
}

int main(int argc, char** argv) {
  print_sqrt(1e-34);
  print_sqrt(1e-36);
  print_sqrt(1e-38);
  print_sqrt(1e-40);
  print_sqrt(1e-42);
  print_sqrt(1e-44);
  print_sqrt(1e-46);
}
$ clang sqrt_denormal_fastmath.cc -O2 -std=c++11 -ffast-math
$ ./a.out
sqrt(1e-34) = 1e-17
sqrt(1e-36) = 1e-18
sqrt(1e-38) = -inf
sqrt(9.99995e-41) = -inf
sqrt(1.00053e-42) = -inf
sqrt(9.80909e-45) = -inf
sqrt(0) = 0
The computed square root is correct for normalized floats and for zero, but for
denormal floats the result is negative infinity, which is completely wrong.
The square root for denormal floats should either be approximately correct, or
perhaps just rounded to zero.
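To make it easier to see which of the repro inputs fall into which category, here is a small sketch that classifies each literal after conversion to float, the same conversion that happens when a double literal is passed to print_sqrt(float). The helper name classify_as_float is hypothetical and is not part of the attached repro case.

```cpp
#include <cmath>
#include <string>

// Hypothetical helper: classify a double literal after narrowing to float.
// The smallest normal float is about 1.17549e-38, so 1e-38 is already
// denormal, and 1e-46 is below half the smallest denormal, so it rounds to 0.
std::string classify_as_float(double literal) {
  const float f = static_cast<float>(literal);
  switch (std::fpclassify(f)) {
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "denormal";
    case FP_NORMAL:    return "normal";
    default:           return "other";  // infinity or NaN
  }
}
```

For the values in the repro, 1e-34 and 1e-36 should come out as normal, 1e-38 through 1e-44 as denormal, and 1e-46 as zero, which matches the "sqrt(0) = 0" line in the output above. This sketch needs to be built without -ffast-math so that the classification is reliable.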
The problem here is that sqrt is computed in fast-math mode on x86 using the
reciprocal square root instruction (rsqrtss), which returns infinity for an
input value of zero *or* any denormal float. The instructions after rsqrtss fix
up the input=zero case, but don't handle the input=denormal case.
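The following is a rough, portable model of that behaviour, not the actual backend output. It assumes that the estimate is refined with one Newton-Raphson step of the form (-0.5 * x * e) * (x * e * e + -3.0); the two constants are an assumption, since the report does not show the constant pool. The function names are hypothetical, the estimate for normal inputs is replaced by an exact 1/sqrt(x), and only non-negative inputs are considered.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for rsqrtss: per the description above, a zero or
// denormal input yields +infinity. For normal inputs the real instruction
// returns an approximation; this model uses the exact value instead.
float rsqrt_estimate_model(float x) {
  const int cls = std::fpclassify(x);
  if (cls == FP_ZERO || cls == FP_SUBNORMAL) {
    return std::numeric_limits<float>::infinity();
  }
  return 1.0f / std::sqrt(x);
}

// One refinement step followed by a fix-up that handles only x == 0.
// The constants -0.5 and -3.0 are assumed, not taken from the report.
float sqrt_from_estimate_model(float x) {
  const float e = rsqrt_estimate_model(x);
  const float xe = x * e;                 // roughly sqrt(x) for normal x
  const float scaled = -0.5f * xe;        // -inf when xe is +inf
  const float refined = (xe * e + -3.0f) * scaled;
  return (x == 0.0f) ? 0.0f : refined;    // zero-only fix-up
}
```

Under these assumptions a denormal input gives e = +inf, so x * e is +inf, the scaled term is -inf, and the product is -inf, which the zero-only fix-up leaves untouched. For x == 0 the intermediate value is NaN (0 * inf), but the fix-up replaces it with 0. The model must be compiled without -ffast-math for the infinities and NaNs to behave as described.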
Here's the generated assembly for the sqrt call above
(https://godbolt.org/g/hvbKPQ):
rsqrtss xmm3, xmm0
movaps xmm1, xmm0
movaps xmm4, xmm0
movss dword ptr [rsp + 12], xmm4 # 4-byte Spill
mulss xmm1, xmm3
movss xmm2, dword ptr [rip + .LCPI0_0] # xmm2 = mem[0],zero,zero,zero
mulss xmm2, xmm1
mulss xmm1, xmm3
addss xmm1, dword ptr [rip + .LCPI0_1]
mulss xmm1, xmm2
xorps xmm0, xmm0
cmpeqss xmm0, xmm4
andnps xmm0, xmm1
The last three instructions (xorps, cmpeqss, andnps) check if the input was
zero and set the output to zero if so. However, there's no check for whether
the input was denormal.
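As an illustration of the "rounded to zero" option, the sketch below contrasts a zero-only fix-up with a hypothetical wider one that also masks denormal inputs by comparing against the smallest normal float. Both function names are made up for this example, and this is only one possible approach, not necessarily what the backend should or would emit.

```cpp
#include <cmath>
#include <limits>

// Fix-up equivalent to the xorps/cmpeqss/andnps sequence: only an input
// that compares equal to zero forces the result to zero.
float fixup_zero_only(float x, float refined) {
  return (x == 0.0f) ? 0.0f : refined;
}

// Hypothetical wider fix-up: any input whose magnitude is below the
// smallest normal float (zero or denormal) forces the result to zero.
float fixup_zero_and_denormal(float x, float refined) {
  const float smallest_normal = std::numeric_limits<float>::min();
  return (std::fabs(x) < smallest_normal) ? 0.0f : refined;
}
```

With a refined value of -inf and a denormal input such as 1e-40f, the first function passes -inf through while the second returns 0. Normal inputs are unaffected by either version.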
Relevant code:
- lib/Target/X86/X86ISelLowering.cpp: X86TargetLowering::getSqrtEstimate()