[llvm-bugs] [Bug 26931] New: x87 single precision float underflow hidden (by incorrect double precision store?)

Sun Mar 13 16:23:17 PDT 2016

https://llvm.org/bugs/show_bug.cgi?id=26931

            Bug ID: 26931
           Summary: x87 single precision float underflow hidden (by
                    incorrect double precision store?)
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: dimitry at andric.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Recently on the FreeBSD toolchain mailing list, Steve Kargl posted a
test case [1] which shows that clang appears to mask, or hide, x87
single precision float underflows.

The test case is simple enough to post directly in this description:

#include <fenv.h>
#include <stdio.h>

__attribute__((__noinline__))
float foo()
{
  static const volatile float tiny = 1.e-30f;
  return (tiny * tiny);
}

int main(void)
{
  float x;
  feclearexcept(FE_ALL_EXCEPT);
  x = foo();
  if (fetestexcept(FE_UNDERFLOW))
    printf("FE_UNDERFLOW: ");
  printf("x = %e\n", x);
  return 0;
}

So when the multiplication is executed, it underflows a single precision
float, and this should generate an FE_UNDERFLOW.  With gcc (I tested
5.3.0) this works as expected.

But with clang trunk r263389, what appears to happen is that it either
reads the FP status word directly after an fmul (which does *not*
generate an underflow, since the x87 register holds the result in
extended precision), or it uses fstpl to store the result as double
precision, again not generating an underflow.

E.g. with clang -m32 -O2 -S, the following assembly is the result:

foo:
        flds    foo.tiny
        fmuls   foo.tiny
        retl
[...]
main:
[...]
        calll   feclearexcept
        calll   foo
        fstpl   20(%esp)                # 8-byte Folded Spill
        movl    $16, (%esp)
        calll   fetestexcept
        testl   %eax, %eax

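So the fmuls itself does not underflow, because the product (about
1e-60) fits in the extended precision x87 register.  The result from the
call to foo() is then stored using fstpl, and since 1e-60 is also
representable in double precision, that store does not underflow either.
(The x87 exception flags are sticky, so a later store would not clear a
flag that had already been set.)  It appears as if clang is
(incorrectly?) using an intermediate value with greater precision.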

In contrast, with gcc -m32 -O2 -S, the following assembly results:

foo:
        flds    tiny.2247
        flds    tiny.2247
        fmulp   %st, %st(1)
        ret
[...]
main:
[...]
        call    feclearexcept
        call    foo
        movl    $16, (%esp)
        fstps   12(%esp)
        call    fetestexcept
        testl   %eax, %eax

I.e. here the result of foo() is stored using fstps, which converts it
to single precision and generates an underflow.  This can then be
detected through fetestexcept().

The outcome of this sample is clearly incorrect in clang's case, and it
seems it should store the function result in a single precision floating
point value, not a double precision one.

[1]
https://lists.freebsd.org/pipermail/freebsd-toolchain/2016-March/002077.html
