<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - x87 single precision float underflow hidden (by incorrect double precision store?)"

   href="https://llvm.org/bugs/show_bug.cgi?id=26931">26931</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>x87 single precision float underflow hidden (by incorrect double precision store?)

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>dimitry@andric.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Recently on the FreeBSD toolchain mailing list, Steve Kargl posted a

test case [1] which shows that clang appears to mask, or hide, x87

single precision float underflows.

The test case is small enough to include directly in this description:

#include <fenv.h>

#include <stdio.h>

__attribute__((__noinline__))

float foo()

{

  static const volatile float tiny = 1.e-30f;

  return (tiny * tiny);

}

int main(void)

{

  float x;

  feclearexcept(FE_ALL_EXCEPT);

  x = foo();

  if (fetestexcept(FE_UNDERFLOW))

    printf("FE_UNDERFLOW: ");

  printf("x = %e\n", x);

  return 0;

}

The product, 1e-60, is far below the smallest single precision value, so

the multiplication underflows a single precision float, and this should

raise FE_UNDERFLOW.  With gcc (I tested 5.3.0) this works as expected.

But with clang trunk r263389, what appears to happen is that it either

stores the FP status word directly after an fmul (which does *not*

raise an underflow, since the result stays in an x87 register with a

wider exponent range than single precision), or it uses fstpl to store

the result as double precision, again not raising an underflow.
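
To make the magnitudes concrete, here is a small sketch that checks where
a value falls relative to the normal ranges of float and double, using
only the limits from float.h.  The helper names are hypothetical and not
part of the original test case:

```c
#include <float.h>

/* Hypothetical helper: absolute value without pulling in math.h. */
static double abs_value(double v)
{
  return v < 0.0 ? -v : v;
}

/* Returns 1 if v is nonzero but smaller in magnitude than the smallest
   normal float, i.e. narrowing it to float lands in the underflow range. */
int below_float_normal_range(double v)
{
  double a = abs_value(v);
  return a != 0.0 && a < (double)FLT_MIN;
}

/* Same check against the smallest normal double. */
int below_double_normal_range(double v)
{
  double a = abs_value(v);
  return a != 0.0 && a < DBL_MIN;
}
```

For the product in the test case, 1e-60, the first check returns 1 and
the second returns 0, which is consistent with a double precision store
having no underflow to report.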

E.g. with clang -m32 -O2 -S, the following assembly is the result:

foo:

        flds    foo.tiny

        fmuls   foo.tiny

        retl

[...]

main:

[...]

        calll   feclearexcept

        calll   foo

        fstpl   20(%esp)                # 8-byte Folded Spill

        movl    $16, (%esp)

        calll   fetestexcept

        testl   %eax, %eax

So the fmuls does not raise an underflow by itself, and the result from

the call to foo() is then stored using fstpl, i.e. as double precision.

Since 1e-60 is representable as a double, that store does not raise an

underflow either.  It appears as if clang is (incorrectly?) using an

intermediate value with greater precision.

In contrast, with gcc -m32 -O2 -S, the following assembly results:

foo:

        flds    tiny.2247

        flds    tiny.2247

        fmulp   %st, %st(1)

        ret

[...]

main:

[...]

        call    feclearexcept

        call    foo

        movl    $16, (%esp)

        fstps   12(%esp)

        call    fetestexcept

        testl   %eax, %eax

I.e. here the result of foo() is stored using fstps, which narrows it to

single precision and raises an underflow.  This can then be detected

through fetestexcept().

The outcome of this sample is clearly incorrect in clang's case, and it

seems it should store the function result in a single precision floating

point value, not a double precision one.

[1]

<a href="https://lists.freebsd.org/pipermail/freebsd-toolchain/2016-March/002077.html">https://lists.freebsd.org/pipermail/freebsd-toolchain/2016-March/002077.html</a></pre>

        </div>

      </p>


    </body>

</html>