<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - AArch64 does not implement FLT_ROUNDS"

   href="https://llvm.org/bugs/show_bug.cgi?id=25191">25191</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>AArch64 does not implement FLT_ROUNDS

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: AArch64

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>ed@80386.nl

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

<pre>The C99 FLT_ROUNDS macro is supposed to evaluate to the current rounding mode of

the processor's floating-point unit. Clang's __builtin_flt_rounds() can be used

to implement the FLT_ROUNDS macro.

Consider the following piece of code:

int foo(void) {

  return __builtin_flt_rounds();

}

For x86-64, Clang generates the following code:

0000000000000000 &lt;foo&gt;:

   0:   55                      push   %rbp

   1:   48 89 e5                mov    %rsp,%rbp

   4:   d9 7d f0                fnstcw -0x10(%rbp)

   7:   0f b7 45 f0             movzwl -0x10(%rbp),%eax

   b:   89 c1                   mov    %eax,%ecx

   d:   81 e1 00 04 00 00       and    $0x400,%ecx

  13:   c1 e9 09                shr    $0x9,%ecx

  16:   25 00 08 00 00          and    $0x800,%eax

  1b:   c1 e8 0b                shr    $0xb,%eax

  1e:   09 c8                   or     %ecx,%eax

  20:   ff c0                   inc    %eax

  22:   83 e0 03                and    $0x3,%eax

  25:   5d                      pop    %rbp

  26:   c3                      retq   

This is correct: it reads the x87 control word and maps its rounding-control

bits to the FLT_ROUNDS encoding. If we build exactly the same code with Clang

targeting AArch64, we obtain:

0000000000000000 &lt;foo&gt;:

   0:   320003e0        orr     w0, wzr, #0x1

   4:   d65f03c0        ret

This is not correct: the function unconditionally returns 1 (round to nearest)

without reading any FPU state. It should return a value based on the RMode

field, bits 23:22 of FPCR.
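
As a rough sketch of the result I would expect (plain C, not LLVM lowering
code), the RMode field can be mapped to the FLT_ROUNDS encoding by adding one
modulo four. The function names flt_rounds_from_fpcr() and current_flt_rounds()
are hypothetical and only serve as illustration. The sketch assumes the usual
encodings: RMode 0 = to nearest, 1 = toward +infinity, 2 = toward -infinity,
3 = toward zero; FLT_ROUNDS 0 = toward zero, 1 = to nearest, 2 = toward
+infinity, 3 = toward -infinity.

```c
/* Hypothetical helper: translate a raw FPCR value into the value that
 * FLT_ROUNDS should yield. Only bits 23:22 (RMode) are inspected. */
static inline int flt_rounds_from_fpcr(unsigned long long fpcr)
{
    unsigned rmode = (unsigned)((fpcr >> 22) & 3);

    /* RMode 0,1,2,3 corresponds to FLT_ROUNDS 1,2,3,0. */
    return (int)((rmode + 1) & 3);
}

#if defined(__aarch64__)
/* Hypothetical helper: read FPCR with an MRS instruction (GNU-style inline
 * assembly, only compiled when targeting AArch64) and translate it. */
static inline int current_flt_rounds(void)
{
    unsigned long long fpcr;

    __asm__ volatile("mrs %0, fpcr" : "=r"(fpcr));
    return flt_rounds_from_fpcr(fpcr);
}
#endif
```

The same add-one-and-mask step is what the x86-64 code above performs after it
has extracted the rounding-control bits.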

I tried patching Clang myself to support this, but I have to confess that I

don't know what I'm doing. I started out by copying

ARMTargetLowering::LowerFLT_ROUNDS_() to

AArch64TargetLowering::LowerFLT_ROUNDS_(), but I have no idea how to read the

FPCR register from there.</pre>

        </div>

      </p>


    </body>

</html>