<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - AArch64 does not implement FLT_ROUNDS"
   href="https://llvm.org/bugs/show_bug.cgi?id=25191">25191</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>AArch64 does not implement FLT_ROUNDS
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: AArch64
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>ed@80386.nl
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The C99 FLT_ROUNDS macro is supposed to return the current rounding mode of the
processor's floating point unit. Clang's __builtin_flt_rounds() can be used to
implement the FLT_ROUNDS macro.

Consider the following piece of code:

int foo(void) {
  return __builtin_flt_rounds();
}

For x86-64, Clang generates the following code:

0000000000000000 &lt;foo&gt;:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   d9 7d f0                fnstcw -0x10(%rbp)
   7:   0f b7 45 f0             movzwl -0x10(%rbp),%eax
   b:   89 c1                   mov    %eax,%ecx
   d:   81 e1 00 04 00 00       and    $0x400,%ecx
  13:   c1 e9 09                shr    $0x9,%ecx
  16:   25 00 08 00 00          and    $0x800,%eax
  1b:   c1 e8 0b                shr    $0xb,%eax
  1e:   09 c8                   or     %ecx,%eax
  20:   ff c0                   inc    %eax
  22:   83 e0 03                and    $0x3,%eax
  25:   5d                      pop    %rbp
  26:   c3                      retq   
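
For reference, the sequence above reads the two rounding-control bits (bits
11:10) of the x87 control word, swaps them, adds one and masks the result. A
minimal C sketch of the same arithmetic follows; the helper name
flt_rounds_from_x87_cw is hypothetical and only illustrates the mapping.

```c
/* Hypothetical helper mirroring the x86-64 sequence above.
 * cw is the x87 control word as stored by fnstcw. */
static int flt_rounds_from_x87_cw(unsigned cw) {
  unsigned hi = (cw & 0x400u) >> 9;  /* control-word bit 10 becomes bit 1 */
  unsigned lo = (cw & 0x800u) >> 11; /* control-word bit 11 becomes bit 0 */
  /* Add one and wrap to get the FLT_ROUNDS encoding:
   * 0 = toward zero, 1 = to nearest, 2 = toward +inf, 3 = toward -inf. */
  return (int)(((hi | lo) + 1u) & 3u);
}
```

With the default control word 0x037f (round to nearest), this yields 1.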

The x86-64 output is correct. If we build exactly the same code with Clang
targeting aarch64, we obtain:

0000000000000000 &lt;foo&gt;:
   0:   320003e0        orr     w0, wzr, #0x1
   4:   d65f03c0        ret

This is not correct: the function unconditionally returns 1 (round to nearest).
It should return a value based on the RMode field, bits 23:22 of
FPCR.
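
As a rough sketch of the expected behaviour, and not existing LLVM or Clang
code: RMode encodes 0 = to nearest, 1 = toward +inf, 2 = toward -inf and
3 = toward zero, so the FLT_ROUNDS value is (RMode + 1) modulo 4. The helper
names flt_rounds_from_fpcr and current_flt_rounds are hypothetical, and the
inline assembly assumes a GCC- or Clang-compatible compiler.

```c
/* Hypothetical helper: map a raw FPCR value to the FLT_ROUNDS encoding. */
static int flt_rounds_from_fpcr(unsigned long long fpcr) {
  /* RMode lives in FPCR bits 23:22. */
  unsigned rmode = (unsigned)((fpcr >> 22) & 3u);
  /* FLT_ROUNDS: 0 = toward zero, 1 = to nearest,
   * 2 = toward +inf, 3 = toward -inf. */
  return (int)((rmode + 1u) & 3u);
}

#if defined(__aarch64__)
/* Hypothetical wrapper: read FPCR with MRS and convert it.
 * Only compiled when targeting AArch64. */
static int current_flt_rounds(void) {
  unsigned long long fpcr;
  __asm__ volatile("mrs %0, fpcr" : "=r"(fpcr));
  return flt_rounds_from_fpcr(fpcr);
}
#endif
```

An FPCR value of 0 (the round-to-nearest default) maps to 1, which is the
only value the current aarch64 output can ever return.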

I tried patching up Clang myself to support this, but I have to confess that I
don't know what I'm doing. I started out copying
ARMTargetLowering::LowerFLT_ROUNDS_() to
AArch64TargetLowering::LowerFLT_ROUNDS_(), but then I have no idea how to fetch
the FPCR register.</pre>
        </div>
      </p>
    </body>
</html>