[llvm-dev] clang 4.0.0: Invalid code for builtin floating point function with -mfloat-abi=hard -ffast-math (ARM)

Wed Mar 22 13:42:59 PDT 2017

On 22.03.2017 at 02:38, Friedman, Eli wrote:
>
> That is probably unintentional.  Granted, using -mfloat-abi like this is
> kind of weird, but I think clang's behavior is supposed to be
> gcc-compatible.  See https://reviews.llvm.org/rL291909 and
> https://bugs.llvm.org/show_bug.cgi?id=30543 for the most recent work in
> this area.

Why is using -mfloat-abi weird?

I also do not agree that whether a function is treated as a builtin 
should depend on -ffast-math (it does not with gcc).

Obviously rL291909 is related. It explicitly states -mfloat-abi should 
be ignored for builtins.
(It does not actually ignore -mfloat-abi unless -ffast-math is specified)

I do not think this is how gcc actually behaves.
gcc does not ignore -mfloat-abi: the option really does change the 
calling convention for builtin functions to VFP.

And with gcc, -ffast-math does *not* change the calling convention (CC). 
I tested this with the latest binary I could find (gcc version 6.3.1 
20170215):

https://developer.arm.com/-/media/Files/downloads/gnu-rm/6_1-2017q1/gcc-arm-none-eabi-6-2017-q1-update-win32.exe?product=GNU%20ARM%20Embedded%20Toolchain,32-bit,,Windows,6-2017-q1-update

Here is a more elaborate example of the problem:

fail.c:

   extern float sinf (float x);
   float sin1 (float x) {return (sinf (x) + 1.0);}

arm-none-eabi-gcc.exe -O2 -mfloat-abi=hard -ffast-math -S fail.c -o -

   sin1:
     push    {r4, lr}
     bl      sinf
     vldr.32 s15, .L3
     pop     {r4, lr}
     vadd.f32        s0, s0, s15
     bx      lr

This uses the VFP calling convention (and the code is optimal).

Compiled with clang/llvm 4.0.0:
clang.exe -target armv7a-none-none-eabi -O2 -mfloat-abi=hard -ffast-math 
-S fail.c -o -

   sin1:
     push    {r11, lr}
     mov     r11, sp
     vmov    r0, s0
     bl      sinf
     vmov.f32        d16, #1.000000e+00
     vmov    d17, r0, r0
     vadd.f32        d0, d17, d16
     pop     {r11, pc}

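This uses the non-VFP calling convention: the argument is moved from s0 
to r0 before the call and the result is read back from r0. That is 
incompatible with the gcc code.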

Finally, using an eabihf target or -fno-fast-math corrects the problem. 
With eabihf:

clang.exe -target armv7a-none-none-eabihf -O2 -mfloat-abi=hard 
-ffast-math -S fail.c -o -

   sin1:
     push    {r11, lr}
     mov     r11, sp
     vpush   {d8}
     vmov.f32        d8, #1.000000e+00
     bl      sinf
     vadd.f32        d0, d0, d8
     vpop    {d8}
     pop     {r11, pc}

This uses the VFP calling convention, as expected.
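
As a quick sanity check of what the front end believes the float ABI to be, something like the following could be compiled with each of the command lines above. It is only a sketch: it assumes the compiler defines the ACLE macro __ARM_PCS_VFP for the hard-float variant and __arm__ for 32-bit ARM targets, and the names pcs_is_vfp and pcs_name are made up for illustration. It reports only what the predefined macros say, and tells nothing about the convention used for calls the optimizer emits on its own, which is the actual problem here.

```c
/* Sketch: report which procedure-call standard the compiler claims to
 * target. Relies on the ACLE predefined macro __ARM_PCS_VFP (assumption:
 * the compiler in use defines it). pcs_is_vfp and pcs_name are
 * hypothetical helper names, not part of any library. */

/* Returns 1 for the VFP (hard-float) variant, 0 for the base variant,
 * and -1 when not compiling for 32-bit ARM at all. */
int pcs_is_vfp(void)
{
#if !defined(__arm__)
    return -1;
#elif defined(__ARM_PCS_VFP)
    return 1;
#else
    return 0;
#endif
}

/* Human-readable form of the same information. */
const char *pcs_name(void)
{
    switch (pcs_is_vfp()) {
    case 1:  return "aapcs-vfp (float arguments in s0, d0, ...)";
    case 0:  return "base aapcs (float arguments in r0-r3)";
    default: return "not a 32-bit ARM target";
    }
}
```

If the macro reports the VFP variant while the generated call to sinf still goes through r0, the mismatch is in how the library call is lowered and not in the selected ABI.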

(I don't like llvm's handling of r11 and d8, though - the optimizer 
needs to be optimized...:-)

IMHO the current handling is incompatible with gcc (and with previous 
releases of clang).

Cheers,

Peter