[PATCH] D29485: [Builtin][ARM] Implement addsf3/__aeabi_fadd for Thumb1

Weiming Zhao via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 3 23:45:03 PST 2017


weimingz added a comment.

My previous comment about "136 bytes of stack usage" was inaccurate: I just found that the default build doesn't pass any -O flag. Is it a problem that there is no default -O flag?

Internally, we use -Os.
With -Os, the C version uses 24 bytes of stack for spilling.

Regarding constants, the C version uses the following (some I can figure out easily):
 238:   7fffffff        .word   0x7fffffff ==> this mask clears the sign bit to get the Abs value. In the asm version, we do lsl and lsr instead. Since Abs is needed twice, the compiler may think loading the constant is better, but a load is more expensive.
 23c:   007fffff        .word   0x007fffff ==> similar mask. Used twice
 240:   55555555        .word   0x55555555 ==> no idea
 244:   33333333        .word   0x33333333 ==> no idea
 248:   0f0f0f0f        .word   0x0f0f0f0f ==> no idea
 24c:   01010101        .word   0x01010101 ==> no idea
 250:   7fc00000        .word   0x7fc00000 ==> Similar mask
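
The four "no idea" constants above match the classic SWAR population-count masks. One plausible source (an assumption, not confirmed from the generated code) is the count-leading-zeros used when normalizing the significand: Thumb1 has no CLZ instruction, so the compiler can expand it as a bit smear followed by a popcount. Separately, 0x7fc00000 is the bit pattern of the default single-precision quiet NaN. A minimal C sketch of that expansion, and of the mask vs. shift-pair forms of Abs, follows; all function names are hypothetical and chosen only for illustration:

```c
#include <stdint.h>

/* Abs of a float's bit pattern by masking off the sign bit.
   This form needs the 0x7fffffff literal loaded from a constant pool. */
static uint32_t abs_mask(uint32_t bits) { return bits & 0x7fffffffu; }

/* Same result with a shift pair (lsl #1, lsr #1), which needs no literal. */
static uint32_t abs_shift(uint32_t bits) { return (bits << 1) >> 1; }

/* SWAR population count: sums bits in 2-, 4-, then 8-bit fields, then
   adds the four byte counts with one multiply. */
static uint32_t popcount32(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0f0f0f0fu;
    return (x * 0x01010101u) >> 24;
}

/* Software count-leading-zeros: copy the highest set bit into every lower
   position, then count the zero bits that remain above it. */
static uint32_t clz32_soft(uint32_t x) {
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return popcount32(~x);
}
```

If this is indeed where the constants come from, a hand-written shift-and-test loop in assembly would avoid all four literal loads, which is consistent with the smaller size of the .S version.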

On code size, addsf3.c.o (using -Os) vs addsf3.S.o is 604 vs 312 bytes.
For performance, on an ARMv7 Android device, the asm version is 1.4x faster in my own micro benchmark. I don't have numbers for a Cortex-M0, but on a real project this asm version let us close the 25% gap against gcc.


https://reviews.llvm.org/D29485




