[compiler-rt] [compiler-rt][ARM] Optimized f32 add/subtract for Armv6-M. (PR #154093)

Tue Aug 19 03:51:35 PDT 2025

statham-arm wrote:

> Do you happen to have a figure for code-size? One possible objection is someone preferring a smallest possible implementation for M0 at the expense of performance.

You're right, the code size is bigger in this implementation. The new `addsf3.S` assembles to 648 bytes of code, plus (at `-Os`) another 68 bytes for the helper `fnan2.c`, for 716 bytes in total. The old version was 312 bytes for `addsf3.S` and 22 bytes for the `subsf3.c` wrapper, 334 bytes in total. So the new code is a little over twice the size of the old.

> Presumably the denormal issue you found was unique to the existing Arm assembly implementation and not the general C implementation (otherwise the tests would fail on non v6-m platforms)?

Yes, the C version in `lib/builtins/addsf3.c` (well, really `lib/builtins/fp_add_impl.inc`) passes the new test in full. (Or rather, correctly skips all the NaN test cases and passes the rest.)
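
For illustration only, here is a minimal sketch of the kind of bit-exact check described above: denormal inputs are compared against an exact expected bit pattern, and any case involving a NaN is skipped rather than compared. The names `check_add`, `is_nan_bits`, `to_bits` and `from_bits` are hypothetical and are not taken from the PR's test. Note also that on a hard-float host the `+` below compiles to a hardware instruction; it only reaches `__addsf3` on a soft-float target such as Armv6-M.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its IEEE 754 single-precision bit pattern. */
static uint32_t to_bits(float f) {
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}

/* Reinterpret a 32-bit pattern as a float. */
static float from_bits(uint32_t u) {
  float f;
  memcpy(&f, &u, sizeof f);
  return f;
}

/* A NaN has an all-ones exponent and a nonzero mantissa. */
static int is_nan_bits(uint32_t u) {
  return (u & 0x7fffffffu) > 0x7f800000u;
}

/* Hypothetical helper: returns 1 on a bit-exact match, 0 on a mismatch,
 * and -1 if the case involves a NaN and is therefore skipped. */
static int check_add(uint32_t a, uint32_t b, uint32_t expected) {
  if (is_nan_bits(a) || is_nan_bits(b) || is_nan_bits(expected))
    return -1;
  /* volatile keeps the compiler from folding the addition at build time. */
  volatile float x = from_bits(a);
  volatile float y = from_bits(b);
  return to_bits(x + y) == expected;
}
```

Assuming the environment does not flush denormals to zero, `check_add(0x00000001u, 0x00000001u, 0x00000002u)` (smallest denormal added to itself) and `check_add(0x007fffffu, 0x00000001u, 0x00800000u)` (largest denormal plus smallest denormal, giving the smallest normal) should both report a match, while a case with a NaN input such as `0x7fc00000` is skipped.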

https://github.com/llvm/llvm-project/pull/154093