[clang] [Clang] Add float type support to __builtin_reduce_add and __builtin_reduce_multipy (PR #120367)

Mon Jan 6 02:07:53 PST 2025

Il-Capitano wrote:

`__builtin_reduce_add/mul` is defined to do _recursive even-odd pairwise reduction_ in Clang, and the `@llvm.vector.reduce.fadd` intrinsic doesn't do that.
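
To make the difference concrete, here is a minimal sketch in portable C++ that emulates both evaluation orders. The helper names `sequential_reduce_add` and `pairwise_reduce_add` are made up for illustration and are not Clang or LLVM APIs. The pairwise helper assumes a non-empty input whose length is a power of two, and the sketch assumes plain IEEE `float` arithmetic with no fast-math flags.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sequential (ordered) reduction: ((v[0] + v[1]) + v[2]) + v[3] ...
// Hypothetical helper; returns 0.0f for an empty input.
float sequential_reduce_add(const std::vector<float>& v) {
    if (v.empty()) return 0.0f;
    float acc = v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        acc = acc + v[i];
    }
    return acc;
}

// Recursive even-odd pairwise reduction: each pass adds the elements at
// indices 2*i and 2*i+1, halving the length until one value remains.
// Hypothetical helper; assumes v is non-empty with a power-of-two length.
float pairwise_reduce_add(std::vector<float> v) {
    while (v.size() > 1) {
        std::vector<float> next(v.size() / 2);
        for (std::size_t i = 0; i < next.size(); ++i) {
            next[i] = v[2 * i] + v[2 * i + 1];
        }
        v = std::move(next);
    }
    return v[0];
}
```

With the input `{1e8f, 1.0f, -1e8f, 1.0f}`, the sequential order absorbs the first `1.0f` into `1e8f`, cancels against `-1e8f`, and then adds the last `1.0f`. The pairwise order absorbs both `1.0f` values before the large terms cancel, so the two orders give different results for the same vector.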

I'm not sure if there's any benefit to this definition over a sequential reduction. The only thing I can think of that could be affected is signed-integer overflow being UB, but that doesn't seem to be optimized for, or checked by UBSan: https://godbolt.org/z/TEsc86rf5.

Maybe, as RKSimon suggested, having a separate builtin for floating-point types could be better? Something like `__builtin_reduce_fadd/fmul` that does sequential reduction, and another one for unordered reduction?  
This builtin could also take a start value as the first argument to better match the LLVM intrinsic.

> Or maybe an alternative `__builtin_reduce_fastmath_fadd / fmul` that makes it clear whats happening, and have them always emit the reassoc variants of the reductions?

I think `__builtin_reduce_fadd_unordered` or something similar would be a better choice, since "fastmath" implies more restrictions/assumptions than `reassoc` (no NaNs, no signed zeros, etc.).

https://github.com/llvm/llvm-project/pull/120367