[clang] [Clang] Add float type support to __builtin_reduce_add and __builtin_reduce_multipy (PR #120367)

Fri Jan 3 10:38:14 PST 2025

RKSimon wrote:

> > fp reductions are a nightmare - every time I thought we were getting somewhere, something else fastmath related causes more headaches.
> 
> @RKSimon should I not proceed with this PR? I was really hoping to use this in HLSL as it makes implementing many of our runtime apis easy in the frontend.

I'd definitely suggest you reduce the scope - maybe always expand the reduction serially in the frontend and not create the LLVM intrinsic at all? Or maybe add an alternative `__builtin_reduce_fastmath_fadd / fmul` that makes it clear what's happening, and have them always emit the reassoc variants of the reductions? There was a lot of resistance to reductions that change behaviour depending on which fast-math flags are enabled.
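
To illustrate why the evaluation order matters, here is a minimal sketch in plain C++ (no builtins or intrinsics involved). The helper names `reduce_add_serial`, `reduce_add_pairwise` and `check_orders_differ` are hypothetical and exist only for this example; it assumes IEEE-754 single precision with default rounding and no fast-math options.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: strict left-to-right sum, ((v0 + v1) + v2) + v3.
// This is the order a serial expansion in the frontend would produce.
float reduce_add_serial(const std::array<float, 4> &v) {
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = acc + v[i];
  return acc;
}

// Hypothetical helper: pairwise sum, (v0 + v1) + (v2 + v3).
// This is one of the orders a reassociating reduction is free to pick.
float reduce_add_pairwise(const std::array<float, 4> &v) {
  float lo = v[0] + v[1];
  float hi = v[2] + v[3];
  return lo + hi;
}

// Same input, two evaluation orders, two different results.
void check_orders_differ() {
  const std::array<float, 4> v = {1e8f, 1.0f, -1e8f, 1.0f};
  // Serial: 1e8f + 1.0f rounds back to 1e8f, the big terms cancel, then + 1.0f.
  assert(reduce_add_serial(v) == 1.0f);
  // Pairwise: both halves round to +/-1e8f, so both 1.0f terms are lost.
  assert(reduce_add_pairwise(v) == 0.0f);
}
```

If one builtin could produce either result depending on the active fast-math flags, the same source would silently change meaning between builds; a separately named builtin would at least make the reassociation explicit.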

CC @andykaylor @jcranmer-intel 

https://github.com/llvm/llvm-project/pull/120367