[llvm] [SLP]Initial FMAD support (PR #149102)

David Green via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 10 07:57:13 PDT 2025


davemgreen wrote:

> > A note: we saw some fallout from this in internal performance testing too. Something like this example, doing a reduction of an fmul under fast-math, is no longer vectorized by the SLP vectorizer: https://godbolt.org/z/rYWM7dxEj. On AArch64 it wasn't helped by a different set of cost calculations that mark an fmul the same cost as an fma, but that example is x86. The original fadds in a reduction can be combined into fmas, but the expanded reduction will still only become fma for part of it.
> 
> There should be a follow-up patch to support fma-based reductions

Sounds great. The performance regressions were pretty large, and people here wouldn't be happy with something so large being broken for any length of time. It's a multiply-accumulate, after all; they come up everywhere. I've added some phase-ordering tests in a976843033485ff44bb4bbb0b0b8a537956b4c40.
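
For concreteness, here is a minimal sketch of the pattern in question (the `dot8` name and exact shape are illustrative, not the source behind the godbolt link):

```c++
// A fixed-trip-count fmul reduction under fast-math. After full unrolling
// this is straight-line code that the SLP vectorizer could previously turn
// into a vector multiply feeding a fadd reduction.
float dot8(const float *a, const float *b) {
  float sum = 0.0f;
  for (int i = 0; i < 8; ++i) // fully unrolled at -O3, leaving a scalar chain
    sum += a[i] * b[i];       // fmul feeding a reassociable fadd reduction
  return sum;
}
```

As I understand the regression, when this is built with something like -O3 -ffast-math, the scalar fmul/fadd pairs are now kept as fma candidates, and since the reduction matching doesn't yet recognize an fma-based chain the whole thing stays scalar; that should be what the follow-up patch addresses.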

https://github.com/llvm/llvm-project/pull/149102
