[llvm] [NVPTX] Select bfloat16 add/mul/sub as fma on SM80 (PR #121065)

Thu Jan 9 15:19:39 PST 2025

Artem-B wrote:

After checking the PTX spec, my question is: should we bother with this attempt to lower those ops to FMA at all?

It will only be beneficial on `sm_80` for PTX versions between 7.0 and 7.7 (CUDA-11.1 to 11.7).

These days nobody should be using anything older than CUDA-11.8 and even that is firmly on the way out.
I do not really see much point in adding something to support a use case that's already obsolete.

https://github.com/llvm/llvm-project/pull/121065