[llvm] [NVPTX] Select bfloat16 add/mul/sub as fma on SM80 (PR #121065)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 9 15:19:39 PST 2025


Artem-B wrote:

After checking the PTX spec, the question I've got is: should we bother lowering those ops to FMA at all?

It will only be beneficial on `sm_80` for PTX versions 7.0 through 7.7 (CUDA-11.1 to 11.7).

These days nobody should be using anything older than CUDA-11.8, and even that is firmly on the way out.
I do not really see much point in adding something to support a use case that's already obsolete.


https://github.com/llvm/llvm-project/pull/121065

