[llvm] [NVPTX] Select bfloat16 add/mul/sub as fma on SM80 (PR #121065)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 9 15:19:39 PST 2025
Artem-B wrote:
After checking the PTX spec, the question I've got is -- should we bother with this attempt to lower those ops to FMA at all?
It will only be beneficial on `sm_80` for PTX versions between 7.0 and 7.7 (CUDA-11.1 to 11.7).
These days nobody should be using anything older than CUDA-11.8, and even that is firmly on the way out.
I do not really see much point in adding something to support a use case that's already obsolete.
https://github.com/llvm/llvm-project/pull/121065