[llvm] [NVPTX] Support llvm.{exp2, log2} for f32 and vector of f32 (PR #120519)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 11:21:41 PST 2024
Artem-B wrote:
We already have explicit flags to enable approximate reciprocal and sqrt, and these instructions should follow a similar pattern.
I agree that enabling them automatically for fast-math may be confusing (though it may be worth checking whether other platforms have similar situations that could give us some guidelines on how to handle this).
Letting the user enable these instructions explicitly should work.
Letting the compiler generate low-precision results will likely break things at runtime (there's a lot of existing code that assumes host and device compilations will produce nearly identical results). I'd prefer things to fail early, in a painfully obvious way, if the compiler can't do something correctly.
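To make the host/device concern concrete, here is a rough host-only C++ sketch. `approx_exp2` is a hypothetical stand-in for a low-precision device instruction (it is not how any real NVPTX instruction computes its result, and its polynomial coefficients are just an illustrative fit), and `rel_error` is a helper name made up for this example. It only shows how an approximate result can drift away from the accurate `std::exp2` value.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a low-precision device instruction.
// This is an assumption for illustration only, NOT the behavior of any
// real NVPTX instruction: split x into integer and fractional parts and
// approximate 2^f on [0, 1) with a rough quadratic.
float approx_exp2(float x) {
  assert(std::isfinite(x));              // sketch only handles finite inputs
  float i = std::floor(x);
  float f = x - i;                       // fractional part, in [0, 1)
  float p = 1.0f + f * (0.6565f + f * 0.3435f);  // equals 1 at f=0, 2 at f=1
  return std::ldexp(p, static_cast<int>(i));     // scale by 2^i
}

// Relative difference between the "host" (accurate) result and the
// "device" (approximate) result for the same input.
float rel_error(float x) {
  float exact = std::exp2(x);
  return std::fabs(approx_exp2(x) - exact) / exact;
}
```

With this stand-in, integer inputs match the accurate result while fractional inputs do not, so any code that compares host and device outputs with a tight tolerance would start failing only for some inputs, which is the kind of silent runtime breakage described above.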
https://github.com/llvm/llvm-project/pull/120519