[llvm] [NVPTX] Support llvm.{exp2, log2} for f32/f16/bf16 and vectors (PR #120519)
Joseph Huber via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 20 08:27:49 PST 2024
jhuber6 wrote:
> NVIDIA's docs only [say](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=precision#floating-point-instructions-ex2) "The maximum absolute error is 2^-22.5 for fraction in the primary range."
>
> However, there's a post on NVIDIA's forum claiming that the error of the current implementation is 2.38344 ulp: https://forums.developer.nvidia.com/t/more-accurate-version-of-exp2f-with-no-change-in-performance/243209
>
> Both nvcc and clang currently translate `exp2f` into `ex2.approx`. I think that should be fine. To be on the safe side, we can keep the gating option, but set the default to be enabled.
It would be really nice to have some tests for verifying mathematical precision in LLVM somewhere.
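
As a rough illustration of what such a precision check could look like, here is a minimal sketch that measures the error of an f32 `exp2` implementation in ulps against a double-precision reference. This is not existing LLVM test infrastructure; the helper names (`to_f32`, `ulp_f32`, `ulp_error`, `max_ulp_error`) are hypothetical, and the sketch assumes results in the normal f32 range only.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp_f32(x: float) -> float:
    """Spacing of binary32 values in the binade containing x (normal range only)."""
    _, e = math.frexp(x)  # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(1.0, e - 24)  # 24-bit significand


def ulp_error(approx: float, reference: float) -> float:
    """Distance between approx and reference, in binary32 ulps of reference."""
    return abs(approx - reference) / ulp_f32(reference)


def max_ulp_error(fn, inputs) -> float:
    """Worst-case ulp error of fn against a binary64 reference for 2**x."""
    return max(ulp_error(fn(x), 2.0 ** x) for x in inputs)


# Hypothetical stand-in for the implementation under test: a correctly
# rounded f32 exp2. A real test would call the device function instead.
def exp2_f32_reference(x: float) -> float:
    return to_f32(2.0 ** x)


# Sample inputs that are exactly representable in binary32.
inputs = [i / 64 for i in range(-640, 641)]
worst = max_ulp_error(exp2_f32_reference, inputs)
```

A real test would compare `worst` against a documented bound and would need a denser or exhaustive sweep of f32 inputs; the coarse grid above is only for illustration.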
https://github.com/llvm/llvm-project/pull/120519