[llvm] [NVPTX] Support llvm.{exp2, log2} for f32/f16/bf16 and vectors (PR #120519)

Fri Dec 20 08:27:49 PST 2024

jhuber6 wrote:

> NVIDIA's docs only [say](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=precision#floating-point-instructions-ex2) "The maximum absolute error is 2^-22.5 for fraction in the primary range."
> 
> However, there's a post on NVIDIA's forum claiming that the current implementation's error is 2.38344 ulp: https://forums.developer.nvidia.com/t/more-accurate-version-of-exp2f-with-no-change-in-performance/243209
> 
> Both nvcc and clang currently translate `exp2f` into `ex2.approx`. I think that should be fine. To be on the safe side, we can keep the gating option, but set the default to be enabled.

It would be really nice to have some tests for verifying mathematical precision in LLVM somewhere.
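
As a rough illustration of what such a test could measure, here is a minimal host-side sketch in Python that reports the worst-case error, in float32 ulps, of an `exp2` implementation against a double-precision reference. Everything in it is an assumption made for illustration: the helper names (`to_f32`, `ulp_f32`, `max_ulp_error`), the sample grid, and `crude_exp2`, which is a deliberately poor stand-in and not `ex2.approx`. It is not an existing LLVM facility, and a real test would have to run the generated PTX on a device.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp_f32(x: float) -> float:
    """Spacing of binary32 values at the magnitude of a nonzero x."""
    _, e = math.frexp(abs(x))  # abs(x) = m * 2**e with 0.5 <= m < 1
    # Clamp the exponent so subnormal magnitudes get the smallest spacing, 2**-149.
    return math.ldexp(1.0, max(e, -125) - 24)


def max_ulp_error(fn, inputs) -> float:
    """Worst error of fn, in binary32 ulps, relative to a binary64 reference."""
    worst = 0.0
    for x in inputs:
        x32 = to_f32(x)
        # Assumption: binary64 pow is accurate enough to act as ground truth here.
        reference = 2.0 ** x32
        result = to_f32(fn(x32))
        worst = max(worst, abs(result - reference) / ulp_f32(reference))
    return worst


def crude_exp2(x: float) -> float:
    """Hypothetical stand-in: linear interpolation between powers of two."""
    n = math.floor(x)
    return math.ldexp(1.0 + (x - n), n)


# Hypothetical sample grid: multiples of 1/64 in [-10, 10], all exact in binary32.
INPUTS = [i / 64.0 for i in range(-640, 641)]
```

Calling `max_ulp_error(crude_exp2, INPUTS)` gives the worst-case ulp error over the grid, which could then be compared against whatever bound is chosen for the instruction. A sparse grid like this only gives a lower bound on the true maximum error; since float32 has a small enough input space, an exhaustive sweep over all inputs is a realistic alternative.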

https://github.com/llvm/llvm-project/pull/120519