[llvm] [NVPTX] Support llvm.{exp2, log2} for f32/f16/bf16 and vectors (PR #120519)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 16:07:17 PST 2024
Artem-B wrote:
Hmm. Looking at the front-end, I see that we forward `exp2f` to `__nv_exp2f`:
https://github.com/llvm/llvm-project/blob/6f983f88537415952ec528c42f89f1d5b620fe68/clang/lib/Headers/__clang_cuda_math.h#L112
The interesting part is that `__nv_exp2f` in libdevice implements things via `@llvm.nvvm.ex2.approx...`
```
; Function Attrs: alwaysinline nounwind
define float @__nv_exp2f(float %x) #0 {
  %1 = call i32 @__nvvm_reflect(ptr @.str) #6
  %2 = icmp ne i32 %1, 0
  br i1 %2, label %3, label %5

3:                                                ; preds = %0
  %4 = call float @llvm.nvvm.ex2.approx.ftz.f(float %x) #6
  br label %__exp2f.exit

5:                                                ; preds = %0
  %6 = call float @llvm.nvvm.ex2.approx.f(float %x) #6
  br label %__exp2f.exit

__exp2f.exit:                                     ; preds = %3, %5
  %.0 = phi float [ %4, %3 ], [ %6, %5 ]
  ret float %.0
}
```
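For anyone who would rather not read the IR, here is a rough host-side C++ model of that control flow. It is only a sketch under a few assumptions: `model_nv_exp2f` and `flush_subnormal` are hypothetical names, the `ftz` flag stands in for whatever `__nvvm_reflect(@.str)` returns (the string itself is not visible in the dump above, I am assuming it is the FTZ setting), and `std::exp2` stands in for `ex2.approx`, so the approximation error of the real instruction is not modeled.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: flush a subnormal float to a sign-preserving zero,
// which is what the .ftz variant is assumed to do to inputs and results.
inline float flush_subnormal(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Hypothetical model of the two branches in __nv_exp2f.
// `ftz` plays the role of the __nvvm_reflect result being nonzero.
// std::exp2 is a placeholder for ex2.approx and is more accurate than it.
inline float model_nv_exp2f(float x, bool ftz) {
    if (ftz) {
        // Branch %3: ex2.approx.ftz.f
        return flush_subnormal(std::exp2(flush_subnormal(x)));
    }
    // Branch %5: ex2.approx.f
    return std::exp2(x);
}
```

The point is just that both branches end up in the approximate instruction; the reflect check only picks between the flushing and non-flushing variant.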
Considering that CUDA has been living with that implementation all this time, perhaps *that* is the way we should handle things here, too. In other words, using `ex2.approx` unconditionally may be OK after all.
https://github.com/llvm/llvm-project/pull/120519