[llvm] [NVPTX] Support llvm.{exp2, log2} for f32/f16/bf16 and vectors (PR #120519)

Thu Dec 19 16:07:17 PST 2024

Artem-B wrote:

Hmm. Looking at the front-end, I see that we forward `exp2f` to `__nv_exp2f`:
https://github.com/llvm/llvm-project/blob/6f983f88537415952ec528c42f89f1d5b620fe68/clang/lib/Headers/__clang_cuda_math.h#L112

The interesting part is that `__nv_exp2f` in libdevice implements things via `@llvm.nvvm.ex2.approx...`

```
; Function Attrs: alwaysinline nounwind
define float @__nv_exp2f(float %x) #0 {
  %1 = call i32 @__nvvm_reflect(ptr @.str) #6
  %2 = icmp ne i32 %1, 0
  br i1 %2, label %3, label %5

3:                                                ; preds = %0
  %4 = call float @llvm.nvvm.ex2.approx.ftz.f(float %x) #6
  br label %__exp2f.exit

5:                                                ; preds = %0
  %6 = call float @llvm.nvvm.ex2.approx.f(float %x) #6
  br label %__exp2f.exit

__exp2f.exit:                                     ; preds = %3, %5
  %.0 = phi float [ %4, %3 ], [ %6, %5 ]
  ret float %.0
}
```
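
For readers less used to libdevice IR, the control flow above can be modeled on the host roughly as follows. This is only a sketch under assumptions: `g_cuda_ftz`, `model_nvvm_reflect_ftz`, and the other `model_*` names are hypothetical, `@.str` is assumed to name the `__CUDA_FTZ` setting (the string is not shown in the snippet), and `std::exp2` stands in for the hardware `ex2.approx` instruction, so the accuracy differs from what the GPU would produce.

```cpp
#include <cmath>

// Hypothetical stand-in for the value __nvvm_reflect would return.
// Assumption: @.str in the IR above names the "__CUDA_FTZ" setting.
bool g_cuda_ftz = false;
int model_nvvm_reflect_ftz() { return g_cuda_ftz ? 1 : 0; }

// Flush subnormal values to sign-preserving zero, as an .ftz variant would.
float flush_subnormal(float v) {
  return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Models llvm.nvvm.ex2.approx.f. std::exp2 is only a placeholder here;
// the real instruction is an approximation with its own error bound.
float model_ex2_approx(float x) { return std::exp2(x); }

// Models llvm.nvvm.ex2.approx.ftz.f: subnormal inputs and results are flushed.
float model_ex2_approx_ftz(float x) {
  return flush_subnormal(std::exp2(flush_subnormal(x)));
}

// Mirrors the branch structure of __nv_exp2f in the IR above:
// both paths use the approximate instruction, only FTZ handling differs.
float model_nv_exp2f(float x) {
  if (model_nvvm_reflect_ftz() != 0)
    return model_ex2_approx_ftz(x);
  return model_ex2_approx(x);
}
```

The point of the model is that neither branch falls back to a more precise expansion; the reflect check only selects between the `.ftz` and non-`.ftz` forms of the same approximate instruction.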

Considering that CUDA has been living with that implementation all this time, perhaps *that* is the way we should handle things here, too. In other words, using `ex2.approx` may be OK unconditionally, after all.

https://github.com/llvm/llvm-project/pull/120519