[llvm] [NVPTX] Improve support for {ex2,lg2}.approx (PR #120519)

Tue Jan 21 11:20:36 PST 2025

jhuber6 wrote:

> @Prince781 It appears that the tests are generating 32-bit PTX and it's no longer supported by recent CUDA versions.
> 
> ```
> [  1] ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
> [  2] ; RUN: llc < %s -mcpu=sm_20 -mattr=+ptx32 | FileCheck --check-prefixes=CHECK %s [OK]
> llc < third_party/llvm/llvm-project/llvm/test/CodeGen/NVPTX/f32-lg2.ll -mcpu=sm_20 -mattr=+ptx32 | third_party/llvm/llvm-project/llvm/FileCheck --allow-unused-prefixes --check-prefixes=CHECK third_party/llvm/llvm-project/llvm/test/CodeGen/NVPTX/f32-lg2.ll
> [  3] ; RUN: %if ptxas %{ llc < %s -mcpu=sm_20 -mattr=+ptx32 | %ptxas-verify %} [FAIL]
>  llc < third_party/llvm/llvm-project/llvm/test/CodeGen/NVPTX/f32-lg2.ll -mcpu=sm_20 -mattr=+ptx32 | third_party/gpus/cuda/_virtual_includes/_stage_runtime/third_party/gpus/cuda/bin/ptxas -arch=sm_60 -c -o /dev/null - 
> ptxas warning :  64 Bit host architecture (--machine) being used mismatches with .address_size of 32 bits
> ptxas fatal   :  32-Bit compilation is no longer supported
> Command failed: exit status 255
> ```
> 
> You can reproduce it by running the tests with `LLVM_PTXAS_EXECUTABLE=/path/to/cuda-12.6.0/bin/ptxas`

The triple is just missing `64`; I can probably fix it along with something else.
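
For illustration, a fix along these lines might look like the sketch below. It is not taken from the PR: it assumes the test currently targets the 32-bit `nvptx` triple, and the actual change may differ. Note that `-mattr=+ptx32` selects PTX ISA version 3.2 and is unrelated to the address size; the 32-bit `.address_size` that `ptxas` rejects comes from the triple.

```llvm
; Hypothetical sketch only; the real test contents and the eventual fix may differ.

; Option A (assumed): select the 64-bit target on the RUN lines.
; RUN: llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx32 | FileCheck --check-prefixes=CHECK %s
; RUN: %if ptxas %{ llc < %s -mtriple=nvptx64 -mcpu=sm_20 -mattr=+ptx32 | %ptxas-verify %}

; Option B (assumed): set the 64-bit triple in the module instead.
target triple = "nvptx64-nvidia-cuda"
```

Either option alone should make `llc` emit `.address_size 64`, which recent `ptxas` releases accept.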

https://github.com/llvm/llvm-project/pull/120519