[llvm] [NVPTX] Improve NVVMReflect Efficiency (PR #134416)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 4 13:16:45 PDT 2025


Artem-B wrote:

@AlexMaclean do you think we could reuse intrinsic autoupgrade machinery for this, instead of making reflect processing more complicated?

It could be somewhat useful for other purposes.
E.g. we could introduce an `nvvm.arch()` intrinsic that is const-foldable when we know the GPU we're targeting and returns the `__CUDA_ARCH__` value, and upgrade `nvvm_reflect` to it. As a bonus, it would also let IR users parametrize their code without relying on NVVMReflect.
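
A rough sketch of the folding rule I have in mind, written as a standalone C++ model rather than actual LLVM code. The names `FoldResult`, `foldArch` and `upgradeReflect` are hypothetical, and `nvvm.arch()` itself does not exist today. The sketch assumes the result uses the same encoding as `__CUDA_ARCH__` (SM version times 10, so sm_70 gives 700) and that an SM version of 0 stands for "target not known yet".

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: `folded` is false when the call must stay in the IR.
struct FoldResult {
  bool folded;
  unsigned value;
};

// Hypothetical model of const-folding the proposed `nvvm.arch()`.
// Assumption: smVersion == 0 means the target GPU is not known yet.
inline FoldResult foldArch(unsigned smVersion) {
  if (smVersion == 0)
    return {false, 0};           // unknown target: nothing to fold
  return {true, smVersion * 10}; // same encoding as __CUDA_ARCH__
}

// Hypothetical model of upgrading a reflect query to the arch intrinsic.
// Only the "__CUDA_ARCH" query is mapped; everything else is left alone.
inline FoldResult upgradeReflect(const std::string &query, unsigned smVersion) {
  if (query == "__CUDA_ARCH")
    return foldArch(smVersion);
  return {false, 0}; // e.g. "__CUDA_FTZ" is out of scope for this sketch
}
```

With this model, `upgradeReflect("__CUDA_ARCH", 90)` folds to 900, while the same query with an unknown target is left unfolded until the GPU is known.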

@jhuber6 would something like that help with some of your offloading cases? I recall you ran into trouble with NVVMReflect a while back.


https://github.com/llvm/llvm-project/pull/134416

