[llvm] [NVPTX] Improve NVVMReflect Efficiency (PR #134416)

Joseph Huber via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 4 13:37:34 PDT 2025


jhuber6 wrote:

> @AlexMaclean do you think we could reuse intrinsic autoupgrade machinery for this, instead of making reflect processing more complicated?
> 
> It could be somewhat useful for other purposes. E.g. we could introduce a const-foldable (when we know the GPU we're targeting) `nvvm.arch()` which would return the `__CUDA_ARCH__` value and upgrade nvvm_reflect to it. Bonus point is that it would also be useful for IR users to parametrize their code without relying on NVVMReflect.
> 
> @jhuber6 would something like that help with some of your offloading cases? I recall you did run into trouble with NVVMReflect a while back.

All that really matters for correctness is that this pass always runs and that it always does constant propagation + DCE when the reflect result is used directly as a branch condition.
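To illustrate why the folding matters for correctness and not just speed, here is a minimal host-side sketch in C++17. It is not the pass itself: `reflect_stub` and `select_path` are hypothetical names, `if constexpr` merely stands in for constant propagation plus DCE, and the value 800 for sm_80 and the 0 returned for unknown keys are modeling assumptions about how the reflect value is encoded.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical host-side stand-in for a reflect query: returns the value
// assumed to be substituted for `key`. Unknown keys are modeled as 0.
constexpr int reflect_stub(std::string_view key, int arch) {
  return key == "__CUDA_ARCH" ? arch : 0;
}

// Models a device function with an architecture-specific path.
template <int Arch>
constexpr int select_path() {
  // `if constexpr` plays the role of constant propagation + DCE here:
  // the untaken branch is discarded at compile time, so code that the
  // target could not execute never reaches later stages.
  if constexpr (reflect_stub("__CUDA_ARCH", Arch) >= 800) {
    return 1;  // stands for code that needs newer-architecture instructions
  } else {
    return 0;  // portable fallback
  }
}

// Small self-check of the model; call from any test driver.
inline void check_select_path() {
  assert(select_path<800>() == 1);
  assert(select_path<700>() == 0);
}
```

If the branch were left unresolved, both paths would survive into code generation, which is the situation the reply above says must never happen when the reflect value directly guards a branch.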

https://github.com/llvm/llvm-project/pull/134416

