[llvm] [NVPTX] Improve NVVMReflect Efficiency (PR #134416)
Joseph Huber via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 4 13:37:34 PDT 2025
jhuber6 wrote:
> @AlexMaclean do you think we could reuse intrinsic autoupgrade machinery for this, instead of making reflect processing more complicated?
>
> It could be somewhat useful for other purposes. E.g. we could introduce a const-foldable (when we know the GPU we're targeting) `nvvm.arch()` which would return the `__CUDA_ARCH__` value and upgrade nvvm_reflect to it. A bonus is that it would also be useful for IR users to parametrize their code without relying on NVVMReflect.
>
> @jhuber6 would something like that help with some of your offloading cases? I recall you did run into trouble with NVVMReflect a while back.
All that really matters for correctness is that this pass always runs, and that it always does constant propagation + DCE when the reflect result is used directly as a branch condition (i.e. it decides a CFG edge).
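
To make the "constant prop + DCE" requirement concrete, here is a minimal toy sketch in C++. It is not the NVVMReflect implementation and does not use LLVM's IR classes: `Block`, `Function`, `foldReflect`, `removeUnreachable` and `makeExample` are hypothetical names invented for illustration, and folding unknown keys to 0 is an assumption of the toy. It only shows the shape of the transformation: replace the reflect query with a known constant, fold the branch that uses it directly, then delete the blocks that are no longer reachable.

```cpp
#include <algorithm>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Toy basic block. Hypothetical model for illustration, not LLVM's IR classes.
struct Block {
    std::string name;
    std::optional<std::string> reflectKey; // set: branch on reflect(key) != 0
    std::vector<std::string> succs;        // {ifTrue, ifFalse} when reflectKey is set
};

using Function = std::vector<Block>; // the first block is the entry

// Constant propagation step: replace each reflect query with its known value
// and fold the conditional branch into an unconditional one.
// Assumption of this toy: keys missing from `values` fold to 0.
void foldReflect(Function &fn, const std::map<std::string, int> &values) {
    for (Block &b : fn) {
        if (!b.reflectKey || b.succs.size() != 2) continue;
        auto it = values.find(*b.reflectKey);
        int v = (it == values.end()) ? 0 : it->second;
        std::string taken = (v != 0) ? b.succs[0] : b.succs[1];
        b.succs = {taken};
        b.reflectKey.reset();
    }
}

// DCE step: drop every block that is not reachable from the entry block.
void removeUnreachable(Function &fn) {
    if (fn.empty()) return;
    std::map<std::string, const Block *> byName;
    for (const Block &b : fn) byName[b.name] = &b;

    std::set<std::string> live;
    std::vector<std::string> work{fn.front().name};
    while (!work.empty()) {
        std::string n = work.back();
        work.pop_back();
        if (!live.insert(n).second) continue; // already visited
        auto it = byName.find(n);
        if (it == byName.end()) continue;     // dangling successor name
        for (const std::string &s : it->second->succs) work.push_back(s);
    }

    fn.erase(std::remove_if(fn.begin(), fn.end(),
                            [&](const Block &b) { return live.count(b.name) == 0; }),
             fn.end());
}

// Example CFG: the entry block branches on a reflect query for "__CUDA_FTZ".
Function makeExample() {
    return {
        {"entry", std::string("__CUDA_FTZ"), {"ftz_path", "precise_path"}},
        {"ftz_path", std::nullopt, {"exit"}},
        {"precise_path", std::nullopt, {"exit"}},
        {"exit", std::nullopt, {}},
    };
}
```

Calling `foldReflect` and then `removeUnreachable` on `makeExample()` with `__CUDA_FTZ` set to 1 leaves only `entry`, `ftz_path` and `exit`. The point of the toy is that the untaken path is deleted rather than merely skipped, which is why the folding and the cleanup have to run together when the query feeds a branch directly.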
https://github.com/llvm/llvm-project/pull/134416