[llvm] [NVPTX] Improve device function byval parameter lowering (PR #129188)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 28 12:18:12 PST 2025
Artem-B wrote:
Counter-example: the PTX docs do allow `mov` followed by `ld.param`, though only for kernel parameters. See https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-function-parameters
> The address of a kernel parameter may be moved into a register using the mov instruction. The resulting address is in the .param state space and is accessed using ld.param instructions.
```
Example
.entry bar ( .param .b32 len )
{
    .reg .u32 %ptr, %n;

    mov.u32      %ptr, len;
    ld.param.u32 %n, [%ptr];
    ...
```
https://github.com/llvm/llvm-project/pull/129188