[PATCH] D138531: [PATCH] [NVPTX] Backend support for variadic functions

Tue May 16 13:57:12 PDT 2023

tra added a comment.

There's an interesting discrepancy between what the PTX spec says and what NVCC does.

The PTX spec (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#kernel-and-function-directives-func) allows passing variadic arguments to a function as an unsized array parameter. According to the same spec, "Parameters in .param space are accessed using ld.param and st.param instructions in the body."
However, the code generated by nvcc appears to use ld.local, not ld.param, to access the variadic parameters: https://godbolt.org/z/qh4rq5xxK
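
For reference, here is a minimal sketch of the kind of function in question. It is my own illustration, not the code behind the godbolt link, and sum_ints is a hypothetical name. It is written as plain host C++ so it compiles anywhere; I'm assuming the device variant would simply carry __device__. Each va_arg below is the access that, per the observation above, nvcc seems to emit as ld.local rather than ld.param.

```cpp
#include <cstdarg>

// Hypothetical helper (not from the patch): sums `count` int varargs.
// Assumption: a device version would be declared with __device__.
int sum_ints(int count, ...) {
    va_list ap;
    va_start(ap, count);           // point ap at the first variadic argument
    int total = 0;
    for (int i = 0; i < count; ++i) {
        // Each va_arg reads the next argument from the vararg buffer;
        // this read is the one whose address space is being discussed.
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

The fixed `count` parameter is only there so the callee knows how many variadic arguments to read; the question of which address space backs the buffer is independent of that choice.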

I'm talking to NVIDIA folks and they seem to struggle to address the discrepancy. Considering that local access for variadics has been there from the very beginning, and that there's probably no other viable way to pass an arbitrary amount of data to a thread, local memory is probably the only choice. This suggests that they probably just forgot to document this quirk.

Can you elaborate on your reason for lowering va_arg as a local address space access? Was it to mimic what NVCC does, or is this documented somewhere?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138531/new/

https://reviews.llvm.org/D138531