[llvm] [NVPTX] Custom lower ADDRSPACECAST (PR #125607)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 17:38:29 PST 2025
Artem-B wrote:
> Are there any cases where an addrspacecast like this and the PTX we're emitting after this change would be well defined?
It depends on how much we'd be willing to peek under the hood and rely on ptxas implementation details.
https://docs.nvidia.com/cuda/parallel-thread-execution/#generic-addressing says:
> The state spaces .const, [Kernel Function Parameters](https://docs.nvidia.com/cuda/parallel-thread-execution/#kernel-function-parameters) (.param), .local and .shared are modeled as windows within the generic address space. Each window is defined by a window base and a window size that is equal to the size of the corresponding state space. A generic address maps to global memory unless it falls within the window for const, local, or shared memory. The [Kernel Function Parameters](https://docs.nvidia.com/cuda/parallel-thread-execution/#kernel-function-parameters) (.param) window is contained within the .global window. Within each window, a generic address maps to an address in the underlying state space by subtracting the window base from the generic address.
The way I see it, an ASC from a global pointer to some other AS *may* happen to work, because the global->local conversion may effectively be a no-op, assuming that the PTX conversion does no checking of the actual pointer value and simply trusts us that the input is indeed in the global AS.
Someone could get a shared pointer in the generic AS, convert it to an integer, and then pass it around as a global pointer. In that case global->generic->shared would happen to work, but it would depend on too many implementation details of how ptxas handles AS conversion operations. The spec does not give us any promises on that.
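To make the "happens to work" case concrete, here is a toy model of the window scheme quoted above. It is only a sketch: the window bases and sizes, and the names `Space`, `Window`, `resolveGeneric`, and `toGeneric`, are made up for illustration. The real layout is a hardware/ptxas detail that PTX does not expose, and the assumption that global->generic is an identity mapping is exactly the kind of implementation detail the spec does not promise.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state-space tags; not LLVM or PTX identifiers.
enum class Space { Global, Const, Local, Shared };

struct Window {
  Space space;
  std::uint64_t base;  // window base inside the generic address space
  std::uint64_t size;  // size of the underlying state space
};

// Made-up windows, purely for illustration.
const Window kWindows[] = {
    {Space::Const, 0x1000000, 0x10000},
    {Space::Local, 0x2000000, 0x100000},
    {Space::Shared, 0x3000000, 0x10000},
};

struct Resolved {
  Space space;
  std::uint64_t offset;
};

// A generic address maps to global memory unless it falls inside a window;
// inside a window, the state-space address is (generic - window base).
Resolved resolveGeneric(std::uint64_t generic) {
  for (const Window &w : kWindows)
    if (generic >= w.base && generic - w.base < w.size)
      return {w.space, generic - w.base};
  return {Space::Global, generic};
}

// Reverse direction: add the window base. Global is ASSUMED to be an
// identity mapping with no checking of the pointer value.
std::uint64_t toGeneric(Space space, std::uint64_t addr) {
  for (const Window &w : kWindows)
    if (w.space == space) {
      assert(addr < w.size && "address outside the state space");
      return w.base + addr;
    }
  return addr;
}
```

In this model, a generic address such as `0x3000010` that is mislabeled as global survives `toGeneric(Space::Global, ...)` unchanged and still resolves to shared offset `0x10`. A genuine shared offset `0x10` mislabeled as global would instead resolve to global memory, so the round trip only works when the unchecked identity assumption holds and the value was a generic address to begin with.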
I guess the short answer is that if we know we can't generate sensible code for the given IR, we should diagnose it, and the sooner the better. LLVM sort of assumes that IR is not only syntactically valid, but is also "sensible" for the target. One would not expect the compiler to do anything sensible on every target with syntactically valid IR that loads from AS(1234567), and ASCs that aren't supported by NVPTX fall into the same category.
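As a rough sketch of what "diagnose it early" could look like, the check below rejects casts the target has no sensible lowering for. The address-space numbers are the ones NVPTX uses; the helper names (`isKnownSpace`, `isSupportedCast`, `diagnoseCast`), the message text, and the rule that a cast is only representable when one side is generic are assumptions for illustration, not the code in the PR.

```cpp
#include <string>

// NVPTX address-space numbering.
enum AddressSpace : unsigned {
  Generic = 0,
  Global = 1,
  Shared = 3,
  Const = 4,
  Local = 5,
  Param = 101,
};

// Hypothetical helper: is this an address space the target knows about?
bool isKnownSpace(unsigned as) {
  switch (as) {
  case Generic: case Global: case Shared:
  case Const: case Local: case Param:
    return true;
  default:
    return false;  // e.g. AS(1234567)
  }
}

// Assumed rule: a cast is representable only between a specific space and
// generic (in either direction), or trivially within the same space.
bool isSupportedCast(unsigned src, unsigned dst) {
  if (!isKnownSpace(src) || !isKnownSpace(dst))
    return false;
  if (src == dst)
    return true;
  return src == Generic || dst == Generic;
}

// Returns an empty string when the cast is fine, otherwise a diagnostic.
std::string diagnoseCast(unsigned src, unsigned dst) {
  if (isSupportedCast(src, dst))
    return "";
  return "unsupported addrspacecast from AS(" + std::to_string(src) +
         ") to AS(" + std::to_string(dst) + ")";
}
```

Under these assumptions, global->generic passes, while global->local and anything involving AS(1234567) produce a diagnostic instead of PTX whose behavior depends on ptxas internals.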
https://github.com/llvm/llvm-project/pull/125607