[llvm] Enable .ptr .global .align attributes for kernel attributes for CUDA (PR #114874)
Lewis Crawford via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 14 08:03:00 PST 2025
LewisCrawford wrote:
This appears to be a limitation of the CUDA API. In CUDA, only `.global` and generic pointers are valid kernel arguments. Other address spaces such as `.local` and `.shared` are valid in PTX, but only in the context of the OpenCL API. That is why this compiles fine with ptxas but fails to load within CUDA.
I should add some error-checking to the code here to make this clearer, and update the tests.
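As a rough illustration of the kind of check described above, here is a minimal sketch. It is not the code from the PR: the helper names `isValidCudaKernelPtrAddrSpace` and `checkCudaKernelPtrParam` are hypothetical, and the only parts taken from LLVM are the NVPTX address-space numbers (0 generic, 1 global, 3 shared, 4 const, 5 local). The rule it encodes is the one stated in this comment, that CUDA kernel pointer arguments must be generic or `.global`.

```cpp
#include <string>

// NVPTX address-space numbers as used in LLVM IR.
enum NVPTXAddrSpace : unsigned {
  Generic = 0,
  Global = 1,
  Shared = 3,
  Const = 4,
  Local = 5,
};

// Hypothetical helper (not an LLVM API): returns true if a kernel pointer
// parameter in address space AS is acceptable under CUDA, following the
// rule stated in the comment above (generic or .global only).
inline bool isValidCudaKernelPtrAddrSpace(unsigned AS) {
  return AS == Generic || AS == Global;
}

// Hypothetical helper: returns an empty string when the parameter is
// acceptable, otherwise a diagnostic message for the caller to report.
inline std::string checkCudaKernelPtrParam(unsigned AS) {
  if (isValidCudaKernelPtrAddrSpace(AS))
    return "";
  return "kernel pointer parameter in address space " + std::to_string(AS) +
         " is not supported under CUDA; only generic (0) and global (1) are";
}
```

A caller would run `checkCudaKernelPtrParam` on each pointer parameter of a kernel when targeting CUDA and report any non-empty result as an error, so the failure shows up at compile time rather than at module load.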
https://github.com/llvm/llvm-project/pull/114874