[clang] [Cuda] Handle -fcuda-short-ptr even with -nocudalib (PR #111682)
Fraser Cormack via cfe-commits
cfe-commits at lists.llvm.org
Wed Oct 9 07:18:02 PDT 2024
frasercrmck wrote:
> Seems reasonable, which architectures require this? I know that NVIDIA deprecated the 32-bit `nvptx` target in CUDA 12 or something.
I'm not an expert on CUDA but, AFAICT, even in 64-bit CUDA, certain pointers, such as those pointing to shared memory, are 32-bit, because shared memory is only a matter of kilobytes in size. Using short pointers generates better code, uses fewer registers, etc. Personally, I'm not sure why the option isn't enabled by default; it seems like `nvcc` does this by default.
I was just playing with the option downstream and noticed this issue.
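To make the shared-memory case concrete, here is a minimal sketch of a kernel that indexes a `__shared__` array. The kernel name, the block size, and the `sm_70` architecture in the suggested command line are illustrative assumptions, not taken from the PR. As I understand it, building the device side with and without `-fcuda-short-ptr` and comparing the PTX should show the shared-memory address arithmetic moving from 64-bit to 32-bit registers, but I have not checked the output for this exact example.

```cuda
// Hypothetical example: reverse each block-sized chunk through shared memory.
// Assumed device-only build for inspecting PTX (architecture is an assumption):
//   clang++ -x cuda --cuda-gpu-arch=sm_70 --cuda-device-only -S \
//       -fcuda-short-ptr reverse.cu -o reverse.s
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kBlockSize = 256;

__global__ void reverseInBlock(int *data) {
  // `tile` lives in the shared address space; pointers into it are the ones
  // that -fcuda-short-ptr is meant to narrow to 32 bits.
  __shared__ int tile[kBlockSize];

  int tid = threadIdx.x;
  int base = blockIdx.x * blockDim.x;

  tile[tid] = data[base + tid];
  __syncthreads();  // every element must be staged before any thread reads

  data[base + tid] = tile[blockDim.x - 1 - tid];
}

int main() {
  int host[kBlockSize];
  for (int i = 0; i < kBlockSize; ++i)
    host[i] = i;

  int *dev = nullptr;
  cudaMalloc(&dev, sizeof(host));
  cudaMemcpy(dev, host, sizeof(host), cudaMemcpyHostToDevice);

  // One block of kBlockSize threads, matching the size of the shared tile.
  reverseInBlock<<<1, kBlockSize>>>(dev);

  cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
  cudaFree(dev);

  printf("first=%d last=%d\n", host[0], host[kBlockSize - 1]);
  return 0;
}
```

Pointers into global memory, such as `data` here, stay 64-bit either way; only the address spaces the option covers are narrowed.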
https://github.com/llvm/llvm-project/pull/111682