[llvm] [NVPTX] Annotate CUDA kernel pointer arguments with .ptr .space .align attributes. (PR #79646)

Mon Feb 26 17:05:03 PST 2024

================
@@ -0,0 +1,34 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_72 2>&1 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcpu=sm_72 | %ptxas-verify %}
+
+%struct.Large = type { [16 x double] }
+
+; CHECK: .param .u64 .ptr .global .align 16 func_align_param_0,
+; CHECK: .param .u64 func_align_param_1,
----------------
Vandana2896 wrote:

`.ptr` is not annotated for the second argument under CUDA, as the code shows. Within the scope of this PR, CUDA support is limited to `.ptr .global` for objects in global memory.

@Artem-B there are two issues with this code that I had overlooked.
1. For the second argument in both cases, the current code incorrectly emits `.ptr .global` even though the pointer does not point to an object in the global memory space. The only check we have is the CUDA check; there is no check for the global memory space itself.
2. For the same second argument in both cases, I see `.ptr .global .align 0`. The alignment comes out as 0 because we assign it with `getParamAlign()`, which assumes that an alignment is specified.

To resolve both issues, I think we can stick with the initial scope of this PR: assign `.ptr .global` only to pointers whose alignment is specified, as in `ptr nocapture readonly align 16`. Reverting this change to the initial commit, [https://github.com/llvm/llvm-project/pull/79646/commits/d5bd0215f22440e10e1a4af2b4391973831795be](https://github.com/llvm/llvm-project/pull/79646/commits/d5bd0215f22440e10e1a4af2b4391973831795be), should resolve both.
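
To make the intended behavior concrete, here is a rough standalone sketch of the rule I have in mind. It is not the code in this PR and it does not use any LLVM API: `KernelPtrParam` and `emitParamDecl` are hypothetical names made up for illustration, and the explicit alignment is modeled as an optional value that is empty when the IR carries no `align N` attribute.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a CUDA kernel pointer parameter.
// This is NOT an LLVM type; it only models what the rule needs.
struct KernelPtrParam {
  std::string name;
  // Present only when the IR parameter carries an explicit `align N`.
  std::optional<unsigned> explicitAlign;
};

// Builds the PTX .param declaration for a 64-bit kernel pointer argument.
// The `.ptr .global .align N` annotation is added only when an explicit,
// non-zero alignment is present; otherwise the plain declaration is kept,
// so `.align 0` is never produced.
std::string emitParamDecl(const KernelPtrParam &p) {
  std::string decl = ".param .u64 ";
  if (p.explicitAlign && *p.explicitAlign != 0) {
    decl += ".ptr .global .align " + std::to_string(*p.explicitAlign) + " ";
  }
  decl += p.name;
  return decl;
}
```

With this rule, a parameter declared with `align 16` would produce the first CHECK line in the test above (minus the trailing comma), and a parameter without an alignment attribute would produce the plain `.param .u64` form of the second CHECK line.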

Any thoughts on this? 

https://github.com/llvm/llvm-project/pull/79646