[llvm] Enable .ptr .global .align attributes for kernel attributes for CUDA (PR #114874)
Lewis Crawford via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 8 08:19:00 PST 2024
================
@@ -1600,29 +1600,37 @@ void NVPTXAsmPrinter::emitFunctionParamList(const Function *F, raw_ostream &O) {
if (isKernelFunc) {
if (PTy) {
- // Special handling for pointer arguments to kernel
O << "\t.param .u" << PTySizeInBits << " ";
- if (static_cast<NVPTXTargetMachine &>(TM).getDrvInterface() !=
- NVPTX::CUDA) {
- int addrSpace = PTy->getAddressSpace();
- switch (addrSpace) {
- default:
- O << ".ptr ";
- break;
- case ADDRESS_SPACE_CONST:
- O << ".ptr .const ";
- break;
- case ADDRESS_SPACE_SHARED:
- O << ".ptr .shared ";
- break;
- case ADDRESS_SPACE_GLOBAL:
- O << ".ptr .global ";
- break;
- }
- Align ParamAlign = I->getParamAlign().valueOrOne();
- O << ".align " << ParamAlign.value() << " ";
+ int addrSpace = PTy->getAddressSpace();
+ const bool IsCUDA =
+ static_cast<NVPTXTargetMachine &>(TM).getDrvInterface() ==
+ NVPTX::CUDA;
+
+ O << ".ptr ";
+ switch (addrSpace) {
+ default:
+ // Special handling for pointer arguments to kernel
+ // CUDA kernels assume that pointers are in global address space
+ // See:
+ // https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parameter-state-space
+ if (IsCUDA)
+ O << " .global ";
+ break;
+ case ADDRESS_SPACE_CONST:
+ O << " .const ";
+ break;
+ case ADDRESS_SPACE_SHARED:
+ O << " .shared ";
+ break;
+ case ADDRESS_SPACE_GLOBAL:
+ O << " .global ";
+ break;
}
+
+ Align ParamAlign = I->getParamAlign().valueOrOne();
+ if (ParamAlign != 1 || !IsCUDA)
----------------
LewisCrawford wrote:
Ok, I've changed to always emit .align 1 if no explicit alignment is specified (for both CUDA and OpenCL).
The closest thing I've found in the langref to explaining the default behaviour is here: https://llvm.org/docs/LangRef.html#paramattrs
> Note that align 1 has no effect on non-byval, non-preallocated arguments.
As far as I can tell, if "align 1 has no effect", that sounds like no-alignment specified == align 1 == no assumptions made about alignment.
https://github.com/llvm/llvm-project/pull/114874
More information about the llvm-commits
mailing list