[llvm] [NVPTX] cap param alignment at 128 (max supported by ptx) (PR #96117)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 19 14:55:23 PDT 2024


https://github.com/AlexMaclean created https://github.com/llvm/llvm-project/pull/96117

Cap the parameter alignment at 128 bytes, the maximum alignment supported by PTX. The restriction is stated in Note D of the parameter passing section of the [PTX Writer's Guide to Interoperability](https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/index.html#parameter-passing):

> D. The alignment must be 1, 2, 4, 8, 16, 32, 64, or 128 bytes.
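
As a rough illustration of the clamping described above, here is a minimal standalone sketch. It is not the LLVM code: plain integers stand in for `llvm::Align`, and the names `kMaxPTXParamAlign`, `paramAlign`, and `canRaiseAlignment` are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constant: upper bound on .param alignment per Note D.
constexpr uint64_t kMaxPTXParamAlign = 128;

// Hypothetical stand-in for the patched logic.
// abiAlign:          ABI alignment of the parameter type, in bytes.
// canRaiseAlignment: true when the function is local and all of its uses
//                    are visible, so the alignment may be raised to 16.
constexpr uint64_t paramAlign(uint64_t abiAlign, bool canRaiseAlignment) {
  // Cap first, so the result never exceeds what PTX accepts.
  const uint64_t capped = std::min(kMaxPTXParamAlign, abiAlign);
  if (!canRaiseAlignment)
    return capped; // external users rely on the (capped) ABI alignment
  return std::max(uint64_t(16), capped);
}

// An over-aligned type is clamped; a small one is raised only when allowed.
static_assert(paramAlign(256, false) == 128, "capped at the PTX maximum");
static_assert(paramAlign(4, true) == 16, "raised to 16 for local functions");
```

Assuming the 256-byte `<64 x i32>` vector carries a 256-byte ABI alignment, the first case mirrors the `.align 128` that the new test expects.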

From fd680eac8ed28f1032e641c8b26f922837824548 Mon Sep 17 00:00:00 2001
From: Alex MacLean <amaclean at nvidia.com>
Date: Fri, 14 Jun 2024 17:15:59 +0000
Subject: [PATCH] [NVPTX] cap param alignment at 128 (max supported by ptx)

---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp |  8 +++++---
 llvm/test/CodeGen/NVPTX/max-align.ll        | 14 ++++++++++++++
 2 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/NVPTX/max-align.ll

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index f4ef7c9914f13..982c191875750 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -5038,7 +5038,9 @@ bool NVPTXTargetLowering::getTgtMemIntrinsic(
 /// ensures that alignment is 16 or greater.
 Align NVPTXTargetLowering::getFunctionParamOptimizedAlign(
     const Function *F, Type *ArgTy, const DataLayout &DL) const {
-  const uint64_t ABITypeAlign = DL.getABITypeAlign(ArgTy).value();
+  // Capping the alignment to 128 bytes as that is the maximum alignment
+  // supported by PTX.
+  const Align ABITypeAlign = std::min(Align(128), DL.getABITypeAlign(ArgTy));
 
   // If a function has linkage different from internal or private, we
   // must use default ABI alignment as external users rely on it. Same
@@ -5048,10 +5050,10 @@ Align NVPTXTargetLowering::getFunctionParamOptimizedAlign(
                          /*IgnoreCallbackUses=*/false,
                          /*IgnoreAssumeLikeCalls=*/true,
                          /*IgnoreLLVMUsed=*/true))
-    return Align(ABITypeAlign);
+    return ABITypeAlign;
 
   assert(!isKernelFunction(*F) && "Expect kernels to have non-local linkage");
-  return Align(std::max(uint64_t(16), ABITypeAlign));
+  return std::max(Align(16), ABITypeAlign);
 }
 
 /// Helper for computing alignment of a device function byval parameter.
diff --git a/llvm/test/CodeGen/NVPTX/max-align.ll b/llvm/test/CodeGen/NVPTX/max-align.ll
new file mode 100644
index 0000000000000..c8b1cb12dee5f
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/max-align.ll
@@ -0,0 +1,14 @@
+; RUN: llc < %s -march=nvptx64 -O0 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 -O0 | %ptxas-verify %}
+
+
+; CHECK: .visible .func  (.param .align 128 .b8 func_retval0[256]) repro()
+define <64 x i32> @repro() {
+
+  ; CHECK: .param .align 128 .b8 retval0[256];
+  %1 = tail call <64 x i32> @test(i32 0)
+  ret <64 x i32> %1
+}
+
+; Function Attrs: nounwind
+declare <64 x i32> @test(i32)


