[llvm] [AMDGPU] Fix module split's assumption on kernels (PR #116280)
Siu Chi Chan via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 14 13:02:56 PST 2024
https://github.com/scchan created https://github.com/llvm/llvm-project/pull/116280
Module split assumes that a kernel function must have external linkage; however, that isn't always the case. For example, a static kernel function will have weak_odr linkage.
Change-Id: I1e5dee0de1fd866b365f4090a574e1b2961f8dca
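To illustrate the change in isolation, here is a minimal stand-alone sketch of the predicate. It does not use LLVM headers: `Linkage`, `FunctionInfo`, its fields, and `oldAssertHolds` are hypothetical stand-ins for the `Function` queries used in AMDGPUSplitModule.cpp, so this models the logic only and is not the actual implementation.

```cpp
// Stand-alone model; Linkage, FunctionInfo and its fields are hypothetical
// stand-ins for the llvm::Function queries used by the real code.
enum class Linkage { External, Internal, WeakODR };

struct FunctionInfo {
  bool IsEntryFunctionCC; // models AMDGPU::isEntryFunctionCC(F.getCallingConv())
  Linkage Link;           // models the function's linkage
  bool IsDefinitionExact; // models F.isDefinitionExact()
};

// Models the condition checked by the removed assert:
// an entry function (kernel) was required to have external linkage.
constexpr bool oldAssertHolds(const FunctionInfo &F) {
  return F.IsEntryFunctionCC ? F.Link == Linkage::External : true;
}

// Models the predicate after the patch: kernels are non-copyable
// regardless of their linkage.
constexpr bool isNonCopyable(const FunctionInfo &F) {
  return F.IsEntryFunctionCC || F.Link == Linkage::External ||
         !F.IsDefinitionExact;
}

// A weak_odr kernel violates the old assumption but is still non-copyable
// under the new predicate, whatever value the exactness field has.
static_assert(!oldAssertHolds(FunctionInfo{true, Linkage::WeakODR, true}),
              "old assert would fire for a weak_odr kernel");
static_assert(isNonCopyable(FunctionInfo{true, Linkage::WeakODR, true}),
              "kernels are never copyable after the patch");
```

In this model, a non-kernel function with internal linkage and an exact definition remains copyable, which matches the unchanged part of the predicate.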
From b70e6fc2eb988a4b0a56883f94cea45175bdb4b4 Mon Sep 17 00:00:00 2001
From: Siu Chi Chan <siuchi.chan at amd.com>
Date: Thu, 14 Nov 2024 01:01:44 +0000
Subject: [PATCH] [AMDGPU] Fix module split's assumption on kernels
Module split assumes that a kernel function must have an external
linkage; however, that isn't the case. For example, a static kernel
function will have a weak_odr linkage
Change-Id: I1e5dee0de1fd866b365f4090a574e1b2961f8dca
---
llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp | 11 +++++------
.../tools/llvm-split/AMDGPU/large-kernels-merging.ll | 6 +++---
2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 5d7aff1c5092cc..1942c704270c1c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -157,13 +157,12 @@ static auto formatRatioOf(CostType Num, CostType Dem) {
/// Non-copyable functions cannot be cloned into multiple partitions, and only
/// one copy of the function can be present across all partitions.
///
-/// External functions fall into this category. If we were to clone them, we
-/// would end up with multiple symbol definitions and a very unhappy linker.
+/// Kernel functions and external functions fall into this category. If we were
+/// to clone them, we would end up with multiple symbol definitions and a very
+/// unhappy linker.
static bool isNonCopyable(const Function &F) {
- assert(AMDGPU::isEntryFunctionCC(F.getCallingConv())
- ? F.hasExternalLinkage()
- : true && "Kernel w/o external linkage?");
- return F.hasExternalLinkage() || !F.isDefinitionExact();
+ return AMDGPU::isEntryFunctionCC(F.getCallingConv()) ||
+ F.hasExternalLinkage() || !F.isDefinitionExact();
}
/// If \p GV has local linkage, make it external + hidden.
diff --git a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
index 807fb2e5f33cea..e40e8b96cd8d4d 100644
--- a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
+++ b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
@@ -19,7 +19,7 @@
; CHECK0: declare
; CHECK1: define internal void @HelperC()
-; CHECK1: define amdgpu_kernel void @C
+; CHECK1: define weak_odr amdgpu_kernel void @C
; CHECK2: define internal void @large2()
; CHECK2: define internal void @large1()
@@ -30,7 +30,7 @@
; CHECK2: define amdgpu_kernel void @B
; NOLARGEKERNELS-CHECK0: define internal void @HelperC()
-; NOLARGEKERNELS-CHECK0: define amdgpu_kernel void @C
+; NOLARGEKERNELS-CHECK0: define weak_odr amdgpu_kernel void @C
; NOLARGEKERNELS-CHECK1: define internal void @large2()
; NOLARGEKERNELS-CHECK1: define internal void @large1()
@@ -88,7 +88,7 @@ define internal void @HelperC() {
ret void
}
-define amdgpu_kernel void @C() {
+define weak_odr amdgpu_kernel void @C() {
call void @HelperC()
ret void
}