[llvm] [AMDGPU] Fix module split's assumption on kernels (PR #116280)

Siu Chi Chan via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 14 13:02:56 PST 2024


https://github.com/scchan created https://github.com/llvm/llvm-project/pull/116280

Module split assumes that a kernel function must have external linkage; however, that isn't always the case. For example, a static kernel function will have weak_odr linkage.

Change-Id: I1e5dee0de1fd866b365f4090a574e1b2961f8dca
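
To make the effect of the change concrete, here is a minimal, self-contained sketch of the predicate logic. It is not LLVM code: `FnModel` and its fields are hypothetical stand-ins for the three properties of `llvm::Function` that the patch consults, and `oldAssumptionHolds` is an illustrative name for the condition that the removed assert checked.

```cpp
// Hypothetical model of the properties consulted by isNonCopyable().
// These names are illustrative only and are not part of the LLVM API.
struct FnModel {
  bool IsEntryCC;       // models AMDGPU::isEntryFunctionCC(F.getCallingConv())
  bool ExternalLinkage; // models F.hasExternalLinkage()
  bool DefinitionExact; // models F.isDefinitionExact()
};

// Condition checked by the removed assert: a kernel must have external linkage.
bool oldAssumptionHolds(const FnModel &F) {
  return !F.IsEntryCC || F.ExternalLinkage;
}

// Return expression before the patch: kernels are not considered on their own.
bool isNonCopyableBefore(const FnModel &F) {
  return F.ExternalLinkage || !F.DefinitionExact;
}

// Return expression after the patch: any kernel is non-copyable, whatever its
// linkage.
bool isNonCopyableAfter(const FnModel &F) {
  return F.IsEntryCC || F.ExternalLinkage || !F.DefinitionExact;
}
```

In this model, a kernel without external linkage violates the old assumption, which is what made the assert fire. After the change, the same kernel is classified as non-copyable purely because of its calling convention, without relying on the other two properties.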

From b70e6fc2eb988a4b0a56883f94cea45175bdb4b4 Mon Sep 17 00:00:00 2001
From: Siu Chi Chan <siuchi.chan at amd.com>
Date: Thu, 14 Nov 2024 01:01:44 +0000
Subject: [PATCH] [AMDGPU] Fix module split's assumption on kernels

Module split assumes that a kernel function must have external
linkage; however, that isn't always the case. For example, a static
kernel function will have weak_odr linkage.

Change-Id: I1e5dee0de1fd866b365f4090a574e1b2961f8dca
---
 llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp          | 11 +++++------
 .../tools/llvm-split/AMDGPU/large-kernels-merging.ll  |  6 +++---
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
index 5d7aff1c5092cc..1942c704270c1c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
@@ -157,13 +157,12 @@ static auto formatRatioOf(CostType Num, CostType Dem) {
 /// Non-copyable functions cannot be cloned into multiple partitions, and only
 /// one copy of the function can be present across all partitions.
 ///
-/// External functions fall into this category. If we were to clone them, we
-/// would end up with multiple symbol definitions and a very unhappy linker.
+/// Kernel functions and external functions fall into this category. If we were 
+/// to clone them, we would end up with multiple symbol definitions and a very 
+/// unhappy linker.
 static bool isNonCopyable(const Function &F) {
-  assert(AMDGPU::isEntryFunctionCC(F.getCallingConv())
-             ? F.hasExternalLinkage()
-             : true && "Kernel w/o external linkage?");
-  return F.hasExternalLinkage() || !F.isDefinitionExact();
+  return AMDGPU::isEntryFunctionCC(F.getCallingConv()) ||
+         F.hasExternalLinkage() || !F.isDefinitionExact();
 }
 
 /// If \p GV has local linkage, make it external + hidden.
diff --git a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
index 807fb2e5f33cea..e40e8b96cd8d4d 100644
--- a/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
+++ b/llvm/test/tools/llvm-split/AMDGPU/large-kernels-merging.ll
@@ -19,7 +19,7 @@
 ; CHECK0: declare
 
 ; CHECK1: define internal void @HelperC()
-; CHECK1: define amdgpu_kernel void @C
+; CHECK1: define weak_odr amdgpu_kernel void @C
 
 ; CHECK2: define internal void @large2()
 ; CHECK2: define internal void @large1()
@@ -30,7 +30,7 @@
 ; CHECK2: define amdgpu_kernel void @B
 
 ; NOLARGEKERNELS-CHECK0: define internal void @HelperC()
-; NOLARGEKERNELS-CHECK0: define amdgpu_kernel void @C
+; NOLARGEKERNELS-CHECK0: define weak_odr amdgpu_kernel void @C
 
 ; NOLARGEKERNELS-CHECK1: define internal void @large2()
 ; NOLARGEKERNELS-CHECK1: define internal void @large1()
@@ -88,7 +88,7 @@ define internal void @HelperC() {
   ret void
 }
 
-define amdgpu_kernel void @C() {
+define weak_odr amdgpu_kernel void @C() {
   call void @HelperC()
   ret void
 }


