[llvm] 4a1236e - [AMDGPU] Add an option to disable manual ctor / dtor lowering

Tue May 23 07:03:20 PDT 2023

Author: Joseph Huber
Date: 2023-05-23T09:03:10-05:00
New Revision: 4a1236e0f66341c1fcf2cd8a3e9a1b6faaf1a4d5

URL: https://github.com/llvm/llvm-project/commit/4a1236e0f66341c1fcf2cd8a3e9a1b6faaf1a4d5
DIFF: https://github.com/llvm/llvm-project/commit/4a1236e0f66341c1fcf2cd8a3e9a1b6faaf1a4d5.diff

LOG: [AMDGPU] Add an option to disable manual ctor / dtor lowering

Currently AMDGPU offers extra ctor / dtor lowering by emitting a kernel
that can be called. It's possible to handle ctors and dtors using the
standard method as shown in D149340's commit message. In which case we
on't need these extra kernels as they won't be called. This patch simply
adds a way to conditionally turn off this handling if we do not want to
get extra kernels in the output.

Unrelated, but we could convert this handling to an ODR function that simply
calls the code in D149340 constructed via LLVM-IR. That would handle priority
correctly and would then be correct if not run in LTO mode.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D150565

Added: 
    

Modified: 
    llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
    llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 9da225366745..5df136c8ca8c 100644

--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -189,6 +189,11 @@ OptExecMaskPreRA("amdgpu-opt-exec-mask-pre-ra", cl::Hidden,
             cl::desc("Run pre-RA exec mask optimizations"),
             cl::init(true));
 
+static cl::opt<bool>
+    LowerCtorDtor("amdgpu-lower-global-ctor-dtor",
+                  cl::desc("Lower GPU ctor / dtors to globals on the device."),
+                  cl::init(true), cl::Hidden);
+
 // Option to disable vectorizer for tests.
 static cl::opt<bool> EnableLoadStoreVectorizer(
   "amdgpu-load-store-vectorizer",
@@ -968,7 +973,8 @@ void AMDGPUPassConfig::addIRPasses() {
   disablePass(&PatchableFunctionID);
 
   addPass(createAMDGPUPrintfRuntimeBinding());
-  addPass(createAMDGPUCtorDtorLoweringLegacyPass());
+  if (LowerCtorDtor)
+    addPass(createAMDGPUCtorDtorLoweringLegacyPass());
 
   // A call to propagate attributes pass in the backend in case opt was not run.
   addPass(createAMDGPUPropagateAttributesEarlyPass(&TM));

diff  --git a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
index d090d200953c..a8eec5cbf196 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
@@ -5,6 +5,7 @@
 ; RUN: opt -S -mtriple=amdgcn-- -passes=amdgpu-lower-ctor-dtor,amdgpu-lower-ctor-dtor < %s | FileCheck %s
 ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -s - 2>&1 | FileCheck %s -check-prefix=VISIBILITY
 ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -S - 2>&1 | FileCheck %s -check-prefix=SECTION
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -amdgpu-lower-global-ctor-dtor=0 -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -s - 2>&1 | FileCheck %s -check-prefix=DISABLED
 
 @llvm.global_ctors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @foo, ptr null }]
 @llvm.global_dtors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @bar, ptr null }]
@@ -49,6 +50,10 @@
 ; VISIBILITY: OBJECT   WEAK DEFAULT {{.*}} amdgcn.device.fini.kd
 ; SECTION: .init_array.1     INIT_ARRAY      {{.*}} {{.*}} 000008 00  WA  0   0  8
 ; SECTION: .fini_array.1     FINI_ARRAY      {{.*}} {{.*}} 000008 00  WA  0   0  8
+; DISABLED-NOT: FUNC   GLOBAL PROTECTED {{.*}} amdgcn.device.init
+; DISABLED-NOT: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.init.kd
+; DISABLED-NOT: FUNC   GLOBAL PROTECTED {{.*}} amdgcn.device.fini
+; DISABLED-NOT: OBJECT   GLOBAL DEFAULT {{.*}} amdgcn.device.fini.kd
 
 define internal void @foo() {
   ret void